Free as in Beer

Slides are available at:

https://slides.frowning.wtf/static-analysis

Watch the recording on YouTube!

$whoami

Erin Browning

Senior Security Engineer

@Slack

You can contact me at:

erin@frowning.wtf

@efrowning

Tim Faraci

Staff Security Engineer @Slack

faracitim@gmail.com

LinkedIn - timfaraci

Slack is hiring!

Slack is used by millions of people every day – we need engineers who want to make that experience as secure and enjoyable as possible.

slack.com/jobs

Both of us have lots of experience implementing static analysis programs.

Erin has previously implemented a static analysis program for 1200 devs across multiple languages for lots of different compliance standards.

Tim Faraci has implemented:

Multiple SAST Programs

Medium and Large Companies

PCI, HIPAA, FedRAMP

Experience in Commercial & Open Source

Current Commercial Problem - SAST

Hours vs Minutes

200k+ Implementation

Yearly Sales Negotiations

Mystery Secret Scanning Sauce

There is a Better Way!

Open Source - Easier Implementation

Get Scan Results Fast! Keep Devs Happy!

Lots of Language Support

Leverage the Power of Open Source Community

Find Vulnerabilities!

Get Compliance Checkbox!

Check Out Semgrep!

We need:

1. A generic engine

2. Ability to tune and define the ruleset

3. Ability to build and control our own infra

It's a language-agnostic static analysis engine.
It can ingest a language's abstract syntax tree and a ruleset to analyse codebases.

How?

It uses parsers to create strongly typed, representative syntax trees
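To show why a syntax tree matters, here is a minimal sketch that uses Python's standard `ast` module. It is not Semgrep's engine, only an illustration of the idea: matching on tree nodes finds real calls and ignores look-alike text in comments and strings. The `find_calls` helper and the sample source are made up for this example.

```python
import ast

# Sample code to scan. "eval" appears three times as text, but only once as a call.
SOURCE = """
def run(cmd):
    # eval in a comment is not a call
    note = "eval(cmd) in a string is not a call either"
    return eval(cmd)
"""


def find_calls(source: str, func_name: str) -> list[int]:
    """Return the line numbers of every direct call to `func_name`."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        # A call node whose callee is a bare name, e.g. eval(...)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
        ):
            lines.append(node.lineno)
    return sorted(lines)


text_matches = SOURCE.count("eval")
call_matches = find_calls(SOURCE, "eval")
print(f"text matches: {text_matches}, real calls: {len(call_matches)}")
```

A plain text search reports every occurrence of the word; the tree walk reports only the actual call.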

Download open source rules or write your own
The rules are YAML files; they're easy to write or modify
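As a rough idea of what a rule looks like, the sketch below writes a small Semgrep-style rule to disk from Python. The rule id, message, and file name are hypothetical and chosen only for this example; the field names follow Semgrep's published rule format as we understand it, so check the current docs before relying on them.

```python
from pathlib import Path

# Hypothetical rule: flags any call to eval() in Python code.
RULE_YAML = """\
rules:
  - id: python-eval-call
    languages: [python]
    severity: ERROR
    message: Avoid eval() on data that may be attacker-controlled.
    pattern: eval(...)
"""


def write_rule(directory: Path) -> Path:
    """Write the example rule into `directory` and return the file path."""
    rule_path = directory / "python-eval-call.yaml"
    rule_path.write_text(RULE_YAML, encoding="utf-8")
    return rule_path
```

A rule file like this can then be passed to the CLI, for example `semgrep --config python-eval-call.yaml path/to/code`.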

Shout out to r2c!

We have a generic engine. How can we use it to easily scale up our static analysis program?

We're moving the work of a static analysis program to the left.

First of all, we need easy-to-maintain architecture

Enabled file

Handling false positives
We chose to use a JSON file for super easy maintenance

We remove false positives via an identifier

Hash IDs are created with this data:

file path
three lines of code before, one line after
the triggered rule name

We don't use the line number--it's too fragile!
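The identifier described above can be sketched roughly as follows. This is not our production code: the function name, the choice of SHA-256, and the decision to include the flagged line itself alongside the three lines before and one line after are all assumptions made for the example.

```python
import hashlib


def make_hash_id(file_path, rule_name, source_lines, line_number, before=3, after=1):
    """Build an identifier from the file path, the rule name, and nearby code.

    `line_number` is 1-based and is used only to locate the context window;
    it is never part of the hashed data.
    """
    index = line_number - 1
    start = max(0, index - before)
    end = min(len(source_lines), index + after + 1)
    context = source_lines[start:end]
    payload = "\n".join([file_path, rule_name, *context])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


code = [
    "import os",
    "",
    "def run(cmd):",
    "    x = 1",
    "    return eval(cmd)",
    "    # done",
]
original = make_hash_id("app/run.py", "python-eval-call", code, line_number=5)

# Two unrelated lines added at the top move the finding from line 5 to line 7,
# but the surrounding code is unchanged.
shifted_code = ["# header", ""] + code
shifted = make_hash_id("app/run.py", "python-eval-call", shifted_code, line_number=7)

print(original == shifted)
```

Because the line number only selects the window and is not hashed, edits elsewhere in the file leave the identifier unchanged.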

Developers can create PRs to add false positives in new scans
Security can review their request
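A minimal sketch of how a reviewed false-positive file could be applied to a scan. The JSON layout (a `false_positives` list) and the `hash_id` key on each finding are hypothetical; only the top-level `results` list mirrors the scan output used elsewhere in this deck.

```python
import json


def load_false_positive_ids(path):
    """Read the reviewed false-positive hash IDs from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return set(json.load(handle)["false_positives"])


def remove_false_positives(scan, false_positive_ids):
    """Return a copy of the scan without findings marked as false positives."""
    kept = [
        finding
        for finding in scan["results"]
        if finding["hash_id"] not in false_positive_ids
    ]
    return {**scan, "results": kept}


# In-memory example: one finding was reviewed and accepted as a false positive.
scan = {
    "results": [
        {"hash_id": "aaa111", "rule": "python-eval-call"},
        {"hash_id": "bbb222", "rule": "python-eval-call"},
    ]
}
filtered = remove_false_positives(scan, {"aaa111"})
print([finding["hash_id"] for finding in filtered["results"]])
```

Keeping the list in a plain JSON file means a developer's PR is a one-line diff that security can review.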

We can also do comparisons over time or by branch based on the identifiers

						
def compare_to_last_run(old_output, new_output, output_filename):
    """
    Compare two scan runs.
    Keep only the findings that appear exclusively in the new run.
    """
    # open_json, write_json, and get_hash_ids are helpers defined elsewhere.
    # get_hash_ids is expected to map each hash ID to its finding.
    old = open_json(old_output)
    new = open_json(new_output)
    old_hashes = get_hash_ids(old)
    new_hashes = get_hash_ids(new)

    for new_issue_hash in new_hashes:
        # Drop findings that were already present in the previous run.
        if new_issue_hash in old_hashes:
            new["results"].remove(new_hashes[new_issue_hash])

    write_json(output_filename, new)
    return new

We currently perform: