Slides are available at:

https://slides.frowning.wtf/static-analysis

Watch the recording on YouTube!

$whoami

Erin Browning

Senior Security Engineer

@Slack

You can contact me at:

erin@frowning.wtf

@efrowning




Tim Faraci

Staff Security Engineer @Slack

faracitim@gmail.com

LinkedIn - timfaraci


Slack is hiring!

Slack is used by millions of people every day – we need engineers who want to make that experience as secure and enjoyable as possible.

slack.com/jobs

Both of us have lots of experience implementing static analysis programs.
Erin has previously implemented a static analysis program for 1,200 devs across multiple languages for lots of different compliance standards.



Tim Faraci has implemented:
Multiple SAST Programs
Medium and Large Companies
PCI, HIPAA, FedRAMP
Experience in Commercial & Open Source
Current Commercial Problem - SAST
Hours vs Minutes
200k+ Implementation
Yearly Sales Negotiations
Mystery Secret Scanning Sauce
There is a Better Way!
Open Source - Easier Implementation
Get Scan Results Fast! Keep Devs Happy!
Lots of Language Support
Leverage the Power of Open Source Community
Find Vulnerabilities!
Get Compliance Checkbox!
Check Out Semgrep!
We need:
1. A generic engine
2. Ability to tune and define the ruleset
3. Ability to build and control our own infra
It's a language agnostic static analysis engine.
It can ingest a language's abstract syntax tree and a ruleset to analyse codebases.
How?
It uses parsers to create strongly typed, representative syntax trees
Download open source rules or
write your own
The rules are yaml files; they're easy to write or modify
Shout out to r2c!
We have a generic engine. How can we use it to easily scale up our static analysis program?
We're moving the work of a static analysis program to the left.
First of all, we need easy-to-maintain architecture
Enabled file
Handling false positives
We chose to use a JSON file for super easy maintenance
We remove false positives via an identifier
Hash IDs are created with this data:
  • file path
  • three lines of code before, one line after
  • the triggered rule name

We don't use the line number--it's too fragile!
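The hash ID recipe above can be sketched roughly as follows. This is a minimal illustration, not the code from the talk: the function name `hash_id`, the choice of SHA-256, the whitespace stripping, and the decision to include the matched line itself in the context window are all assumptions.

```python
import hashlib


def hash_id(file_path, source_lines, match_line, rule_name):
    """Build a stable identifier for one finding (hypothetical helper).

    match_line is the 1-indexed line the rule fired on. It only selects
    the context window; the number itself is never hashed.
    """
    index = match_line - 1
    start = max(index - 3, 0)                 # three lines before
    end = min(index + 2, len(source_lines))   # matched line plus one after
    # Assumption: strip whitespace so re-indentation does not change the ID.
    context = [line.strip() for line in source_lines[start:end]]
    payload = "\n".join([file_path, rule_name, *context])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because only the surrounding code is hashed, a finding keeps the same ID when unrelated lines are added or removed higher up in the file, which is what makes the identifier less fragile than a line number.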
Developers can create PRs to add false positives in new scans
Security can review their request
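A sketch of how a false-positive file might be applied to scan results. The layout of the JSON file (a flat list of hash IDs) and the name `remove_false_positives` are assumptions for illustration; the talk only says that the file is JSON and that findings are removed by identifier.

```python
import json


def remove_false_positives(findings_by_hash, false_positives_path):
    """Return only the findings whose hash ID is not listed in the file.

    findings_by_hash maps a hash ID to its finding. The JSON file is
    assumed to hold a plain list of hash IDs approved as false positives.
    """
    with open(false_positives_path, encoding="utf-8") as handle:
        false_positives = set(json.load(handle))
    return {
        finding_hash: finding
        for finding_hash, finding in findings_by_hash.items()
        if finding_hash not in false_positives
    }
```

With a layout like this, a developer's pull request is a one-line addition to the file, which keeps the security review small.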
We can also do comparisons over time or by branch based on the identifiers
						
def compare_to_last_run(old_output, new_output, output_filename):
    """
    Compare two scan runs and keep only the findings that are new.

    open_json, write_json, and get_hash_ids are helpers not shown on this
    slide; get_hash_ids maps each finding's hash ID to the finding itself.
    """
    old = open_json(old_output)
    new = open_json(new_output)
    old_hashes = get_hash_ids(old)
    new_hashes = get_hash_ids(new)

    # Drop every finding whose hash ID already appeared in the old run.
    for new_issue_hash in new_hashes:
        if new_issue_hash in old_hashes:
            new["results"].remove(new_hashes[new_issue_hash])

    write_json(output_filename, new)
    return new
						
						

We currently perform:
  • A daily comparison
  • A branch comparison
The daily scan compares today to yesterday for all enabled repos. It runs in Jenkins.
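The daily job could look something like the sketch below. The mapping of repo names to checkout directories, the output file naming, and both function names are hypothetical; the Semgrep flags used (`--config`, `--json`, `--output`) are standard CLI options.

```python
import subprocess


def build_scan_command(rules_path, target_dir, output_file):
    """Build the Semgrep command line for scanning one checkout."""
    return [
        "semgrep",
        "--config", rules_path,
        "--json",
        "--output", output_file,
        target_dir,
    ]


def scan_enabled_repos(enabled_repos, rules_path, run=subprocess.run):
    """Scan every enabled repo and return {repo name: output file}.

    enabled_repos maps a repo name to its local checkout directory
    (an assumed shape for the "enabled file"). run is injectable so the
    loop can be exercised without Semgrep installed.
    """
    outputs = {}
    for name, checkout_dir in enabled_repos.items():
        output_file = f"{name}-today.json"
        run(build_scan_command(rules_path, checkout_dir, output_file), check=True)
        outputs[name] = output_file
    return outputs
```

Each output file can then be compared against the previous day's file so that only new findings are reported.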
Pull Request Scan
Outputting Results
Alerts
Metrics
Semgrep Rules
Dev Requesting False Positive Via GitHub
Development Education
Because we own the infrastructure, we can make it faster!
How easy is it to add a language to semgrep?
An intern could do it!

Or two interns

Who are almost done with computer science degrees


David Frankel

Nicholas Lin
We've been working on:
Project SUSHI
Static analysis Using Semgrep (with) Hack Integration

GitHub's tree-sitter
Generic AST parser conversion
Finally, rule creation
Putting it all together

1. Enabled file

2. Empty JSON for false positives

3. Review the results

4. Bam! You're scanning that codebase

5. You are now in...

Thanks to:

The SNOW team

r2c

Our wonderful summer interns, Nicholas and David

Slack

The AppSec Village

Thanks to:

Antonio de Jesus Ochoa Solano

Ryan Slama

Further reading: