A SAST Primer

Publish date: Jun 5, 2021
Last updated: Jun 5, 2021

Tags:

AppSec
SAST

Hi friends!

I’m Cole and I write regularly about security and development. If you have any feedback, feel free to message me on LinkedIn! Anyway, onto the article!

Introduction

I began my security career as a graduate at an Australian government agency. Day 1 of my security rotation, a few things happened.

My manager took a 3 month overseas trip.
I didn’t really have direction, so I picked up this book and started reading it.
After a month, I was in a tiny room with no air-conditioning and beefy servers scanning big web applications.

It might not sound ideal, but the heat was a welcome reprieve from the Canberra weather. As a novocastrian, shorts, a white tee, and some thongs taught me a good lesson.

JAN	FEB	MAR	APR	MAY	JUN	JUL	AUG	SEP	OCT	NOV	DEC
13.7	13.6	10.9	7.1	3.7	1.3	0.2	1.3	3.8	6.4	9.5	11.9

Tech has changed since then. Copying code onto USB drives and doing manual code reviews on airgapped machines feels wrong. Banks and governments are adopting Git-based SCM’s, CI/CD, and actually use linux now! Crazy..

Anyway, I got my stripes from years of operating SAST products in various enterprise contexts.

Right now, I believe that commercial SAST is not a good control to invest in. This article covers why I believe that and what needs to be addressed.

SAST Basics

The OWASP Foundation defines SAST as:

Static Application Security Testing (SAST) Tools are designed to analyze source code or compiled versions of code to help find security flaws.

Some SAST tools are lightweight and just look for known bad sequences. These are things like string concatenation, eval, or keywords like password or MD5. Other tools work on compiled codebases, like FindBugs (for Java ByteCode). There are also big enterprise products that do crazy taint analysis and thorough review.

SAST tools just find bugs in code.

SAST Walkthrough

Let’s take the SAST product, Fortify. The Fortify SourceAnalyzer will build a model of the codebase you specify.

The following code for example, is converted into the IL code below. Sourceanalyzer -build isn’t too different in concept.

using System;
public class C {
    public void M() {
        string TrafficLight = "Red";
        switch (TrafficLight)
        {
        case "Red":
            Console.WriteLine("Stop");
            break;
        case "Yellow":
            Console.WriteLine("Try to Stop");
            Console.WriteLine("Please");
            break;
        default:
            Console.WriteLine("Gogogogo");
            break;
        }
    }
}

.class private auto ansi '<Module>'
{
} // end of class <Module>

.class public auto ansi beforefieldinit C
    extends [System.Private.CoreLib]System.Object
{
    // Methods
    .method public hidebysig 
        instance void M () cil managed 
    {
        // Method begins at RVA 0x2050
        // Code size 90 (0x5a)
        .maxstack 2
        .locals init (
            [0] string TrafficLight,
            [1] string,
            [2] string
        )

        IL_0000: nop
        IL_0001: ldstr "Red"
        IL_0006: stloc.0
        IL_0007: ldloc.0
        IL_0008: stloc.2
        // sequence point: hidden
        IL_0009: ldloc.2
        IL_000a: stloc.1
        // sequence point: hidden
        IL_000b: ldloc.1
        IL_000c: ldstr "Red"
        IL_0011: call bool [System.Private.CoreLib]System.String::op_Equality(string, string)
        IL_0016: brtrue.s IL_0027

        IL_0018: ldloc.1
        IL_0019: ldstr "Yellow"
        IL_001e: call bool [System.Private.CoreLib]System.String::op_Equality(string, string)
        IL_0023: brtrue.s IL_0034

        IL_0025: br.s IL_004c

        IL_0027: ldstr "Stop"
        IL_002c: call void [System.Console]System.Console::WriteLine(string)
        IL_0031: nop
        IL_0032: br.s IL_0059

        IL_0034: ldstr "Try to Stop"
        IL_0039: call void [System.Console]System.Console::WriteLine(string)
        IL_003e: nop
        IL_003f: ldstr "Please"
        IL_0044: call void [System.Console]System.Console::WriteLine(string)
        IL_0049: nop
        IL_004a: br.s IL_0059

        IL_004c: ldstr "Gogogogo"
        IL_0051: call void [System.Console]System.Console::WriteLine(string)
        IL_0056: nop
        IL_0057: br.s IL_0059

        IL_0059: ret
    } // end of method C::M

    .method public hidebysig specialname rtspecialname 
        instance void .ctor () cil managed 
    {
        // Method begins at RVA 0x20b6
        // Code size 8 (0x8)
        .maxstack 8

        IL_0000: ldarg.0
        IL_0001: call instance void [System.Private.CoreLib]System.Object::.ctor()
        IL_0006: nop
        IL_0007: ret
    } // end of method C::.ctor

} // end of class C

Fortify then uses analyzers on that model to identify findings. Here are the Fortify SCA analyzers.

Some analyzers function the same as grep, some look for memory manipulation. The data flow analyzer is the most important one though. The data flow analyzer identifies sources and sinks, and tracks data between the two. A source is a point where malicious data can enter the application. A sink is where the malicious data can execute. This is also known as Taint Analysis. Once SourceAnalyzer is finished, it will output the results into a proprietary format and potentially send them to a centralised server. Now that you understand how a typical SAST tool operates, time to talk about my concerns.

SAST Weaknesses

The main weaknesses of SAST tools are that they are:

Not built for developers
Not built for a DevOps world
Lacking coverage of modern languages and architectures
In need of significant tuning to generate value

Built for Security, not Developers.

This will scan a java project.

sourceanalyzer -b Cole -cp "lib/*.jar" "src/**/*.java" 
sourceanalyzer -b Cole -scan -f thousandVulns.fpr

The results are presented in this format.

Audit Workbench

This was designed originally for security analysts to audit code, and it is good for that. However, this product is not suitable for developers.

As an AppSec professional, you want development teams to proactively fix security findings. Developers have context and know their codebase better than you do. But developer time is just as valuable as your own. forcing developers to context-switch between their IDE and the audit product. They might not have worked on this codebase in months. They might have inherited the code even!

Keeping findings in the security system or requiring the use of a separate application will create a barrier. That’s all it takes for developers to not use the tooling or even review the results.

These tools are great if you’re looking for an assurance function. If you don’t trust your developers to write secure code, use SAST for assurance. The problem is that competent developers will become increasingly frustrated with the process, the security stage-gate, being unable to run operate the tools or having no time to fix the issues. Security tooling, in the product security space, should be Developer First.

Tuning, False Positives, and DevOps Integrations

When using a SAST tool, you’ll need to populate the CLI arguments before starting a scan. In the world these tools were built, this was fine. In a DevOps world it becomes challenging.

You can build scripts to infer your arguments? Use a * and hope? Do you trust your developers to include a .fortifyscan with correct build and tuning parameters? Maybe they should submit those parameters to the CI job via API? Getting the tool to operate effectively in an automated fashion requires testing, tuning, and patience.

cole@galahcyber: ~/Blog $ find . -type f -printf '%f\n' | sed -r -n 's/.+(\..*)$/\1/p' | sort | uniq -c | sort -bn
      4 .json
      4 .yaml
      5 .css
      8 .js
     35 .md
     37 .html

There are other aspects to consider even if you get the scan running. A scan needs to be timely and accurate for developers to see value. If your scan runs asynchronously, it won’t slow a DevOps pipeline. But nobody wants to get results on stale code. You can work on optimising the scan engines, changing RAM allocation to the JVM, switching and configuring analyzers, including or excluding references, or tuning random parameters like below. SAST still won’t be fast enough for the feedback loop that developers want.

com.fortify.sca.exclude - Reduces code coverage..
com.fortify.sca.DisableAnalyzers - Removes bugclasses..
com.fortify.sca.analyzer.controlflow.EnableTimeOut - Coverage loss..
com.fortify.sca.skip.libraries.javascript - Ignore libraries..
com.fortify.sca.limiters.MaxChainDepth - Lose depth of scanning...

Accuracy is important for SAST tools as well. SAST vendors tend to focus on providing as many findings as possible. This is overwhelming, how do you prioritise fixes? How many are genuine problems? There are criticalities, but even then a lot of the issues don’t have context. MD5 is fine for file-integrity, but not passwords. Pseudorandomness is fine for avatar colours, but not encryption. SAST tools lack the context to determine which is appropriate and flag these as High findings. No wonder developers get annoyed!

If my personal experience isn’t convincing enough, then how about Big Tech? Facebook and Google have heavily invested in SAST and funded academic research into the topic as well.

Both Facebook and Google identified asynchronous scanning, scalability issues, developer satisfaction, and finding accuracy to be major problems.

Item	Facebook	Google
Code	100 Million Lines of Hack (PHP) and 10-30 Million of C++ and Objective C	2 Billion LoC across Java, JS, Python, and Golang
Activity	Thousands of code updates per day	Similar to Facebook
Tooling	Two customer built analyzers, Zoncolan + Infer	Almost every OSS one available, but initially FindBugs
Type	Async hourly scans + PR differential scanning	SAST at Code Review

Facebooks major findings were:

The response was stunning: we were greeted by near silence [to Batch Scans]. We assigned 20–30 issues to developers, and almost none of them were acted on. We had worked hard to get the false positive rate down to what we thought was less than 20%, and yet the fix rate was near zero.

If a developer is working on one problem, and they are confronted with a report on a separate problem, then they must swap out the mental context of the first problem and swap in the second, and this can be time consuming and disruptive.

Google has problems too:

Google does not have infrastructure support to run interprocedural or whole-program analysis at Google scale, nor does it use advanced static analysis techniques (such as separation logic) at scale. Even simple checks have required analysis infrastructure supporting workflow integration to make them successful.

When an issue is discovered in the codebase, it can be nontrivial to assign it to the right person. In the extreme, somebody who has left the company might have caused the issue. Furthermore, even if you think you have found someone familiar with the codebase, the issue might not be relevant to any of their past or current work.

Other Challenges

SAST tools were written when Java and MVC were best practice. Tech is different now. We use niche programming languages, or even DSL’s. Distributed services are envisioned from the beginning. Cloud services are commonly used over on-premise infrastructure. Microservices and containerisation are increasingly being used. Don’t get me started about ‘AI’..

Keeping up with Codeashians is not cost-effective for SAST vendors. Languages are either fads or the new baseline. Investing in support for Golang might end up a terrible value proposition for the business.

Thankfully in Australia, we’re pretty comfortable living in the past… except that our greybeards are retiring and businesses are increasingly migrating away from this legacy. If you’re working for a greenfields tech company, chances are that SAST won’t support your NodeJS EDA.

Even the biggest value proposition of SAST, Taint Analysis, isn’t easily able to find paths when event publishing and listening are becoming the goto. Our services have become modernised, but the tooling not so much.

Addressing SAST Weaknesses

SAST is not a good return on investment right now. Even big tech aren’t a fan after a decade of well-funded research and SAST resourcing. Right now in Australia, we are using SAST as a stage-gate. An assurance function at the end of the SDLC. Expensive consultants are employed to review reports for false positives and throwing the results over the fence. It’s unscalable, expensive, the complete opposite of a good Application Security program.

This needs to change.

Improving Developer Confidence in Security Tooling

Google was able to improve developer confidence with their tooling.

The Tricorder team keeps careful accounting of issues fixed, performs surveys to understand developer sentiment, makes it easy to file bugs against the analysis tools, and uses all this data to justify continued investment. Developers need to build trust in analysis tools. If a tool wastes developer time with false positives and low-priority issues, developers will lose faith and ignore results.

Surveying developers about their experience, and actively changing your tuning and tooling helps. SAST tools are customisable. Turning off rules like “Insecure Randomness” or “Code Quality” will reduce noise. But you must work with developers. If you’re not communicating about improving the SAST process, they won’t feel that their feedback valuable.

Take time to speak with your engineering teams about their gripes with your tooling. Fix those gripes. Communicate the change broadly and thank the team. This builds trust and confidence.

Facebook takes further steps. They have customised the scan thoroughness (and thus FP ratio) based on the app context. Reliability issues are given heavier weights for mobile applications for example. Facebook also provides an on-call function for all developers to speak with product security engineers about SAST findings immediately on discovery. Having dedicated engineers available to talk about findings instead of a tool output and a link to OWASP helps engineers feel valued and able to operate without context switching.

Improving the Developer Experience

Facebook and Google performed batched scans initially. They had zero findings remediated from those. In my personal experience, batch scan results were largely ignored by development teams unless a Security Engineer was prodding them. A better approach is to introduce the tooling into the standard parts of a developer workflow. Github Dependabot and Snyk have had good success in the Software Composition Analysis space by creating Pull-Requests with dependency updates. Facebook scans the changes being introduced with a pull-request, Google created compile-time checks for findings in addition to PR SAST for code reviews. Having the results appear as part of the review process means developers are focussed on improving the quality of their code, rather than the implementation. Entirely different contexts, where one is receptive to feedback and the other just wants a solution.

None of these solutions involve sending developers to a web portal to download a results file and cross-referencing that to JIRA and their codebase. Also, don’t just raise JIRA tickets. It looks great for project managers, but how do you prioritise 100 or more SAST findings in your backlog? Don’t do this please!

AnakinPadmeBacklog

Improving the Efficiency of SAST

Most companies don’t have the time, skills, or money to design their own programming languages, build static analysis tools, and manufacture hardware. While Facebook can aim for 15 minutes for a differential scan, this is unachievable for most organisations in Australia. Even Google stated that it’s too hard to build an efficient taint analysis scanner for their 2 billion lines of code. So what do they use?

Grep

More specifically, FindBugs, CLang, Linters, etc. They use tooling that is not that different from grep. Line-by-line scans are efficient. When you consider that Google updates their codebase thousands of times a day, you can see why this approach is appealing. Google is presenting research findings next month about implementing compiler checks for DOM sinks. How? I’ll hazard a guess that grep is involved. The main take-away? Instant feedback is king.

Improving the Coverage of SAST

Coverage is the final frontier, and unfortunately where the SAST train must stop. Having a single tool to cover everything will create gaps. So taking Gitlabs approach of throwing every tool at the problem may help with that. But for coverage at least, it’s worth looking at adding other controls for defense in depth. Using a dynamic scanner, basing the architecture on secure patterns, or just throwing penetration testers at it can all help throughout the SDLC.

Conclusion

SAST tooling in Australia has a long way to go, I hope that I’ve covered some of the ways that your organisation can help deal with the traditional stressors in the space. If you’ve got any feedback about the article, or just wanna chat hit me up on linkedIn!

Thanks!

Cole Cornford