> Govulncheck analyzes your codebase and only surfaces vulnerabilities that actually affect you, based on which functions in your code are transitively calling vulnerable functions.
This is huge! Every existing vulnerability scanner that I've worked with has just looked at go.sum, which ends up reporting a lot of vulnerabilities that you don't actually care about because your code isn't actually linking to the optional transitive dependency that has the vulnerability in it.
This is really cool to see because this is the #1 problem with current tools (as you said). I call it "alert fatigue" in my head because it's meaningless when you have 100+ vulns to fix but they're 99% unexploitable.
I have a bit of a bone to pick with this space: I've been working on this problem for a few months now (link to repo[0] and blog[1]).
My background is Application Security and, as is often the case with devs, rage fuels me in my desire to fix this space. Well, Log4Shell helped too.
As another comment said, doing this in a language-agnostic way is a big PITA and we haven't fully built it yet. We are using Semgrep to do very basic static analysis (see if the vulnerable function is ever imported + called). But we're not doing fancy inter-procedural taint analysis like CodeQL can.
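To make "imported + called" concrete, here is a rough sketch of that kind of purely syntactic check, written in Go with the standard go/ast package rather than Semgrep. The function name `CallsFunc` and the import path `example.com/vuln/yaml` are made up for illustration, and the default package name is guessed from the last path element, which is an assumption that does not always hold.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"path"
	"strconv"
)

// CallsFunc reports whether src imports importPath and calls funcName
// through that import. It is purely syntactic: no type checking, no data
// flow, and no handling of dot imports or shadowed identifiers.
func CallsFunc(src, importPath, funcName string) (bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return false, err
	}

	// Find the local name the package is imported under.
	local := ""
	for _, imp := range file.Imports {
		p, err := strconv.Unquote(imp.Path.Value)
		if err != nil || p != importPath {
			continue
		}
		if imp.Name != nil {
			local = imp.Name.Name // explicit alias
		} else {
			local = path.Base(p) // assumption: package name == last path element
		}
	}
	if local == "" {
		return false, nil // never imported
	}

	// Look for a call of the form local.funcName(...).
	found := false
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		if pkg, ok := sel.X.(*ast.Ident); ok && pkg.Name == local && sel.Sel.Name == funcName {
			found = true
		}
		return true
	})
	return found, nil
}

func main() {
	src := `package p
import "example.com/vuln/yaml"
func load(b []byte) { yaml.Unmarshal(b, nil) }
`
	hit, err := CallsFunc(src, "example.com/vuln/yaml", "Unmarshal")
	fmt.Println(hit, err)
}
```

This only answers "does the call appear anywhere in this file", not "can it be reached from main", which is the gap the comment is pointing at.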
(We have a big Merkle tree that represents the dependency tree and that's how we are able to make the CI/CD check take only a few seconds because we can pre-compute.)
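For anyone unfamiliar with the idea, a minimal sketch of a Merkle-style hash over a dependency tree might look like the following. The `Dep` type, `MerkleHash`, and the cache are all hypothetical names, not taken from the repo mentioned above; the point is only that identical subtrees produce identical keys, so their scan results can be computed once and reused.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Dep is a hypothetical node in a dependency tree: a package at a version
// plus its direct dependencies.
type Dep struct {
	Name, Version string
	Children      []*Dep
}

// MerkleHash returns a digest covering the node and its whole subtree.
// Child digests are sorted so that sibling order does not change the result.
func MerkleHash(d *Dep) string {
	childHashes := make([]string, 0, len(d.Children))
	for _, c := range d.Children {
		childHashes = append(childHashes, MerkleHash(c))
	}
	sort.Strings(childHashes)

	h := sha256.New()
	fmt.Fprintf(h, "%s@%s\n", d.Name, d.Version)
	for _, ch := range childHashes {
		fmt.Fprintf(h, "%s\n", ch)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Hypothetical cache: subtree hash -> precomputed scan result.
	cache := map[string]string{}

	tree := &Dep{Name: "app", Version: "1.0.0", Children: []*Dep{
		{Name: "left-pad", Version: "1.3.0"},
	}}
	key := MerkleHash(tree)
	if _, ok := cache[key]; !ok {
		cache[key] = "scanned" // the expensive analysis would run here, once
	}
	fmt.Println(key[:12], cache[key])
}
```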
Anyway, if you have a second to help, we have a GitHub App[2] that you can install to test this out + help us find bugs. It's best at NPM now but we have basic support for other languages (no dep tree analysis yet).
There are so many edge cases with the ways that repos are set up, so just having more scans coming in helps a ton. (Well, it breaks stuff, but we already determined that rage sustains me.)
Oh my goodness I hate that so much. Every time I have to explain that go.sum lists every compatible version, not the version baked in.
But this is even an improvement over any other language I've seen. All just flag CVEs in dependent libraries when 99% of the time it's like "to be vulnerable you have to do [really stupid thing]".
Let's hope that vulnerability scanning vendors adopt using this. For my own stuff or minor work things, it's great. When I fall under the specter of officialness, I'll still get popped by the Enterprise Security Scanning Standard Tool.
Prior to 1.17, go.mod was not a complete representation of the dep graph. More broadly, the problem is that programmers expect dependency management tooling to have a lock file, see go.sum, and assume it's a lock file. This isn't a problem with those programmers, it's a problem with Go modules.
> This isn't a problem with those programmers, it's a problem with Go modules.
That seems like a bit of a jump! Maybe there's a better way of building dependency management tooling that doesn't use lock files; it seems strange to tie yourself to this approach just because it's how other tools work.
It isn't in the traditional sense. There isn't anything in its specification that guarantees the things that most people expect lock files to guarantee.
Good to see Govulncheck doing a vulnerable methods analysis for surfacing only the relevant issues. Many app sec vendors do it now for languages like Java and .NET. I originally created the vulnerable methods analysis back in 2015 - https://www.veracode.com/blog/managing-appsec/vulnerable-met... The same idea has now been implemented by WhiteSource (Mend), Snyk, etc.
This seems like a lot of work to identify the vulnerable functions that are called transitively. Could this work be reused to perform tree-shaking, so that Go only compiles the code you actually need? (Or, does Go already do this at compilation?)
govulncheck builds a static approximation of the call graph using Variable Type Analysis (VTA), which dynamically tabulates the set of interface method calls and the set of concrete types that each interface value may hold, to discover the set of functions potentially reachable from main. (VTA is a refinement of Rapid Type Analysis (RTA), which holds only one set of concrete types for all interface values in the whole program.) The result should be more precise than the linker.
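Once a call graph exists, the "reachable from main" step is a plain graph traversal. The sketch below shows only that last step over a hand-written graph; it is not VTA or RTA, and the function names in the graph are invented for illustration.

```go
package main

import "fmt"

// CallGraph maps a function name to the functions it may call.
type CallGraph map[string][]string

// Reachable returns the set of functions reachable from root,
// using a breadth-first traversal.
func Reachable(g CallGraph, root string) map[string]bool {
	seen := map[string]bool{root: true}
	queue := []string{root}
	for len(queue) > 0 {
		fn := queue[0]
		queue = queue[1:]
		for _, callee := range g[fn] {
			if !seen[callee] {
				seen[callee] = true
				queue = append(queue, callee)
			}
		}
	}
	return seen
}

func main() {
	// Hypothetical program: lib.UnsafeDecode is vulnerable but nothing
	// reachable from main.main ever calls it.
	g := CallGraph{
		"main.main":        {"app.Handle"},
		"app.Handle":       {"lib.Parse"},
		"lib.Parse":        {},
		"lib.UnsafeDecode": {"lib.Parse"},
	}
	live := Reachable(g, "main.main")
	for _, v := range []string{"lib.UnsafeDecode", "lib.Parse"} {
		fmt.Printf("%s reachable=%v\n", v, live[v])
	}
}
```

The hard part, and the part VTA addresses, is building an accurate graph in the presence of interface method calls; the traversal itself is cheap.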
Go is a compiled language that uses a linker, which means that only the functions the linker considers reachable end up in the final binary. So yes, Go does "tree-shaking".
// The third case is handled by looking to see if any of:
// - reflect.Value.Method or MethodByName is reachable
// - reflect.Type.Method or MethodByName is called (through the
// REFLECTMETHOD attribute marked by the compiler).
//
// If any of these happen, all bets are off and all exported methods
// of reachable types are marked reachable.
Basically, if you do certain kinds of reflection, then more code is theoretically reachable and will be included in your binary. In practice, you end up with a large binary in anything that calls into autogenerated APIs.
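A small sketch of the kind of reflection that comment is about: the method name is only known at run time, so a static tool cannot tell which exported method will be called. The `API` type and `CallByName` helper are invented for illustration, and the helper assumes a zero-argument method that returns a single string.

```go
package main

import (
	"fmt"
	"os"
	"reflect"
)

// API is a hypothetical type with two exported methods.
type API struct{}

func (API) Ping() string   { return "pong" }
func (API) Delete() string { return "deleted" }

// CallByName invokes a zero-argument, string-returning method chosen at
// run time. It returns false if no exported method has that name.
func CallByName(v any, name string) (string, bool) {
	m := reflect.ValueOf(v).MethodByName(name)
	if !m.IsValid() {
		return "", false
	}
	out := m.Call(nil)
	return out[0].String(), true
}

func main() {
	// The name comes from the command line, so nothing in the source
	// says whether Ping or Delete is the one that gets used.
	name := "Ping"
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	res, ok := CallByName(API{}, name)
	fmt.Println(res, ok)
}
```

Because `reflect.Value.MethodByName` is reachable here, the linker comment quoted above applies: exported methods of reachable types have to be kept.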
That is a useful clarification. It seems to explain why using fmt, Go's standard library formatting and printing package, seems to pull in so much. Surely it is performing a fair amount of reflection under the hood.
I should have been more clear that I was referring to external dependencies that are included. I did some research and there’s so much extra stuff included in the go.sum. You’re right that would result in a lot of false positives.
> Just because you don’t link to or call it… if you include vulnerable code you’re still vulnerable.
That makes no sense. If you never use the vulnerable code, how would an attacker access the vulnerability through your system? Especially with DCE, the vulnerability might not even be in the binary.
The only argument I can think of here is with NPM post install hooks. But for Golang, unless you import the code, I'm pretty certain that there is no way to exploit or backdoor an app.
Not necessarily. RCE may start with being able to run ‘something’ that has restrictions such as “can’t inject code that has zero bytes in it” or “injected code can only be X bytes long”.
In such cases, having another vulnerability available may be the easiest way to get rid of those restrictions.
Also, the second vulnerability may be complementary. For example, the first may get you onto the machine, but not out of the sandbox, while the second won’t get you on the machine, but will get you out of the sandbox.
In this case, I think the Go linker won't include the never-called vulnerable function in the executable. (It would only do so if the vulnerability checker were smarter than the linker at detecting never-called code. That's theoretically possible, but highly unlikely.)
Every "generic" vulnerability scanner still needs some language specific knowledge for how to determine what is an included dependency. The better you want to suppress false positives, the more in-depth knowledge the tool will need about each language/runtime.
The end result of this is that all the existing generic scanners just use least common denominator heuristics for determining vulnerable dependencies (i.e. just look at the lock file).
For large teams, this can be a huge waste of time patching vulnerabilities that don't actually apply to your code just because the scanning tool is too stupid to know better.
I'll take Govulncheck (and similar tools for other languages) any day over the mediocre generic tools.
I guess it sounded like you meant a general solution that wasn't language specific. I'm still not sure if there's an abstraction you could use to mark language constructs as vulnerable - the semantics between languages are just so varied.
> Anyway we need the language specific ones first.
No, we have had language-agnostic vuln management for decades, and a good tool could fall back to the traditional method when a language-specific backend is not available.
Until we can figure out how to translate semantics across languages, we have to reinvent almost everything for each language. We can abstract out some things, but not others. As far as "a good use of our time", the alternative is "no vulnerability scanning" or "package level vulnerability scanning", which probably wastes more time or exposes more risk for most organizations.
This is a great feature! The company I work for has spent a lot of money trying to use Snyk for this purpose, but it really sucks and chokes on most of our Go repos' go.mod files, meaning some of our most important repositories are invisible to vulnerability scanning.
We don't get the static code analysis feature from Snyk (where vulnerabilities are raised only when they affect us), because this is an optional paid extra. Now we get it for free!
Snyk really need to up their game with their lacklustre Go support to compete with free.
Note: I run a vuln scanning company in this space (our GitHub[0]), so I can add some context to this.
GitHub has this for Dependabot via CodeQL which is a part of their "Advanced Security" package. ($$$, ie "contact us", but I've heard ~$1-1.5k per dev per year roughly)
Other big ones are Sonatype, FOSSA, and Snyk (which OP mentioned). There are some smaller vendors too which I can add if anybody is curious.
GitHub CodeQL is by far the best and their tech comes from their acquisition of Semmle/LGTM a few years ago. When I was at Uber, we used Semmle to augment the efforts of the bug bounty and to "scale" our AppSec team to keep up with an ever growing engineering team.
The lack of flexibility with CodeQL and the other proprietary scanning tools is actually the reason why, when I decided to start a security company, I decided to center the company around building publicly on GitHub.
It's harder to make money, at least initially, but it's also the "right" choice to actually push this industry forward. (The thought on how to make money is a hosted SaaS, of course.)
Anyway, I'm happy to answer any questions around this stuff!
This is super helpful for me as a Dapr maintainer (we have a ton of third party integrations we compile into our binary). As others mentioned - other tools can generate a lot of noise. Found and upgraded a vulnerable dependency and then quickly added this check to our CI/CD workflow.
Has anyone tested this? I'm testing it on a project with known vulnerabilities and am getting positive matches when testing against a Go binary (govulncheck binary), but not when run within a directory using: govulncheck .
Modules, generics, and now Govulncheck are all enterprise-friendly features. I hope this helps Golang gain more 'market share': at present it sits at around 1% in the TIOBE programming language ranking (Python 15%; C, C++, and Java each about 10%), so moving up remains an uphill battle.
I spent quite some time learning Golang while using C++ at my job. Now I've pressed the pause button on Golang and decided to focus more on modern C++ instead. Yeah, its tooling is lacking compared to Golang's, but it's just so widely used and so good on performance. I'll revisit Golang later.
It checks against a list of known vulnerabilities. If you search the go.mod or .sum files on GitHub for them, you have a more powerful route, I think.
Even though a search based on go.mod or .sum files should definitely result in more hits, the new vulnerability management should be way more efficient: it is already filtered for relevant hits, so it should carry more signal and less noise, since it only returns vulns that are used at least transitively. Am I missing something here?
The vulnerability list is open source and you can get a list of public projects that use each dependency from pkg.go.dev - if you really want a list, there are at least a few ways that are more efficient than scanning GitHub! (As you pointed out above)
Hmm, thought the problem with that was poor fidelity.. from the top comment:
> go.sum… ends up reporting a lot of vulnerabilities that you don't actually care about because your code isn't actually linking to the optional transitive dependency that has the vulnerability in it.
So shall we have a chat about go.sum files yet? Several scanners seem to choke on the fact that go.sum includes several versions of each module, some of which are vulnerable, but basically never actually packaged. Am I missing a reason why Go did not go with a lockfile like nearly every other modern language?
Even ignoring proxies, it ensures that you have the same code that you started developing with. If you're depending on a tagged Git repository, for example, someone can just force push the tags and change what commit go.mod is actually pointing at. With go.sum, you're guaranteed to be notified that that happened.
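As a rough illustration of that "record once, complain on change" behaviour, here is a simplified sketch. It is not how go.sum is actually implemented: the real file stores hashes computed over a module's file tree and go.mod, whereas this version hashes a flat byte slice, and `SumDB`, `Verify`, and the module name are invented for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// SumDB is a stand-in for go.sum: "module@version" -> recorded content hash.
type SumDB map[string]string

// hashOf returns a base64-encoded SHA-256 digest of the content.
func hashOf(content []byte) string {
	sum := sha256.Sum256(content)
	return base64.StdEncoding.EncodeToString(sum[:])
}

// Verify records the hash the first time a module version is seen and
// returns an error if the same version later arrives with other content.
func (db SumDB) Verify(modVersion string, content []byte) error {
	got := hashOf(content)
	want, ok := db[modVersion]
	if !ok {
		db[modVersion] = got
		return nil
	}
	if got != want {
		return fmt.Errorf("%s: checksum mismatch", modVersion)
	}
	return nil
}

func main() {
	db := SumDB{}
	// First download: the hash is recorded.
	fmt.Println(db.Verify("example.com/lib@v1.0.0", []byte("original code")))
	// Someone force-pushes the tag: same version, different content.
	fmt.Println(db.Verify("example.com/lib@v1.0.0", []byte("tampered code")))
}
```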
go.mod will exactly tell you what version you have, always. *
* that does not apply if your code is a library; then, other dependencies of the "main" app can "update" your dependency. Which sometimes does break things.
However, the go.mod of the main app will deterministically tell what version is used, always.
I never did understand what packages go to go.sum though and what is the logic there. But it is not that important.
This is only true for go 1.17 and above. Prior to that, go.mod wouldn't necessarily list all transitive dependencies. And I don't think this property is actually guaranteed by the relevant specification. The only reliable way to get a "lock file" is to run go list. Unfortunately.
go.mod lists minimum versions. Minimal Version Selection may increase the versions used as required by other packages in the build. go.mod isn't a lock file.
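A toy version of that selection rule, to show why a stated minimum is not necessarily the version that gets built: walk the requirement graph from the main module and, for each module path, keep the highest minimum anyone asked for. This is a simplified sketch, not the go command's implementation; `ModVer`, `Reqs`, and `BuildList` are invented names, and the version comparison only understands plain `vMAJOR.MINOR.PATCH`.

```go
package main

import "fmt"

// ModVer identifies a module at a specific version.
type ModVer struct{ Path, Version string }

// Reqs maps a module version to the minimum versions it requires.
type Reqs map[ModVer][]ModVer

// less compares plain vMAJOR.MINOR.PATCH versions numerically.
// Pre-release and build metadata are ignored in this sketch.
func less(a, b string) bool {
	var x, y [3]int
	fmt.Sscanf(a, "v%d.%d.%d", &x[0], &x[1], &x[2])
	fmt.Sscanf(b, "v%d.%d.%d", &y[0], &y[1], &y[2])
	for i := range x {
		if x[i] != y[i] {
			return x[i] < y[i]
		}
	}
	return false
}

// BuildList walks every requirement reachable from root and returns,
// per module path, the highest minimum version requested.
func BuildList(reqs Reqs, root ModVer) map[string]string {
	selected := map[string]string{}
	seen := map[ModVer]bool{root: true}
	stack := []ModVer{root}
	for len(stack) > 0 {
		mv := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, dep := range reqs[mv] {
			if cur, ok := selected[dep.Path]; !ok || less(cur, dep.Version) {
				selected[dep.Path] = dep.Version
			}
			if !seen[dep] {
				seen[dep] = true
				stack = append(stack, dep)
			}
		}
	}
	return selected
}

func main() {
	root := ModVer{"main", "v0.0.0"}
	reqs := Reqs{
		root:                  {{"A", "v1.1.0"}, {"B", "v1.0.0"}},
		ModVer{"B", "v1.0.0"}: {{"A", "v1.2.0"}}, // B needs a newer A
	}
	fmt.Println(BuildList(reqs, root)) // A ends up at v1.2.0, not v1.1.0
}
```

Note that both A v1.1.0 and A v1.2.0 show up in the graph while only one is selected, which is the same distinction the thread draws between "versions mentioned" and "version built".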
Thank you, Go team!