Vulnerability Management for Go (go.dev)
365 points by mfrw on Sept 6, 2022 | 76 comments



> Govulncheck analyzes your codebase and only surfaces vulnerabilities that actually affect you, based on which functions in your code are transitively calling vulnerable functions.

This is huge! Every existing vulnerability scanner that I've worked with has just looked at go.sum, which ends up reporting a lot of vulnerabilities that you don't actually care about because your code isn't actually linking to the optional transitive dependency that has the vulnerability in it.

Thank you, Go team!


This is really cool to see because this is the #1 problem with current tools (as you said). I call it "alert fatigue" in my head because it's meaningless when you have 100+ vulns to fix but they're 99% unexploitable.

I have a bit of a bone to pick with this space: I've been working on this problem for a few months now (link to repo[0] and blog[1]).

My background is Application Security and, as is often the case with devs, rage fuels me in my desire to fix this space. Well, Log4Shell helped too.

As another comment said, doing this in a language-agnostic way is a big PITA and we haven't fully built it yet. We are using Semgrep to do very basic static analysis (see if a vulnerable function is ever imported + called), but we're not doing fancy interprocedural taint analysis like CodeQL can.

(We have a big Merkle tree that represents the dependency tree and that's how we are able to make the CI/CD check take only a few seconds because we can pre-compute.)
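
To make that pre-compute idea concrete, here's a minimal sketch of the general approach (not their actual implementation; names and versions are made up): each node's hash covers its name@version plus its children's hashes, so an unchanged subtree keeps the same hash and its cached scan results can be reused instead of recomputed.

    package main

    import (
        "crypto/sha256"
        "encoding/hex"
        "fmt"
    )

    // Dep is one node in a dependency tree.
    type Dep struct {
        Name, Version string
        Children      []*Dep
    }

    // merkle hashes a dependency together with its whole subtree. If nothing
    // in the subtree changed, the hash is identical, so previously computed
    // scan results for it can be served from a cache.
    func merkle(d *Dep) string {
        h := sha256.New()
        h.Write([]byte(d.Name + "@" + d.Version))
        for _, c := range d.Children {
            h.Write([]byte(merkle(c)))
        }
        return hex.EncodeToString(h.Sum(nil))
    }

    func main() {
        root := &Dep{Name: "my-app", Version: "1.0.0", Children: []*Dep{
            {Name: "left-pad", Version: "1.3.0"},
        }}
        fmt.Println(merkle(root))
    }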

Anyway, if you have a second to help, we have a GitHub App[2] that you can install to test this out + help us find bugs. It's best at NPM now but we have basic support for other languages (no dep tree analysis yet).

There are so many edge cases with the ways that repos are set up, so just having more scans coming in helps a ton. (Well, it breaks stuff, but we already determined that rage sustains me.)

Thank you. *climbs off of soapbox*

0: https://github.com/lunasec-io/lunasec

1: https://www.lunasec.io/docs/blog/the-issue-with-vuln-scanner...

2: https://github.com/marketplace/lunatrace-by-lunasec


Oh my goodness I hate that so much. Every time I have to explain that go.sum lists every compatible version, not the version baked in.

But this is an improvement over anything I've seen in any other language. They all just flag CVEs in dependent libraries when 99% of the time it's like "to be vulnerable you have to do [really stupid thing]".

Let's hope that vulnerability scanning vendors adopt this. For my own stuff or minor work things, it's great. When I fall under the specter of officialness, I'll still get popped by the Enterprise Security Scanning Standard Tool.


Why would some people even care about or look at go.sum when go.mod is clean and self-explanatory?


Prior to 1.17, go.mod was not a complete representation of the dep graph. More broadly, the problem is that programmers expect dependency management tooling to have a lock file, see go.sum, and assume it's a lock file. This isn't a problem with those programmers, it's a problem with Go modules.


> This isn't a problem with those programmers, it's a problem with Go modules.

That seems like a bit of a jump! Maybe there's a better way of building dependency management tooling that doesn't use lock files, it seems strange to tie yourself to this approach just because it's how other tools work.


go.mod is the lock file.


It isn't in the traditional sense. There isn't anything in its specification that guarantees the things that most people expect lock files to guarantee.


Well, I do understand that the actual version used there is ambiguous.

Probably the best approach would be to take the newest version from go.sum. That's what some scanners are doing now.

Filed bugs with 2 vendors over this though.


The newest from go.sum isn't reliably the version used in the dep graph. The only way to get that information reliably is via go list. Unfortunately.


Oh, you are totally right. I read that in the GitLab issue tracking their version of the problem and totally forgot that’s where they landed.


Also don't forget that more than one version of a dependency might be compiled into your application!


Only one major version. And Go considers different major versions to be different dependencies.


Not true for Go.


For the curious it's:

    go list -m all


Good to see Govulncheck doing a vulnerable methods analysis for surfacing only the relevant issues. Many app sec vendors do it now for languages like Java and .NET. I originally created the vulnerable methods analysis back in 2015 - https://www.veracode.com/blog/managing-appsec/vulnerable-met... the same idea has been now implemented by WhiteSource (Mend), Snyk etc.


The great thing is that when it becomes part of the toolchain, it will also be available for the latest version of Go as it is released.

Right now Veracode is stuck at Go 1.17 support; maybe this will also make it easier for such vendors to stay up to date.


This seems like a lot of work to identify the vulnerable functions that are called transitively. Could this work be reused to perform tree-shaking, so that Go only compiles the code you actually need? (Or, does Go already do this at compilation?)


govulncheck builds a static approximation of the call graph using Variable Type Analysis (VTA), which dynamically tabulates the set of interface method calls and the set of concrete types that each interface value may hold, to discover the set of functions potentially reachable from main. (VTA is a refinement of Rapid Type Analysis (RTA), which holds only one set of concrete types for all interface values in the whole program.) The result should be more precise than the linker.

See:

- https://pkg.go.dev/golang.org/x/tools/go/callgraph/vta

- https://pkg.go.dev/golang.org/x/tools/go/callgraph/rta
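
For the curious, here's a rough sketch of what building such a call graph with the x/tools packages linked above can look like (not govulncheck's actual code, and API details may differ between x/tools versions): CHA gives a coarse initial call graph, which VTA then refines.

    package main

    import (
        "fmt"

        "golang.org/x/tools/go/callgraph"
        "golang.org/x/tools/go/callgraph/cha"
        "golang.org/x/tools/go/callgraph/vta"
        "golang.org/x/tools/go/packages"
        "golang.org/x/tools/go/ssa"
        "golang.org/x/tools/go/ssa/ssautil"
    )

    func main() {
        // Load and type-check the target program.
        cfg := &packages.Config{Mode: packages.LoadAllSyntax}
        pkgs, err := packages.Load(cfg, "./...")
        if err != nil {
            panic(err)
        }

        // Build SSA form for all packages.
        prog, _ := ssautil.AllPackages(pkgs, ssa.InstantiateGenerics)
        prog.Build()

        // Use CHA as the coarse initial call graph, then refine with VTA.
        cg := vta.CallGraph(ssautil.AllFunctions(prog), cha.CallGraph(prog))
        cg.DeleteSyntheticNodes()

        // Walk the edges; a vuln checker would look for paths from main
        // to known-vulnerable functions.
        callgraph.GraphVisitEdges(cg, func(e *callgraph.Edge) error {
            fmt.Printf("%s -> %s\n", e.Caller.Func, e.Callee.Func)
            return nil
        })
    }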


Go is a compiled language that uses a linker, which means that only the functions that are called end up in the final binary. So yes, Go does "tree-shaking".


It's important to read the caveats: https://github.com/golang/go/blob/master/src/cmd/link/intern..., the most important of which is:

  // The third case is handled by looking to see if any of:
  //   - reflect.Value.Method or MethodByName is reachable
  //   - reflect.Type.Method or MethodByName is called (through the
  //     REFLECTMETHOD attribute marked by the compiler).
  //
  // If any of these happen, all bets are off and all exported methods
  // of reachable types are marked reachable.
Basically, if you do certain kinds of reflection, then more code is theoretically reachable and will be included in your binary. In practice, you end up with a large binary in anything that calls into autogenerated APIs.
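
A tiny toy example of that caveat (mine, not from the linker docs): nothing calls Hello directly, but because MethodByName is invoked with a non-constant name, the linker has to assume any exported method of a reachable type might be needed, so Hello stays in the binary.

    package main

    import (
        "os"
        "reflect"
    )

    type Greeter struct{}

    // Never referenced directly, yet kept alive by the dynamic lookup below.
    func (Greeter) Hello() { println("hello") }

    func main() {
        name := os.Args[1] // method chosen at run time
        reflect.ValueOf(Greeter{}).MethodByName(name).Call(nil)
    }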


That is a useful clarification. It seems to explain why using fmt, Go's standard library formatting and printing package, seems to pull in so much. Surely it is performing a fair amount of reflection under the hood.


The fmt package should not put the compiler into conservative mode. If it does, file a bug report.


Autogenerated APIs aren't as prevalent as one may think.


[flagged]


What? How does a piece of code that isn't compiled into your binary make you vulnerable?


I should have been clearer that I was referring to external dependencies that are included. I did some research and there's so much extra stuff included in go.sum. You're right; that would result in a lot of false positives.


> Just because you don’t link to or call it… if you include vulnerable code you’re still vulnerable.

That makes no sense. If you never use the vulnerable code, how would an attacker access the vulnerability through your system? Especially with DCE, the vulnerability might not even be in the binary.


No, this is not at all a valid argument.


The only argument I can think of here is with NPM post install hooks. But for Golang, unless you import the code, I'm pretty certain that there is no way to exploit or backdoor an app.

Or is there something I'm missing?


I guess I just found an interview question. Thanks!


Good way to weed out good developers!


No, I meant anyone who believes this is weeded out.


Well, if RCE is possible, why does it matter whether you have uncalled vulnerable functions or not? The attacker can execute anything they want.


Not necessarily. RCE may start with being able to run ‘something’ that has restrictions such as “can’t inject code that has zero bytes in it” or “injected code can only be X bytes long”.

In such cases, having another vulnerability available may be the easiest way to get rid of those restrictions.

Also, the second vulnerability may be complementary. For example, the first may get you onto the machine, but not out of the sandbox, while the second won’t get you on the machine, but will get you out of the sandbox.

In this case, I think the Go linker won't include the never-called vulnerable function in the executable. (It only would if the vulnerability checker were smarter than the linker at detecting never-called code. That's theoretically possible, but highly unlikely.)


If only it were language-independent, I would consider it.

Reinventing supply chain and vulnerability management for each language... not a good use of our time.


Strong disagree.

Every "generic" vulnerability scanner still needs some language specific knowledge for how to determine what is an included dependency. The better you want to suppress false positives, the more in-depth knowledge the tool will need about each language/runtime.

The end result of this is that all the existing generic scanners just use least common denominator heuristics for determining vulnerable dependencies (i.e. just look at the lock file).

For large teams, this can be a huge waste of time patching vulnerabilities that don't actually apply to your code just because the scanning tool is too stupid to know better.

I'll take Govulncheck (and similar tools for other languages) any day over the mediocre generic tools.


> The end result of this is that all the existing generic scanners just use least common denominator heuristics

Wrong. You can have specialized backends and language-agnostic everything-else.


It doesn't seem possible to implement that in a language agnostic way. How do you propose doing such a thing?


Very simple: implement language-specific static analysis backends and a general frontend and vuln management.

Just like every Linux distribution distributes packages and manages security in a language-agnostic way. Nothing new.

Amazing how people here dismissed my point by downvoting and providing no reasoning.
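
As a sketch of what that split could look like (hypothetical interface, not any existing tool's API): language-specific backends report findings in one shared shape, and a language-agnostic frontend handles filtering, dedup, and reporting.

    package main

    import "fmt"

    // Finding is the common shape every language backend reports in.
    type Finding struct {
        Advisory string // e.g. an OSV or CVE id
        Package  string
        Symbol   string // vulnerable function, if the backend can tell
        Reached  bool   // can our code reach it, per static analysis?
    }

    // Backend is implemented once per language/ecosystem.
    type Backend interface {
        Language() string
        Scan(projectDir string) ([]Finding, error)
    }

    // report is the language-agnostic frontend: filtering, policy, output.
    func report(dir string, backends []Backend) {
        for _, b := range backends {
            findings, err := b.Scan(dir)
            if err != nil {
                fmt.Printf("%s: scan failed: %v\n", b.Language(), err)
                continue
            }
            for _, f := range findings {
                if f.Reached { // only surface what the code can actually hit
                    fmt.Printf("[%s] %s in %s (%s)\n", b.Language(), f.Advisory, f.Package, f.Symbol)
                }
            }
        }
    }

    func main() {
        report(".", nil) // language-specific backends would be registered here
    }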


> language-specific static backends

I guess it sounded like you meant a general solution that wasn't language-specific. I'm still not sure there's an abstraction you could use to mark language-specific aspects as vulnerable; the semantics between languages are just so varied.

Anyway we need the language specific ones first.


> Anyway we need the language specific ones first.

No, we have had language-agnostic vuln management for decades, and a good tool could fall back to the traditional method when a language-specific backend is not available.


I mean language-agnostic vuln management that's capable of marking a particular subset of functionality as vulnerable, like this does.


Let's train a neural network to take source code in an arbitrary language as input, and produce a call graph as output. What could go wrong?


Side note: how do you make an NN output a graph topology? I'm having a hard time imagining how to make a matrix represent that.


Graph Neural Networks! https://distill.pub/2021/gnn-intro/

In a nutshell, you perform NN operations on the nodes and edges of the graph and then send updates across the graph
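
To make "send updates across the graph" concrete, here's a toy message-passing step (just an illustration, not a real GNN): each node's new feature vector mixes its own features with the sum of its neighbours'. A real GNN would learn the mixing weights instead of fixing them.

    package main

    import "fmt"

    type graph struct {
        feats [][]float64 // one feature vector per node
        adj   [][]int     // adjacency list
    }

    // step performs one round of message passing with fixed weights.
    func step(g graph, wSelf, wNeigh float64) [][]float64 {
        out := make([][]float64, len(g.feats))
        for i, f := range g.feats {
            agg := make([]float64, len(f))
            for _, j := range g.adj[i] {
                for k, v := range g.feats[j] {
                    agg[k] += v
                }
            }
            out[i] = make([]float64, len(f))
            for k := range f {
                out[i][k] = wSelf*f[k] + wNeigh*agg[k]
            }
        }
        return out
    }

    func main() {
        g := graph{
            feats: [][]float64{{1, 0}, {0, 1}, {1, 1}},
            adj:   [][]int{{1}, {0, 2}, {1}},
        }
        fmt.Println(step(g, 0.5, 0.5)) // one round of updates across the graph
    }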


And then clamp the output to generate vulnerabilities? Could be worth a shot.


Until we can figure out how to translate semantics across languages, we have to reinvent almost everything for each language. We can abstract out some things, but not others. As far as "a good use of our time", the alternative is "no vulnerability scanning" or "package level vulnerability scanning" which probably waste more time or expose more risk for most organizations.


Is it even technically possible to do (edit: useful) vulnerability analysis in a pure black box configuration?


Sure: it's called "fuzz testing".


This is a great feature! The company I work for has spent a lot of money trying to use Snyk for this purpose, but it really sucks and chokes on most of our Go repos' go.mod files, meaning some of our most important repositories are blind spots for vulnerability scanning.

We don't get the static code analysis feature from Snyk (where vulnerabilities are raised only when they affect us), because this is an optional paid extra. Now we get it for free!

Snyk really need to up their game with their lacklustre Go support to compete with free.


Note: I run a vuln scanning company in this space (our GitHub[0]), so I can add some context to this.

GitHub has this for Dependabot via CodeQL, which is part of their "Advanced Security" package. ($$$, i.e. "contact us", but I've heard roughly $1-1.5k per dev per year.)

Other big ones are Sonatype, FOSSA, and Snyk (which OP mentioned). There are some smaller vendors too, which I can add if anybody is curious.

GitHub CodeQL is by far the best and their tech comes from their acquisition of Semmle/LGTM a few years ago. When I was at Uber, we used Semmle to augment the efforts of the bug bounty and to "scale" our AppSec team to keep up with an ever growing engineering team.

The lack of flexibility with CodeQL and the other proprietary scanning tools is actually the reason why, when I decided to start a security company, I decided to center the company around building publicly on GitHub.

It's harder to make money, at least initially, but it's also the "right" choice to actually push this industry forward. (The thought on how to make money is a hosted SaaS, of course.)

Anyway, I'm happy to answer any questions around this stuff!

0: https://github.com/lunasec-io/lunasec


This is amazing! And only raises relevant issues so we avoid alert fatigue and can actually solve whatever real problems are flagged.


This is super helpful for me as a Dapr maintainer (we have a ton of third party integrations we compile into our binary). As others mentioned - other tools can generate a lot of noise. Found and upgraded a vulnerable dependency and then quickly added this check to our CI/CD workflow.

https://github.com/dapr/components-contrib/pull/2054


Has anyone tested this? I'm testing it on a project with known vulnerabilities and am getting positive matches when testing against a Go binary (govulncheck binary), but not when run within a directory using: govulncheck .


The pattern to check the package and all subpackages for pretty much all Go tools is "./...". So:

  govulncheck ./...


Modules, generics, and now govulncheck are all enterprise-welcome features. I hope this can help Golang gain more 'market share'; at present it sits at around 1% (Python 15%, C/C++/Java each about 10%) in the TIOBE programming language ranking, so it remains an uphill battle to move up.

I spent quite some time learning Golang while using C++ at my job. For now I've pressed the pause button on Golang and decided to focus more on modern C++ instead. Yeah, its tooling is lacking compared to Golang's, but it's just so widely used and so good on performance. I will revisit Golang later.


Please, please don't use tiobe for programming language comparisons. https://blog.nindalf.com/posts/stop-citing-tiobe/, etc.


Well, Google does give multiple rankings; the exact numbers differ, but the basic order remains. Some top hits:

    https://spectrum.ieee.org/top-programming-languages/
    https://www.cleveroad.com/blog/programming-languages-ranking/
    https://distantjob.com/blog/programming-languagesrank/
    https://statisticstimes.com/tech/top-computer-languages.php


2 of these refer to TIOBE and another is a 404

The IEEE Spectrum one seems independent


It's a great idea in theory, but in practice, unless all third-party scanners adopt the same process, I suspect it won't be relevant.

It also places all of your trust in the accuracy of the data submitted to govulncheck.


What's the difference between this and https://github.com/securego/gosec?


I hope they will release an RSS feed for the database.


Attackers now scrambling onto Github to run this against open projects


It checks against a list of known vulnerabilities. If you search the go.mod or go.sum files on GitHub for them, you have a more powerful route, I think.


Even though a search based on go.mod or go.sum files should definitely result in more hits, the new vulnerability management should be way more efficient: it is already filtered for relevance and should thus contain more signal vs. noise, as it only returns vulns that are at least transitively used. Am I missing something here?


The vulnerability list is open source and you can get a list of public projects that use each dependency from pkg.go.dev - if you really want a list, there are at least a few ways that are more efficient than scanning GitHub! (As you pointed out above)


Hmm, I thought the problem with that was poor fidelity. From the top comment:

> go.sum… ends up reporting a lot of vulnerabilities that you don't actually care about because your code isn't actually linking to the optional transitive dependency that has the vulnerability in it.

Not a Go dev myself though.


Neither of those files reliably represents the actual dep graph of your program.


Thanks. Looks good!


So shall we have a chat about go.sum files yet? Several scanners seem to choke on the fact that go.sum includes several versions of each module, some of which are vulnerable but basically never actually packaged. Am I missing a reason why Go did not go with a lockfile like nearly every other modern language?


It's not a lockfile. It contains hashes of packages so that you could fetch them from any proxy and not worry about proxies doing funny business.
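
For anyone who hasn't looked inside one: each module version gets two entries in go.sum, a hash over the module's file tree and a hash over just its go.mod file (module path and hashes below are made-up placeholders):

    github.com/example/somedep v1.2.3 h1:PLACEHOLDERhashOfModuleTree=
    github.com/example/somedep v1.2.3/go.mod h1:PLACEHOLDERhashOfGoModFile=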


Even ignoring proxies, it ensures that you have the same code that you started developing with. If you're depending on a tagged Git repository, for example, someone can just force push the tags and change what commit go.mod is actually pointing at. With go.sum, you're guaranteed to be notified that that happened.


Great point. I am glad they took this into consideration.


go.mod IS a lockfile.

go.mod will tell you exactly what version you have, always. *

* That does not apply if your code is a library; then another dependency of the "main" app can "update" your dependency, which sometimes does break things.

However, the go.mod of the main app will deterministically tell what version is used, always.

I never did understand which packages go into go.sum, though, or what the logic there is. But it is not that important.


This is only true for go 1.17 and above. Prior to that, go.mod wouldn't necessarily list all transitive dependencies. And I don't think this property is actually guaranteed by the relevant specification. The only reliable way to get a "lock file" is to run go list. Unfortunately.


go.mod lists minimum versions. Minimum Version Selection may increase the versions used as required by other packages in the build. go.mod isn't a lock file.
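
A small worked example of that (module paths and versions are hypothetical): MVS takes the highest minimum stated anywhere in the module graph, not the version written in your own go.mod.

    // your module's go.mod
    require (
        example.com/b v1.2.0
        example.com/c v1.1.0
    )

    // example.com/c's go.mod
    require example.com/b v1.4.0

    // MVS selects example.com/b v1.4.0 for the build (the highest stated
    // minimum), even though your own go.mod says v1.2.0.
    // `go list -m example.com/b` reports the version actually selected.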





