
> Govulncheck analyzes your codebase and only surfaces vulnerabilities that actually affect you, based on which functions in your code are transitively calling vulnerable functions.

This is huge! Every existing vulnerability scanner that I've worked with has just looked at go.sum, which ends up reporting a lot of vulnerabilities that you don't actually care about, because your code isn't actually linking to the optional transitive dependency that has the vulnerability in it.

Thank you, Go team!




This is really cool to see because this is the #1 problem with current tools (as you said). I call it "alert fatigue" in my head because it's meaningless when you have 100+ vulns to fix but they're 99% unexploitable.

I have a bit of a bone to pick with this space: I've been working on this problem for a few months now (link to repo[0] and blog[1]).

My background is Application Security and, as is often the case with devs, rage fuels me in my desire to fix this space. Well, Log4Shell helped too.

As another comment said, doing this in a language-agnostic way is a big PITA and we haven't fully built it yet. We are using Semgrep to do very basic static analysis (see if the vulnerable function is ever imported + called). But we're not doing fancy interprocedural taint analysis like CodeQL can.

(We have a big Merkle tree that represents the dependency tree and that's how we are able to make the CI/CD check take only a few seconds because we can pre-compute.)

Anyway, if you have a second to help, we have a GitHub App[2] that you can install to test this out + help us find bugs. It's best at NPM now but we have basic support for other languages (no dep tree analysis yet).

There are so many edge cases with the ways that repos are set up, so just having more scans coming in helps a ton. (Well, it breaks stuff, but we already determined that rage sustains me.)

Thank you. *climbs off of soapbox*

0: https://github.com/lunasec-io/lunasec

1: https://www.lunasec.io/docs/blog/the-issue-with-vuln-scanner...

2: https://github.com/marketplace/lunatrace-by-lunasec


Oh my goodness I hate that so much. Every time I have to explain that go.sum lists every compatible version, not the version baked in.

But this is even an improvement over any other language I've seen. They all just flag CVEs in dependent libraries, when 99% of the time it's like "to be vulnerable you have to do [really stupid thing]".

Let's hope that vulnerability scanning vendors adopt this. For my own stuff or minor work things, it's great. When I fall under the specter of officialness, I'll still get popped by the Enterprise Security Scanning Standard Tool.


Why would anyone even care about or look at go.sum when go.mod is clean and self-explanatory?


Prior to 1.17, go.mod was not a complete representation of the dep graph. More broadly, the problem is that programmers expect dependency management tooling to have a lock file, see go.sum, and assume it's a lock file. This isn't a problem with those programmers, it's a problem with Go modules.


> This isn't a problem with those programmers, it's a problem with Go modules.

That seems like a bit of a jump! Maybe there's a better way of building dependency management tooling that doesn't use lock files, it seems strange to tie yourself to this approach just because it's how other tools work.


go.mod is the lock file.


It isn't in the traditional sense. There isn't anything in its specification that guarantees the things that most people expect lock files to guarantee.


Well, I do understand that the actual version used there is ambiguous.

Probably the best would be to take the newest from go.sum. That’s what some scanners are doing now.

Filed bugs with 2 vendors over this though.


The newest from go.sum isn't reliably the version used in the dep graph. The only way to get that information reliably is via go list. Unfortunately.


Oh, you are totally right. I read that in the GitLab issue tracking their version of the problem and totally forgot that’s where they landed.


Also don't forget that more than one version of a dependency might be compiled into your application!


Only one major version. And Go considers different major versions to be different dependencies.
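For illustration, this is what that looks like in a go.mod: v1 and v2 of the same library can both appear in one build because the /v2 suffix makes them distinct module paths (the module names here are made up):

```go
module example.com/app

go 1.18

require (
	// Same library, but Go treats these as two separate dependencies:
	github.com/some/lib v1.9.0
	github.com/some/lib/v2 v2.1.0
)
```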


not true for go


For the curious it's:

    go list -m all


Good to see Govulncheck doing a vulnerable methods analysis for surfacing only the relevant issues. Many app sec vendors do it now for languages like Java and .NET. I originally created the vulnerable methods analysis back in 2015 - https://www.veracode.com/blog/managing-appsec/vulnerable-met... the same idea has been now implemented by WhiteSource (Mend), Snyk etc.


The great thing is that when it becomes part of the toolchain it will also be available for the latest version of go as it is released.

Right now Veracode is stuck at Go 1.17 support - maybe this will also help such vendors stay up to date more easily.


This seems like a lot of work to identify the vulnerable functions that are called transitively. Could this work be reused to perform tree-shaking, so that Go only compiles the code you actually need? (Or, does Go already do this at compilation?)


govulncheck builds a static approximation of the call graph using Variable Type Analysis (VTA), which computes, for each interface value, the set of concrete types it may hold, to discover the set of functions potentially reachable from main. (VTA is a refinement of Rapid Type Analysis (RTA), which keeps only one set of concrete types for all interface values in the whole program.) The result should be more precise than the linker.

See:

- https://pkg.go.dev/golang.org/x/tools/go/callgraph/vta

- https://pkg.go.dev/golang.org/x/tools/go/callgraph/rta


Go is a compiled language that uses a linker, which means that only the functions that are called end up in the final binary. So yes, Go does "tree-shaking".


It's important to read the caveats: https://github.com/golang/go/blob/master/src/cmd/link/intern..., the most important of which is:

  // The third case is handled by looking to see if any of:
  //   - reflect.Value.Method or MethodByName is reachable
  //   - reflect.Type.Method or MethodByName is called (through the
  //     REFLECTMETHOD attribute marked by the compiler).
  //
  // If any of these happen, all bets are off and all exported methods
  // of reachable types are marked reachable.
Basically, if you do certain kinds of reflection, then more code is theoretically reachable and will be included in your binary. In practice, you end up with a large binary in anything that calls into autogenerated APIs.


That is a useful clarification. It seems to explain why use of fmt, Go's std lib formatting and printing package, seems to pull in so much. Surely it is performing a fair amount of reflection under the hood.


The fmt package should not put the linker into conservative mode. If it does, file a bug report.


Autogenerated APIs aren't as prevalent as one may think.


[flagged]


What? How does a piece of code that isn't compiled into your binary make you vulnerable?


I should have been clearer that I was referring to external dependencies that are included. I did some research and there's so much extra stuff included in go.sum. You're right that this would result in a lot of false positives.


> Just because you don’t link to or call it… if you include vulnerable code you’re still vulnerable.

That makes no sense. If you never use the vulnerable code, how would an attacker access the vulnerability through your system? Especially with DCE, the vulnerability might not even be in the binary.


No, this is not at all a valid argument.


The only argument I can think of here is with NPM post install hooks. But for Golang, unless you import the code, I'm pretty certain that there is no way to exploit or backdoor an app.

Or is there something I'm missing?


I guess I just found an interview question. Thanks!


Good way to weed out good developers!


No, I meant anyone who believes this is weeded out.


Well if RCE is possible, then why is it going to matter if you have uncalled vulnerable functions or not? Attacker can execute anything they want.


Not necessarily. RCE may start with being able to run ‘something’ that has restrictions such as “can’t inject code that has zero bytes in it” or “injected code can only be X bytes long”.

In such cases, having another vulnerability available may be the easiest way to get rid of those restrictions.

Also, the second vulnerability may be complementary. For example, the first may get you onto the machine, but not out of the sandbox, while the second won’t get you on the machine, but will get you out of the sandbox.

In this case, I think the Go linker won't include the never-called vulnerable function in the executable (it only would if the vulnerability checker were smarter than the linker at detecting never-called code; that's theoretically possible, but highly unlikely).


If only it was language-independent I would consider it.

Reinventing supply chain and vulnerability management for each language... not a good use of our time.


Strong disagree.

Every "generic" vulnerability scanner still needs some language specific knowledge for how to determine what is an included dependency. The better you want to suppress false positives, the more in-depth knowledge the tool will need about each language/runtime.

The end result of this is that all the existing generic scanners just use least common denominator heuristics for determining vulnerable dependencies (i.e. just look at the lock file).

For large teams, this can be a huge waste of time patching vulnerabilities that don't actually apply to your code just because the scanning tool is too stupid to know better.

I'll take Govulncheck (and similar tools for other languages) any day over the mediocre generic tools.


> The end result of this is that all the existing generic scanners just use least common denominator heuristics

Wrong. You can have specialized backends and language-agnostic everything-else.


It doesn't seem possible to implement that in a language agnostic way. How do you propose doing such a thing?


Very simple: implement language-specific static analysis backends and a general frontend and vuln management.

Just like every Linux distribution distributes packages and manages security in a language-agnostic way. Nothing new.

Amazing how people here dismissed my point by downvoting and providing no reasoning.


> language-specific static backends

I guess it sounded like you meant a general solution that wasn’t language specific. I’m still not sure if there’s an abstraction you could use to make language aspects marked as vulnerable - the semantics between languages are just so varied.

Anyway we need the language specific ones first.


> Anyway we need the language specific ones first.

No, we have had language-agnostic vuln management for decades, and a good tool could fall back to the traditional method when a language-specific backend is not available.


I mean language agnostic vuln management that’s capable of marking a particular subset of functionality as vulnerable like this is.


Let's train a neural network to take source code in an arbitrary language as input, and produce a call graph as output. What could go wrong?


Side-note: how do you make a nn output a graph topology? I'm having a hard time imagining how to make a matrix represent that.


Graph Neural Networks! https://distill.pub/2021/gnn-intro/

In a nutshell, you perform NN operations on the nodes and edges of the graph and then send updates across the graph


And then clamp the output to generate vulnerabilities? Could be worth a shot.


Until we can figure out how to translate semantics across languages, we have to reinvent almost everything for each language. We can abstract out some things, but not others. As far as "a good use of our time", the alternative is "no vulnerability scanning" or "package level vulnerability scanning" which probably waste more time or expose more risk for most organizations.


Is it even technically possible to do (edit: useful) vulnerability analysis in a pure black box configuration?


Sure: it's called "fuzz testing".



