
I don't believe that is accurate.

First - if git really didn't care about collision resistance, there wouldn't have been a need to switch to SHA1DC as the hash function. They switched because they care enough that they were willing to accept the performance penalty.

Second - imagine this scenario: a user creates two commits with the same hash, one with a valid change and the second with a malicious one. The collision could be created by playing around with some data in a binary file - so, this is a collision attack not 2nd pre-image. The user then submits the change to the upstream and gets it approved. The user maintains a mirror of the upstream repo into which they place the malicious commit. Anyone that pulls from this mirror will think they have the same code as the upstream, even if they compare hashes.
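As a rough sketch of why comparing hashes would not help in that scenario: git derives an object's ID purely from its type, length, and content, so two different contents that collide would be indistinguishable by ID. The helper name `git_object_id` below is made up for illustration, and it uses plain SHA-1 from Python's `hashlib`, not the SHA1DC variant git actually ships.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Compute a git-style SHA-1 object ID (illustrative helper, not a git API).

    Git hashes a header of the form "<type> <length>\\0" followed by the
    raw object body. Nothing about where the object came from is included.
    """
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


# The ID depends only on type and content, so a mirror and the upstream
# that hold byte-identical objects report identical IDs, and two colliding
# (but different) objects would also report identical IDs.
upstream_id = git_object_id("blob", b"hello world\n")
mirror_id = git_object_id("blob", b"hello world\n")
print(upstream_id == mirror_id)
```

Comparing IDs therefore only tells you the contents match if the hash is collision resistant, which is the property in question here.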

So don't use an untrusted mirror? I guess - but that is something that should be possible with a strong hash. And if git really didn't want you to do that, it would provide for better ways of tracking where objects were actually pulled from.

Anyway, collision attacks are real and can impact git. They just aren't as bad as a 2nd pre-image attack.
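To make the collision vs. 2nd pre-image distinction concrete, here is a toy sketch. It assumes a deliberately weakened hash (SHA-1 truncated to 16 bits, called `toy_hash` here) so both searches finish instantly; the function names and messages are invented for the example and say nothing about the cost of attacking full SHA-1.

```python
import hashlib
from itertools import count


def toy_hash(data: bytes) -> int:
    """SHA-1 truncated to its first 16 bits (a toy, intentionally weak)."""
    return int.from_bytes(hashlib.sha1(data).digest()[:2], "big")


def find_collision():
    """Collision attack: the attacker picks BOTH messages freely."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1  # two distinct messages, same hash
        seen[h] = msg


def find_second_preimage(target: bytes):
    """2nd pre-image attack: one message is fixed by someone else."""
    wanted = toy_hash(target)
    for i in count():
        msg = b"try-%d" % i
        if msg != target and toy_hash(msg) == wanted:
            return msg, i + 1


a, b, collision_tries = find_collision()
second, preimage_tries = find_second_preimage(b"approved change")
print("collision after", collision_tries, "tries")
print("2nd pre-image after", preimage_tries, "tries")
```

For an n-bit hash with no structural weakness, the first search is expected to take on the order of 2^(n/2) tries (the birthday bound) and the second on the order of 2^n, which is why a collision attack needs the attacker to author both versions but is so much cheaper.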




> First - if git really didn't care about collision resistance, there wouldn't have been a need to switch to SHA1DC as the hash function. They switched because they care enough that they were willing to accept the performance penalty.

Git didn't _need_ to switch to SHA1DC, but they did because the cost was minimal and it's still a good idea to defend against known attacks.

> Second - imagine this scenario: a user creates two commits with the same hash, one with a valid change and the second with a malicious one. The collision could be created by playing around with some data in a binary file - so, this is a collision attack not 2nd pre-image. The user then submits the change to the upstream and gets it approved.

This is a general problem with binary files: they're hard to properly review. Having unreviewable files in a repository (binaries, machine-generated configs, etc.) is already a security problem; hash collisions would just be one (very difficult) way of exploiting that problem.

> The user maintains a mirror of the upstream repo into which they place the malicious commit. Anyone that pulls from this mirror will think they have the same code as the upstream, even if they compare hashes.

Having people pull data from an attacker-controlled source is a security issue, regardless of hash collisions.

> So don't use an untrusted mirror? I guess - but that is something that should be possible with a strong hash. And if git really didn't want you to do that, it would provide for better ways of tracking where objects were actually pulled from.

Git was designed for collaboration between trusted parties; collaboration between untrusted parties (e.g. pulling changes from untrusted sources) is a much harder problem that git doesn't pretend to solve.

> Anyway, collision attacks are real and can impact git. They just aren't as bad as a 2nd pre-image attack.

Collision attacks are real, but they have yet to impact git (beyond adopting SHA1DC, I guess), despite how big of a target popular git repositories are.


> Git didn't _need_ to switch to SHA1DC, but they did because the cost was minimal and it's still a good idea to defend against known attacks.

I'm confused about how finding a SHA1 collision counts as an "attack" if git truly doesn't care about collision resistance.

> This is a general problem with binary files: they're hard to properly review. Having unreviewable files in a repository (binaries, machine-generated configs, etc.) is already a security problem; hash collisions would just be one (very difficult) way of exploiting that problem.

I don't think you can ignore the use case - people do check binaries into git with the expectation that git will keep track of them.

> Git was designed for collaboration between trusted parties; collaboration between untrusted parties (e.g. pulling changes from untrusted sources) is a much harder problem that git doesn't pretend to solve.

Maybe that is how git was designed, but it's not how git is used. People do pull from repos that they don't fully trust, maybe just to examine a change before throwing it away. What they don't expect is that pulling from such a source could let an unexpected file into their repository through a collision attack. That is why git switched to SHA1DC: if git truly didn't support that use case, it wouldn't have needed to.

> Collision attacks are real, but they have yet to impact git (beyond adopting SHA1DC, I guess), despite how big of a target popular git repositories are.

I agree that collision attacks are real but aren't a practical issue yet. What I was responding to was your comment:

> I haven't heard of any second-preimage attacks against MD5, much less SHA-1, so mlindner was correct in asserting that MD5 would be fine (assuming 128 bits are enough). See also the analysis in [1].

In that comment, you seemed to be saying that collision attacks weren't a problem at all. But in your more recent comment you seem to be saying that "collision attacks are real"?


> This is a general problem with binary files: they're hard to properly review. Having unreviewable files in a repository (binaries, machine-generated configs, etc.) is already a security problem; hash collisions would just be one (very difficult) way of exploiting that problem.

That's not a problem in general. E.g., having a binary BMP in your repository is fine as far as reviews go.


> Git was designed for collaboration between trusted parties; [...]

No.

Git was designed for development of the Linux kernel. Contributors to the Linux kernel are generally not trusted.



