Hacker News new | past | comments | ask | show | jobs | submit login

Sorry, but no. There are far more files with a given hash than just the one, if the file is longer than the hash. And having multiple hashes doesn't help until the hashes exceed the length of the file.



Chances of getting a collision is higher than getting a good solution, but having multiple hashes does help in increasing the chance at a good solution. With a hash 1 bit less than the length of the file, we put two pigeons inside one hole, and have a 50% chance at picking the right pigeon. The fewer/smaller hashes, the more we get "sorry, but no".


It's easier to just drop the final F bits from the N-bit input stream and, at decompression time, guess what they are than to go through this exercise of generating hashes that have N-F bits in total and hunt for bit streams having those hashes.


Also, I guess the hash algorithm could be specifically tested to work for this particular solution, couldn't it (although only after having spent the $100 of course)?. The problem I guess is to have a script that takes up less space than for the hash to still work without a collision for the particular data. Even with the smallest possible program or script, this probably isn't going to work then.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: