
From the article:

  As you can see, binaries submitted for analysis are
  identified by their MD5 sums and no sandboxed execution is
  recorded if there is a duplicate (thus the shorter time
  delay). This means that if I can create two files with the
  same MD5 sum – one that behaves in a malicious way while the
  other doesn’t – I can “poison” the database of the product
  so that it won’t even try to analyze the malicious sample!
So it's a technique to get the scanner to ignore a malicious binary: construct a benign file with the same MD5 sum, submit it first so its verdict gets cached, and the malicious sample is then treated as a duplicate and never analyzed. This would be much harder if the scanner used a SHA-1 hash or similar.
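To make the lookup being exploited concrete, here's a minimal sketch of MD5-keyed deduplication in front of a sandbox (my own illustration, not the product's code; run_in_sandbox is a stub standing in for the expensive analysis):

  import hashlib

  analyzed = {}  # MD5 hex digest -> cached verdict

  def md5sum(path):
      with open(path, "rb") as f:
          return hashlib.md5(f.read()).hexdigest()

  def run_in_sandbox(path):
      # stub for the expensive sandboxed execution
      return "clean"

  def scan(path):
      digest = md5sum(path)
      if digest in analyzed:
          # duplicate digest: sandboxed execution is skipped entirely
          return analyzed[digest]
      verdict = run_in_sandbox(path)
      analyzed[digest] = verdict
      return verdict

Submit the benign half of an MD5 collision pair first, and the malicious half later maps to the same digest, hits the cache, and is never executed.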



But that's a whitelist. I thought anti-virus software worked by blacklisting.


virustotal.com lets you upload files to be scanned by a whole range of anti-virus engines. Before uploading, it calculates the hash of your file client-side to decide whether the file needs to be uploaded at all, or whether a copy previously uploaded (by someone else) with the same hash can simply be re-scanned with newer engine versions.

I don't know which hashing algorithm they use; it's just an example of a situation where file hashes are used for deduplication rather than as a whitelist.
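Roughly this pattern (a generic sketch of the upload-or-rescan decision, not VirusTotal's actual client code; already_uploaded is a made-up stand-in for the service-side lookup, and SHA-256 is just an assumption about the algorithm):

  import hashlib

  def sha256sum(path):
      # stream the file so large samples don't need to fit in memory
      h = hashlib.sha256()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(1 << 16), b""):
              h.update(chunk)
      return h.hexdigest()

  def already_uploaded(digest):
      # stand-in for the service-side "have we seen this hash?" query
      return False

  def submit(path):
      digest = sha256sum(path)
      if already_uploaded(digest):
          return "re-scan existing sample " + digest
      return "upload new sample " + digest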


Yes, I think that's what the author was alluding to here, although I'm not sure:

  The approach may work with traditional AV software too as
  many of these also use fingerprinting (not necessarily MD5)
  to avoid wasting resources on scanning the same files over
  and over (although the RC4 encryption results in VT 0/57
  anyway…).


sha256sum or b2sum (BLAKE2b) would be far better than sha1 :)
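Both are available from Python's hashlib too, if you'd rather fingerprint in-process than shell out to the coreutils tools (placeholder payload, just to show the calls):

  import hashlib

  data = b"contents of a submitted sample"  # placeholder payload

  print("md5     ", hashlib.md5(data).hexdigest())      # collisions are cheap to construct
  print("sha256  ", hashlib.sha256(data).hexdigest())   # no known collisions
  print("blake2b ", hashlib.blake2b(data).hexdigest())  # no known collisions, and fast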



