But if the algorithm allows you to find hash collisions, you can’t guarantee that the image didn’t change based on the MD5 hash value? e.g. https://natmchugh.blogspot.com/2014/11/three-way-md5-collisi...
Obviously that example is a chosen prefix collision, but this data is coming from an untrusted source after all, so there’s really nothing to stop the attacker choosing the prefix in advance and then publicly shredding trust in the hash at a later date. In practice it sounds like you’d also have a hash of the complete file system, but at this point you’d have to question what advantage there is to using MD5 at all. Attacks never get worse, only ever get better, and the last thing you’d want is for the dam to burst during a lengthy and important investigation.