
That is something I wanted to know: does IPFS guarantee that two identical files get the same IPFS URL / hash link?

Otherwise, if someone shares the same data again in a different IPFS folder, it won't actually be discoverable as the same data.




Yes, a file's hash is only based on its contents. The way I understand it, a file doesn't really live in a directory, it's more like a directory (which is a kind of file itself) references files. So the same file can be in two directories, yet it'll have the same URL/hash. And if you "add" files to a directory, you're really uploading a separate copy of the dir that'll have a different hash.

I checked myself on this, but someone else might want to check me cause I'm not an expert.
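
To make the "a directory references files" idea concrete, here is a toy sketch in Python. It is not how IPFS actually encodes anything: `file_id`, `dir_id` and the JSON listing are made up for illustration, and plain SHA-256 hex digests stand in for real CIDs. It only shows the shape of the argument: the file ID comes from the bytes alone, and a directory is just a listing of names pointing at IDs.

```python
import hashlib
import json


def file_id(content: bytes) -> str:
    """Toy content ID: SHA-256 of the file bytes only (no name, no path)."""
    return hashlib.sha256(content).hexdigest()


def dir_id(entries: dict[str, str]) -> str:
    """Toy directory ID: hash of the sorted name -> child ID listing."""
    listing = json.dumps(entries, sort_keys=True).encode("utf-8")
    return hashlib.sha256(listing).hexdigest()


cat = b"meow"
notes = b"some notes"

# The same bytes referenced from two different directories, under two names.
photos = {"cat.txt": file_id(cat)}
backup = {"copy-of-cat.txt": file_id(cat), "notes.txt": file_id(notes)}
same_file_id = photos["cat.txt"] == backup["copy-of-cat.txt"]

# "Adding" a file builds a new directory object with a new ID;
# the old directory and the files it references are untouched.
photos_v2 = {**photos, "notes.txt": file_id(notes)}
dir_changed = dir_id(photos) != dir_id(photos_v2)

print(same_file_id, dir_changed)
```

In this model the file keeps one ID no matter how many directories list it, while every change to a directory yields a different directory ID.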


This is generally true, though it’s possible to encode the same data into a slightly differently shaped DAG to optimise for e.g. video streaming performance afaiu (balanced vs imbalanced). UnixFS vs raw bytes may also give different results but I’m not 100% sure


From the fs's point of view, these are different file contents. But yeah, there's nothing stopping you from pinning something different that looks the same to a person.


Once decoded they would be the same file contents. Imagine one DAG where the depth is log(n) and it’s a perfectly balanced tree, and another where the left-hand branch has depth 1 and the right-hand branch contains another subtree whose left-hand branch again has depth 1, and so on.

The leaves are the same in both cases, so the file contents are the same, though the latter is quicker to stream (though not to verify) and the CIDs will be different
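
A rough Python sketch of the two shapes, in case it helps. This is a toy Merkle tree over nested tuples, not the real UnixFS layout; `balanced`, `skewed`, `root_hash` and `read` are names invented for the example, and the `leaf:` / `node:` prefixes are an arbitrary choice.

```python
import hashlib


def balanced(chunks):
    """Split in half recursively, so depth grows like log2(n)."""
    if len(chunks) == 1:
        return chunks[0]
    mid = len(chunks) // 2
    return (balanced(chunks[:mid]), balanced(chunks[mid:]))


def skewed(chunks):
    """One leaf on the left, the rest of the file on the right."""
    if len(chunks) == 1:
        return chunks[0]
    return (chunks[0], skewed(chunks[1:]))


def root_hash(tree) -> str:
    """Hash leaves by content and inner nodes by their children's hashes."""
    if isinstance(tree, bytes):
        return hashlib.sha256(b"leaf:" + tree).hexdigest()
    left, right = tree
    payload = ("node:" + root_hash(left) + root_hash(right)).encode("ascii")
    return hashlib.sha256(payload).hexdigest()


def read(tree) -> bytes:
    """Decode a tree back to file bytes by joining leaves left to right."""
    if isinstance(tree, bytes):
        return tree
    left, right = tree
    return read(left) + read(right)


chunks = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
tree_a = balanced(chunks)
tree_b = skewed(chunks)

print(read(tree_a) == read(tree_b))            # decoded contents compared
print(root_hash(tree_a) == root_hash(tree_b))  # root IDs compared
```

The leaves are identical in both trees, so `read` returns the same bytes, but the inner nodes are grouped differently and the root hashes do not match.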


Yes, IPFS hashes individual "blocks" (pieces) of files. If two files have the same content, they will share block hashes.
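
A minimal illustration of block-level sharing in Python, assuming a fixed-size splitter. `BLOCK_SIZE` and `block_hashes` are hypothetical, and the 4-byte block size is only there to keep the example readable; real clients use much larger blocks.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose


def block_hashes(data: bytes, size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and hash each block on its own."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]


original = b"aaaabbbbcccc"
duplicate = b"aaaabbbbcccc"  # same content, added a second time
edited = b"aaaabbbbXXXX"     # only the last block differs

identical = block_hashes(original) == block_hashes(duplicate)
shared = set(block_hashes(original)) & set(block_hashes(edited))

print(identical, len(shared))
```

Identical files produce identical block hashes, and the edited file still shares the two blocks it has in common with the original.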


Basically it depends on specific settings that can be changed in the client, which control how the individual blocks are encoded and therefore what the resulting hash ends up being. So no, there's no inherent guarantee, but you may get lucky with some copies of the same file.
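
Here is a small Python sketch of why a client setting can change the result, using chunk size as the example setting. It is a simplified model (hash each chunk, then hash the list of chunk hashes), and `chunk`, `root_id` and the sizes 256 and 1024 are assumptions for the example, not real IPFS defaults.

```python
import hashlib


def chunk(data: bytes, size: int) -> list[bytes]:
    """Split data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def root_id(data: bytes, chunk_size: int) -> str:
    """Hash each chunk, then hash the ordered list of chunk hashes."""
    leaf_hashes = [hashlib.sha256(c).digest() for c in chunk(data, chunk_size)]
    return hashlib.sha256(b"".join(leaf_hashes)).hexdigest()


data = bytes(range(256)) * 16  # 4096 bytes of sample data

id_small = root_id(data, chunk_size=256)
id_large = root_id(data, chunk_size=1024)
id_small_again = root_id(data, chunk_size=256)

print(id_small == id_small_again)  # same settings
print(id_small == id_large)        # different settings
```

Two people who add the same file with the same settings end up with the same ID in this model; change the chunk size and the ID changes even though the bytes are identical.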


Caveat: The other comments say the hash depends only on the file's contents, but the hash algorithm also needs to be the same. If two people add the same content using different hash algorithms, they get two different hashes.
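
A quick Python check of that caveat, using two hash functions from the standard `hashlib` module. The `name:digest` strings are a made-up format for this example; as far as I know, real CIDs record the hash function in a multihash prefix instead.

```python
import hashlib

content = b"the same file contents"

# Same bytes, two different hash functions, both with 32-byte digests.
digest_sha256 = hashlib.sha256(content).hexdigest()
digest_blake2b = hashlib.blake2b(content, digest_size=32).hexdigest()

# Tag each digest with the algorithm so a reader knows how to verify it.
# This "name:digest" layout is hypothetical, not the CID format.
id_a = "sha2-256:" + digest_sha256
id_b = "blake2b-256:" + digest_blake2b

print(id_a)
print(id_b)
print(digest_sha256 == digest_blake2b)
```

Both IDs describe the same bytes, but nothing in the strings themselves tells you so; you would need the content in hand to compute one ID from the other.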


In this case, would pinning the file make it accessible from either hash? I'd expect it to, but idk, I've only ever seen sha256 hashes on IPFS.


Kinda. Shooting from the hip based on fuzzy gatherings from IPFS usage here, but as I understand it: The leaf-level data blocks will be shared between the Merkle trees, but at least the tip (the object a given hash actually refers to) and maybe some of the other structural information will be different.



