
Maybe you’re already aware, but you glossed over something: since you’re using the hash to locate/identify the content (you mentioned Merkle and git), if you support multiple hash functions you need some assurance that the chance of collisions is low across all supported hash functions. For example, two otherwise identical functions that differ only in the value of their padding bytes (when the input size doesn’t match the block size) can’t coexist.



You are absolutely right. And yes, I am aware.

Location will actually be done by prefixing the hash with the value of the enum for the hash function/settings pair that made the hash.
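A minimal sketch of what that prefixing could look like, assuming a 2-byte big-endian prefix. The enum names, their numeric values, and the `content_key` helper are hypothetical and not taken from the actual project:

```python
import hashlib
from enum import IntEnum


class HashKind(IntEnum):
    # Hypothetical identifiers for (hash function, settings) pairs.
    SHA256 = 1
    BLAKE2B_256 = 2


def digest(kind: HashKind, data: bytes) -> bytes:
    """Hash `data` with the function/settings pair named by `kind`."""
    if kind is HashKind.SHA256:
        return hashlib.sha256(data).digest()
    if kind is HashKind.BLAKE2B_256:
        return hashlib.blake2b(data, digest_size=32).digest()
    raise ValueError(f"unsupported hash kind: {kind!r}")


def content_key(kind: HashKind, data: bytes) -> bytes:
    """Return the lookup key: enum value as a 2-byte prefix, then the digest."""
    return int(kind).to_bytes(2, "big") + digest(kind, data)
```

Because the prefix is part of the key, two digests that happen to be byte-identical under different functions still map to different keys.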


Since you seem to have done a fair bit of research in this area, do you have any opinions or thoughts about the Multihash format?

https://multiformats.io/multihash/

It fills in some of the blanks in your "prefixing the hash with the value of the enum for the hash" step.
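For reference, a rough sketch of the multihash layout as described on that page: an unsigned-varint function code, an unsigned-varint digest length, then the digest bytes. The helper names here are made up for illustration; `0x12` is the code the multihash table assigns to sha2-256.

```python
import hashlib


def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


SHA2_256_CODE = 0x12  # sha2-256 in the multihash function table


def multihash_sha256(data: bytes) -> bytes:
    """Build <code><length><digest> for a SHA-256 digest of `data`."""
    d = hashlib.sha256(data).digest()
    return encode_uvarint(SHA2_256_CODE) + encode_uvarint(len(d)) + d
```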


The multihash format is an excellent format that I am tempted to use.

However, there are two general problems:

* The spec is not finished, and it doesn't look like much progress has been made on it.

* While I agree with the FAQ that agreeing on a set of hash functions is possible, the format only allows 256 possible hash functions, so it can run out of space, especially since there can be multiple formats of the same function (BLAKE2B and BLAKE2S, for example), and especially since they want to allow non-cryptographic hash functions.

Then specifically for my use case:

* There is the problem brought up by AdamN [1]: if multihash is supported, an obscure hash may be supported, and that may cause problems.

* As an extension of that, once a hash function and set of settings is supported, it's supported forever, so I want to be more picky, and an enum allows me to restrict what is usable.

* By using a 16-bit enum, I have more space to grow.

* And finally, by using an enum, if my software encounters a repo with an enum value greater than all of the values it knows, it knows that that repo needs a later version of the software, since I will only add enum values.
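The last two points can be sketched roughly as follows, assuming the 16-bit enum value is stored big-endian in front of the digest. The enum members, `NeedsNewerSoftware`, and `parse_key` are hypothetical names for illustration, not the project's actual code:

```python
import struct
from enum import IntEnum
from typing import Tuple


class HashKind(IntEnum):
    # Hypothetical values; new entries are only ever appended.
    SHA256 = 1
    BLAKE2B_256 = 2


MAX_KNOWN = max(HashKind)


class NeedsNewerSoftware(Exception):
    """The repo uses an enum value added after this version was released."""


def parse_key(key: bytes) -> Tuple[HashKind, bytes]:
    """Split a key into its 16-bit hash kind and the digest that follows."""
    if len(key) < 2:
        raise ValueError("key too short to hold a 16-bit prefix")
    (value,) = struct.unpack(">H", key[:2])
    if value > MAX_KNOWN:
        # Values only grow, so anything larger must come from a later version.
        raise NeedsNewerSoftware(f"unknown hash kind {value}")
    # Raises ValueError for values at or below the maximum that were never assigned.
    return HashKind(value), key[2:]
```

A reader that hits `NeedsNewerSoftware` can tell the user to upgrade instead of reporting the repo as corrupt.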

[1]: https://news.ycombinator.com/item?id=38250444



