An issue I've always had with UUIDs and ULIDs is there isn't a great way to gene...

mmiyer · 2024-08-25T20:41:00.000000Z

That's UUID v5 (uses a SHA-1 hash of a namespace UUID plus the input data).
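A minimal sketch of what that looks like with Python's standard `uuid` module. The namespace derived from `example.com` and the `deterministic_id` helper are assumptions made up for illustration; only `uuid.uuid5` and `uuid.NAMESPACE_DNS` come from the standard library.

```python
import uuid

# Hypothetical application namespace: any fixed UUID works, as long as
# every producer and consumer agrees on the same one.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")


def deterministic_id(name: str) -> uuid.UUID:
    """Return the same version-5 UUID every time for the same input string."""
    return uuid.uuid5(APP_NAMESPACE, name)


first = deterministic_id("order-12345")
second = deterministic_id("order-12345")
other = deterministic_id("order-12346")

print(first == second)  # same input, same UUID
print(first == other)   # different input, different UUID
print(first.version)    # version field of the generated UUID
```

The same namespace and name should give the same UUID in any conforming implementation, which is the compatibility property being asked about.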

VWWHFSfQ · 2024-08-25T20:22:47.000000Z

Are you looking for something other than just a custom seed in the RNG?

exe34 · 2024-08-25T20:18:12.000000Z

https://stackoverflow.com/a/64229385

fiddlerwoaroof · 2024-08-25T20:22:37.000000Z

Sure, there are workarounds in various languages, but it would be nice to have a standardized hash-based UUID or ULID

IggleSniggle · 2024-08-25T20:28:18.000000Z

If it's a standardized sequence, then that's no different than just 0, 1, 2, 3 but with different names. If you just want a non-sequential but deterministic sequence, then that's every random number generator that accepts a seed value, and being any more standardized than that makes zero sense.

fiddlerwoaroof · 2024-08-26T01:23:15.000000Z

The problem with autoincrement in this context is you can’t reproduce the right value when replaying the input streams for your stream processing job. Hashing some combination of values and using that as a primary key solves this problem nicely and, when you’re using bitemporal data modeling, makes it easy to correct mistakes. The point of standardization is compatibility, not standardizing the sequence of keys used.
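A rough sketch of the replay-safe key idea, assuming each message can be identified by a few stable fields. The field names (`source`, `partition`, `offset`), the `event_key` helper, and the namespace URL are hypothetical and not taken from any particular stream processor.

```python
import json
import uuid

# Hypothetical namespace for this job's keys; it must stay fixed across replays.
EVENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/events")


def event_key(source: str, partition: int, offset: int) -> uuid.UUID:
    """Derive a primary key from fields that are identical on every replay."""
    # Serialize to one canonical string so the hash input never varies
    # with whitespace or field ordering.
    canonical = json.dumps([source, partition, offset], separators=(",", ":"))
    return uuid.uuid5(EVENT_NAMESPACE, canonical)


# Replaying the same message reproduces the same key, unlike autoincrement.
original_run = event_key("orders", 3, 1042)
replayed_run = event_key("orders", 3, 1042)
print(original_run == replayed_run)
```

Serializing the fields as a JSON array rather than concatenating them avoids ambiguous inputs such as `("ab", "c")` versus `("a", "bc")` hashing to the same key.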

IggleSniggle · 2024-08-26T15:37:51.000000Z

I agree on all points you're making, but you can't standardize on hashing when the data being hashed will vary due to business reasons. I just can't see any way that this can be realistically standardized outside of a single business, maybe even business-unit depending on the kind of company.

Perhaps you mean something like "standardized hash of all columnar data for the table row," but then you're just reinventing Elasticsearch/Lucene, with all its pros and cons. The power of foreign keys for an RDBMS is that they are pointers, and as pointers, the mutability of their underlying data is what makes them powerful. I think I get what you're asking for, but I also think there can be no possible standard that is reasonable unless you have the technology to take a total snapshot of the universe, at which point, why not just measure the universe itself as your database? Perfect storage system.

1986 · 2024-08-25T20:26:46.000000Z

from the article, it sounds like this is V5?

fiddlerwoaroof · 2024-08-25T20:41:57.000000Z

I missed that because I typically am using ULIDs these days. But, yeah, some standardized format for a hash of message data is what I want.

1986 · 2024-08-25T20:24:20.000000Z

why wouldn't you use some sort of collision resistant hashing function on the data to achieve this instead?

zerodensity · 2024-08-25T23:03:20.000000Z

Some systems expect UUIDs so you don't always have that choice.
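One way to reconcile the two, sketched under assumptions: truncate a SHA-256 digest to 128 bits and stamp it with the version and variant bits so it parses as a UUID. This uses the version 8 (custom) layout from RFC 9562; the `sha256_uuid` helper is hypothetical, and keeping only 122 bits of the digest reduces collision resistance compared with the full hash.

```python
import hashlib
import uuid


def sha256_uuid(data: bytes) -> uuid.UUID:
    """Pack the first 16 bytes of a SHA-256 digest into a UUID-shaped value."""
    value = int.from_bytes(hashlib.sha256(data).digest()[:16], "big")
    value &= ~(0xF << 76)  # clear the 4-bit version field
    value |= 0x8 << 76     # mark as version 8 (custom)
    value &= ~(0x3 << 62)  # clear the 2-bit variant field
    value |= 0x2 << 62     # set the RFC variant (binary 10)
    return uuid.UUID(int=value)


key = sha256_uuid(b"some message payload")
print(key)
print(key == sha256_uuid(b"some message payload"))  # deterministic
```

Systems that only validate the UUID shape should accept this, though anything that inspects the version field may treat version 8 differently from version 5.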

voidfunc · 2024-08-26T01:25:11.000000Z

v5... I use them all the time.