
We have a log processing system that performs entity resolution and merging of data. It does this in parallel, on out-of-order input, without synchronization.

What that means is we may get N records that all refer to the same entity. We incrementally build that entity as records come in. Regardless of the order in which the updates arrive, we always end up with the same information in that entity.

In our case this is for security analytics. This lets us process data from arbitrary sources, with arbitrary ingest cadences, and build a single graph of entities and relationships. Every property has a merge strategy, which is what keeps the result independent of arrival order.
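To make the idea concrete, here is a minimal sketch of per-property merge strategies, not our actual code. The property names (`first_seen`, `last_seen`, `ip_addresses`) and the `MERGE_STRATEGIES` table are made up for illustration. The point is only that if each strategy is commutative, associative and idempotent (min, max, set union), then any arrival order produces the same entity.

```python
from itertools import permutations

# Hypothetical per-property merge strategies (illustrative names only).
# Each one is commutative, associative and idempotent.
MERGE_STRATEGIES = {
    "first_seen": min,                    # keep the earliest timestamp
    "last_seen": max,                     # keep the latest timestamp
    "ip_addresses": lambda a, b: a | b,   # set union
}


def merge(entity, record):
    """Merge one record into an entity, property by property."""
    merged = dict(entity)
    for key, value in record.items():
        if key in merged:
            merged[key] = MERGE_STRATEGIES[key](merged[key], value)
        else:
            merged[key] = value
    return merged


def build_entity(records):
    """Fold records into a single entity in the order given."""
    entity = {}
    for record in records:
        entity = merge(entity, record)
    return entity


# Three records that all refer to the same entity.
records = [
    {"first_seen": 100, "last_seen": 100, "ip_addresses": frozenset({"10.0.0.1"})},
    {"first_seen": 50, "last_seen": 70, "ip_addresses": frozenset({"10.0.0.2"})},
    {"last_seen": 200, "ip_addresses": frozenset({"10.0.0.1"})},
]

# Build the entity under every possible arrival order and collect the
# distinct outcomes.
results = {
    tuple(sorted(build_entity(order).items()))
    for order in permutations(records)
}
```

With these strategies `results` contains a single outcome, so no coordination between workers is needed to agree on the final entity.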

The exception to this is that we also have immutable properties. These are contract-based: we store the first write and assume the value will never change. It's up to the client to mark data as immutable only when it really is.
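A rough sketch of why the immutable case depends on the contract, again with made-up names (`merge_immutable`, `apply_writes`) and assuming `None` is never a legitimate value. First-write-wins is only order-independent as long as every write carries the same value.

```python
def merge_immutable(current, incoming):
    """First write wins: keep the stored value, ignore later writes.

    Assumes None means "nothing stored yet" and is never a real value.
    """
    return incoming if current is None else current


def apply_writes(writes):
    """Apply writes in arrival order and return the stored value."""
    stored = None
    for value in writes:
        stored = merge_immutable(stored, value)
    return stored


# Contract honored: every write agrees, so arrival order is irrelevant.
honored = ["host-a", "host-a", "host-a"]

# Contract broken: the writes disagree, so the stored value now depends
# on which one happened to arrive first.
broken = ["host-a", "host-b"]

honored_forward = apply_writes(honored)
honored_reversed = apply_writes(list(reversed(honored)))
broken_forward = apply_writes(broken)
broken_reversed = apply_writes(list(reversed(broken)))
```

When the contract holds, both orders store the same value. When a client marks something immutable that actually changes, the two orders disagree, which is why the correctness burden sits with the client.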



