It seems to be one of those things that you probably don't need. And when you do...

vidarh · on Feb 4, 2019

Parts of it can be useful, but you don't need to split out an event bus to get auditing, for example. As you say, you can avoid that until/unless you need to scale writes.

In the meantime, you can look for inbound data that naturally corresponds to immutable events and apply some of the ideas to that. E.g. that form a user submits? It's reasonably an immutable event. Many of them won't matter to you, because you'll never care to audit them. But some might.

E.g. we have projections of financials being submitted by third parties. Being able to go back and audit how the original form submissions relate to changes in other system state is useful, as is being able to re-run old reports after fixing bugs and confirm that they show what they should before/after certain events. So instead of just storing the end state, we're increasingly looking to store the original external signals that triggered those changes and build transformations as views over that event log. Only where we need to do we drive transformations into tables that aren't event-logged the same way, often with a suitable reference to the source event(s).
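A minimal sketch of that idea, not the actual system described here: submissions are kept as immutable records, and a "view" is just a function that replays the log up to a given point in time. The `Submission` fields, the `latest_projections` helper, and the sample data are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass(frozen=True)
class Submission:
    """One external form submission, stored as-is (hypothetical shape)."""
    event_id: int
    received_at: datetime
    company: str
    year: int
    projected_revenue: float


def latest_projections(events: Iterable[Submission], as_of: datetime) -> dict:
    """Replay the log up to `as_of`, keeping the newest figure per (company, year).

    Each derived row carries the id of the event that produced it, so the
    end state can always be traced back to the original submission.
    """
    view = {}
    for e in sorted(events, key=lambda e: (e.received_at, e.event_id)):
        if e.received_at > as_of:
            break  # everything after this point is in the "future" of the view
        view[(e.company, e.year)] = {
            "projected_revenue": e.projected_revenue,
            "source_event_id": e.event_id,
        }
    return view


# The log is only ever appended to; corrections arrive as new submissions.
log = [
    Submission(1, datetime(2019, 1, 10), "acme", 2019, 100.0),
    Submission(2, datetime(2019, 1, 20), "acme", 2019, 120.0),
    Submission(3, datetime(2019, 2, 1), "globex", 2019, 50.0),
]

# The same report, as it would have looked at two different points in time.
before = latest_projections(log, as_of=datetime(2019, 1, 15))
after = latest_projections(log, as_of=datetime(2019, 2, 4))
```

Because the view is computed from the log rather than stored as the source of truth, fixing a bug in `latest_projections` and re-running it over old events is enough to regenerate an old report.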

It avoids the problems in the article for the most part (some, such as changes in the structure of the events, will always be an issue), but gets enough of the benefits to be worth it. That's because we only apply it to data it genuinely fits: data with clear, natural event sources (often, but not always, external submissions), where we need to be able to get at past views of the data, whether for complexity reasons or because of external auditing requirements.

barrkel · on Feb 4, 2019

I think you also need it when scaling reads becomes painful.

Specifically, reads with different patterns: the kinds of patterns that can't be indexed easily because they need denormalization to generate all the indexed expressions. Or you need to read a time series, a snapshot at a point in time, or the latest version of the data, all from different places under different loads: analytic, machine learning, transactional.

One user needs to read across all the data over all time; another user wants super-fast scrollable access to user-customized sorts of a subset of the latest data. The user-configurability of the sort is what defeats the kinds of indexing you get in a traditional RDBMS. The obvious way to get this is a lambda architecture: have an immutable append-only system of record which contains all the data, and build the other views out of it. It's a small step from there to event sourcing.
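As a rough illustration of building several read views from one append-only record, here is a toy sketch. The tuple layout of `RECORD` and the three helper functions are assumptions made for the example; a real system would materialize these views in separate stores rather than recompute them per call.

```python
# Append-only system of record: (timestamp, key, fields).
# Assumed to be appended in timestamp order; nothing is updated in place.
RECORD = [
    (1, "a", {"price": 10, "name": "widget"}),
    (2, "b", {"price": 5, "name": "gadget"}),
    (3, "a", {"price": 7, "name": "widget"}),
]


def time_series(record, key):
    """Full history of one key, e.g. for analytics."""
    return [(ts, fields) for ts, k, fields in record if k == key]


def snapshot(record, as_of):
    """State of every key as of a point in time."""
    state = {}
    for ts, k, fields in record:
        if ts <= as_of:
            state[k] = fields  # later entries overwrite earlier ones
    return state


def sorted_latest(record, sort_field):
    """Latest version of each key, ordered by a user-chosen field."""
    latest = snapshot(record, as_of=float("inf"))
    return sorted(latest.items(), key=lambda kv: kv[1][sort_field])


history_a = time_series(RECORD, "a")
state_at_2 = snapshot(RECORD, as_of=2)
by_price = [key for key, _ in sorted_latest(RECORD, "price")]
```

Each view answers a different read pattern from the same data, and any of them can be thrown away and rebuilt from `RECORD`.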