Thanks for the long and interesting article. Two comments: the first is an observation I'd be interested in hearing your response to; the second is a bit of a devil's advocate question.
First, in your write-up, you describe data as singular, but of course, it's often a tuple. To take a concrete example, let's say that Sally, in order to make her move from Atlanta to Chicago, goes to U-Haul's website and reserves a 14' truck for Monday. She thinks a bit more and decides that she needs a bigger truck, so she goes to the site and changes her reservation to a 17' truck. After even more thinking, she decides she really won't be ready by Monday, so she changes her reservation date to Tuesday.
Let's call these three moments t0, t1 and t2. So her order has changed from:
t0: 14' truck, Monday
t1: 17' truck, Monday
t2: 17' truck, Tuesday
BUT, it so happens that U-Haul has implemented a datastore like you suggest, one that is only "eventually consistent", and when Sally made her change at t2, the node she was talking to hadn't heard yet about the switch from a 14' to a 17' truck. So the actual tuples that were recorded were:
t0: 14' truck, Monday
t1: 17' truck, Monday
t2: 14' truck, Tuesday
Now you're in a pickle: how can you tell that the record stored at t2 was partially out of date, versus the possibility that Sally really did want to change back to a 14' truck at the same time she changed the date to Tuesday?
To solve this, you need to split the schema into two kinds of tables: the tables that record new items contain full tuples, while the tables that record updates are radically normalized, to the extent that they'll usually have just two columns: a key and an atomic replacement value. So what we'd actually record is:
t0: 14' truck, Monday in the "new reservation" table
t1: 17' truck in the "reservation item change" table
t2: Tuesday in the "reservation date change" table
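Concretely, here's a minimal sketch of the shape I have in mind, written as Python records rather than DDL; the type and field names are mine, purely for illustration:

    from dataclasses import dataclass

    # The "new reservation" table holds the full tuple, recorded once at creation.
    @dataclass(frozen=True)
    class NewReservation:
        reservation_id: int
        timestamp: int
        truck_length_ft: int
        pickup_day: str

    # The update tables are radically normalized: a key plus one atomic replacement value.
    @dataclass(frozen=True)
    class TruckLengthChange:
        reservation_id: int
        timestamp: int
        truck_length_ft: int

    @dataclass(frozen=True)
    class PickupDayChange:
        reservation_id: int
        timestamp: int
        pickup_day: str

    # Sally's history as it would actually be recorded:
    new_reservations   = [NewReservation(42, 0, 14, "Monday")]  # t0
    truck_changes      = [TruckLengthChange(42, 1, 17)]         # t1
    pickup_day_changes = [PickupDayChange(42, 2, "Tuesday")]    # t2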
I don't have any knee-jerk opposition to a schema like that, but it seems to me you'll run into two issues:
- all the usual issues with extreme normalization, like performance and massive and complicated joins. And since, as I understand it, your realtime iterative db is consuming the same data stream as your main store, it will face this same performance penalty, at least in its input.
- your schema darn well better be in BCNF, or enforcing data constraints across the hyper-normalized update tables is going to get very hairy very fast. Like here, let's say that U-Haul has a policy that it never rents 17' trucks on Tuesdays (I don't know why, maybe they wash them on Tuesdays). Since when Sally made her switch to Tuesday, the node she was talking to thought she was still renting a 14' truck, it was happy to record the change. But when your batch process comes along and tries to resolve the two updates, what does it do? There's no obvious answer (going with the last consistent state -- 17' for Monday -- is certainly wrong, as we know that's no longer what Sally wants). In other words, you're back in the "read repair" complexity and hell you thought you'd avoided.
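To make that bind concrete, here's a tiny, self-contained sketch (the no-17'-trucks-on-Tuesdays rule and the field names are purely hypothetical) of a batch process naively replaying the two updates in timestamp order and tripping over the constraint:

    # Each recorded update is (timestamp, field, atomic replacement value).
    updates = [
        (1, "truck_length_ft", 17),    # t1: 14' -> 17' truck
        (2, "pickup_day", "Tuesday"),  # t2: Monday -> Tuesday, recorded by a stale node
    ]

    # Start from the full tuple recorded at t0 and replay updates in order.
    state = {"truck_length_ft": 14, "pickup_day": "Monday"}
    for _, field, value in sorted(updates):
        state[field] = value

    # Hypothetical business rule: no 17' trucks rented on Tuesdays.
    if state["truck_length_ft"] == 17 and state["pickup_day"] == "Tuesday":
        # The replayed state violates the constraint, and it's not obvious
        # which of the two updates the batch process should throw away.
        print("constraint violated:", state)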
Second point ... eh, I think I've well overspent my word budget. In summary: a devil's advocate would say that the realtime db you've called just an "optimization" is, in fact, the real db, since whenever the realtime and batch dbs differ (at least at any time other than when a new batch update lands), you'll go with the value from the realtime db (if, during normal operation, batch says a count is 4 but realtime says 5, you report 5). So what you've really done is attach a fairly massive auditing system to the same-old database design. Its job is to periodically check that no inconsistencies have been introduced between nodes. It's a clever auditing system, in that the immutable model means it may have a leg up on deciding what the right result is, but it's still just an auditing system.
tl;dr: won't this lead to hyper-normalization, and is it really a new db design or just a new kind of auditing system?
I would store the reservation date and truck length as separate data objects in the batch layer. You would tie them together with the reservation id. It isn't a big deal to do those joins since the batch system is good at that kind of large-scale computation and you've already accepted high latency in the batch layer.
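Roughly, the batch computation looks like this. This is only a sketch of the idea; the record shapes and names are made up for the example, not actual code:

    # Each kind of fact is stored separately and immutably as
    # (reservation_id, timestamp, value) tuples.
    truck_length_facts = [(42, 0, 14), (42, 1, 17)]
    pickup_day_facts   = [(42, 0, "Monday"), (42, 2, "Tuesday")]

    def latest_by_id(facts):
        """For each reservation id, keep the value with the newest timestamp."""
        latest = {}
        for rid, ts, value in facts:
            if rid not in latest or ts > latest[rid][0]:
                latest[rid] = (ts, value)
        return {rid: value for rid, (_, value) in latest.items()}

    def batch_join(truck_facts, day_facts):
        """Join the per-field views on reservation id into whole reservations."""
        trucks, days = latest_by_id(truck_facts), latest_by_id(day_facts)
        return {rid: {"truck_length_ft": trucks.get(rid), "pickup_day": days.get(rid)}
                for rid in set(trucks) | set(days)}

    print(batch_join(truck_length_facts, pickup_day_facts))
    # -> {42: {'truck_length_ft': 17, 'pickup_day': 'Tuesday'}}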
The realtime db only accounts for the last few hours of data. Unless you're doing a query that only needs the last few hours of data, you need to do a merge between the batch view and the realtime view. We constantly flush old data out of our realtime db so as to keep it small and fast.
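For a count-style query, the merge can be as simple as adding the two views together, since the realtime view only covers data that arrived after the last batch run. Again, an illustrative sketch rather than the actual query code:

    def merged_count(key, batch_view, realtime_view):
        """Answer a count query by combining the batch view (all data up to
        the last batch run) with the realtime view (only the hours since then)."""
        return batch_view.get(key, 0) + realtime_view.get(key, 0)

    batch_view    = {"reservations_on_tuesday": 4}  # precomputed by the batch layer
    realtime_view = {"reservations_on_tuesday": 1}  # recent data only

    print(merged_count("reservations_on_tuesday", batch_view, realtime_view))  # -> 5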