There's a ton of positive value to having real-time data. Just off the top of my head:

1) If you've instrumented something incorrectly ...
2) Even worse, if you've accidentally messed up ...
3) You can observe significant changes ...
4) It allows you to have more confidence in deploying multiple times ...
Embracing both your perspective and the author's, I think I have a compromise. It seems to me the author's gripe is more specifically that his customers want to "do the same large-window analytics in realtime", whereas here you highlight particular use cases where realtime analytics are simple.
In a previous life I produced near-realtime analytics of network activity for populations numbering in the tens of millions, continuously, except that the definition of realtime was stretched to allow a 15-minute latency.
Reducing this 15-minute latency was often requested, e.g. to give help-desk operators an instant view of what is happening for a customer, or to automate responses to particular well-known system issues in realtime.
Reducing our 15-minute window wouldn't have been impossible, but given various architectural limitations of the time it would have been expensive, and on analysing these customer requirements it made more sense to simply siphon off the realtime event data at our monitoring points into a parallel dataflow.
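For the curious, here's a minimal sketch of that split in Python. Everything in it is hypothetical (the event shape, the `KNOWN_BAD_EVENTS` set, the tap point); it just illustrates the idea of one monitoring tap fanning the same events into a cheap per-event realtime branch and a windowed aggregate branch, rather than forcing one system to do both.

```python
# Hypothetical sketch of the "parallel dataflow" split described above.
# Event shapes, thresholds, and names are illustrative, not the original system's design.
import time
from collections import defaultdict

WINDOW_SECONDS = 15 * 60  # the tolerated latency of the aggregate path

aggregate_buffer = defaultdict(int)  # batch path: event counts per (window, customer)
KNOWN_BAD_EVENTS = {"link_down", "auth_storm"}  # realtime path: issues we auto-respond to

def handle_realtime(event):
    """Realtime branch: cheap per-event checks, no large-window state."""
    if event["type"] in KNOWN_BAD_EVENTS:
        print(f"ALERT {event['customer']}: {event['type']} -> trigger automated response")

def handle_aggregate(event):
    """Aggregate branch: accumulate per window; a 15-minute flush latency is fine here."""
    window = int(event["ts"] // WINDOW_SECONDS)
    aggregate_buffer[(window, event["customer"])] += 1

def tap(event):
    """The monitoring point: fan the same event out to both dataflows."""
    handle_realtime(event)   # instant view / automated response
    handle_aggregate(event)  # large-window analytics

# Usage: feed a few synthetic events through the tap.
now = time.time()
for ev in [
    {"ts": now, "customer": "acme", "type": "page_view"},
    {"ts": now, "customer": "acme", "type": "link_down"},
]:
    tap(ev)
print(dict(aggregate_buffer))
```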
TL;DR: you do different things with realtime data than with aggregate data. Do you really need the expense of a system that does both equally well?