
It really comes down to the latency and throughput requirements. Projects are architected differently for different latency and throughput expectations.

In event processing there's a continuum of expected latencies, from batch processing to realtime. Batch processing typically means running reports over large volumes of events (good for throughput); Hadoop is a good example. On the other end, sub-second realtime reporting is possible with Heron/Storm. Spark sits somewhere in the middle with its hybrid mini-batching. Reportedly Twitter has used Heron/Storm to track word counts across all tweets to find trending topics, where the latency from a new tweet coming in to the word counts being updated over the whole network is in the hundreds of milliseconds.
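To make the streaming side concrete, here's a minimal sketch of the counting bolt in a Storm-style word-count topology. It assumes the stock Apache Storm Java API (org.apache.storm packages); TweetSpout and SplitSentence are hypothetical placeholders for the tweet source and the tokenizer bolt, not Twitter's actual code:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    public class WordCountBolt extends BaseBasicBolt {
        // Per-task running counts. fieldsGrouping on "word" routes every
        // occurrence of a word to the same task, so local counts stay consistent.
        private final Map<String, Long> counts = new HashMap<>();

        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            String word = tuple.getStringByField("word");
            long count = counts.merge(word, 1L, Long::sum); // increment on each word
            collector.emit(new Values(word, count));        // push updated count downstream
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word", "count"));
        }

        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("tweets", new TweetSpout()); // hypothetical tweet source
            builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("tweets");
            builder.setBolt("count", new WordCountBolt(), 12)
                   .fieldsGrouping("split", new Fields("word"));
            // Submit with LocalCluster or StormSubmitter as usual.
        }
    }

Because each tuple updates the count and is emitted immediately rather than being buffered into a batch, end-to-end latency stays in the hundreds of milliseconds; Spark-style mini-batching trades some of that latency for throughput.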
