
Did they review other Pub/Sub solutions? I'm thinking of NATS in particular.



We use NATS (and STAN) for real-time IoT metrics and reporting with a high cardinality of subscriptions. The throughput from a single NATS box is very nice (we currently send 100K+ messages/sec using a single EC2 instance running NATS).
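
For a sense of how simple the client side is, here's a rough publish/subscribe sketch with the Java client (jnats) -- the URL and subject names are just placeholders, not what we actually run:

    import io.nats.client.Connection;
    import io.nats.client.Dispatcher;
    import io.nats.client.Nats;

    import java.nio.charset.StandardCharsets;

    public class NatsSketch {
        public static void main(String[] args) throws Exception {
            // Connect to a NATS server (placeholder URL).
            Connection nc = Nats.connect("nats://localhost:4222");

            // Subscribe to all device metrics via a wildcard subject.
            Dispatcher d = nc.createDispatcher(msg ->
                System.out.println(msg.getSubject() + ": "
                    + new String(msg.getData(), StandardCharsets.UTF_8)));
            d.subscribe("metrics.>");

            // Publish one metric for one device (subject hierarchy is made up).
            nc.publish("metrics.device42.temperature",
                "21.5".getBytes(StandardCharsets.UTF_8));

            nc.flush(java.time.Duration.ofSeconds(1));
            nc.close();
        }
    }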


It would be very interesting to see a comparison of Pulsar or LogDevice on Twitter scale workloads.


Both Pulsar and LogDevice solve the far-away-consumer problem by segmenting the log and storing each segment on a separate machine, which reduces disk seeks and page-cache pollution (when consumer c1 reads the head of the log and c2 reads the tail).

So technically they are better than Kafka. But as the article mentions, Twitter just used SSDs to get around this issue in Kafka. Judging by the article, the only compelling reason to switch to Kafka seems to be Kafka Streams (KStream).
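
For anyone who hasn't used it, Kafka Streams lets you express stateful stream processing as a plain library inside your own app. A rough sketch of a per-key count topology -- topic names here are hypothetical, not from the article:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Produced;

    import java.util.Properties;

    public class EngagementCounts {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "engagement-counts");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();

            // Count events per key (e.g. per tweet id) and write the running
            // counts to an output topic.
            KStream<String, String> events = builder.stream("tweet-engagements");
            events.groupByKey()
                  .count()
                  .toStream()
                  .to("engagement-counts",
                      Produced.with(Serdes.String(), Serdes.Long()));

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }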


Pulsar has a much better storage system that scales independently, so latency stays low regardless of consumer offset. It can also now tier to S3/cloud storage natively, and it has many other features, like support for millions of topics and per-message acknowledgements.
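
Per-message acknowledgement with the Pulsar Java client looks roughly like this (topic and subscription names are made up):

    import org.apache.pulsar.client.api.Consumer;
    import org.apache.pulsar.client.api.Message;
    import org.apache.pulsar.client.api.PulsarClient;
    import org.apache.pulsar.client.api.SubscriptionType;

    public class PulsarAckSketch {
        public static void main(String[] args) throws Exception {
            PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

            Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://public/default/events")
                .subscriptionName("example-subscription")
                .subscriptionType(SubscriptionType.Shared)
                .subscribe();

            while (true) {
                Message<byte[]> msg = consumer.receive();
                try {
                    handle(msg.getData());
                    consumer.acknowledge(msg);         // ack just this message
                } catch (Exception e) {
                    consumer.negativeAcknowledge(msg); // redeliver just this message
                }
            }
        }

        static void handle(byte[] data) { /* application logic */ }
    }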


If they're looking for a mature solution, Kafka is more along those lines. Plus, beyond pub/sub, Kafka is well supported as a streaming data source in the Hadoop-esque world.


I've been a big fan of Apache Pulsar lately. I like that it splits storage out into its own layer.


Interestingly, Twitter's in-house system did that too, but they now seem to think it was a mistake because it increased latency.


It all depends. For example, with Apache Pulsar, tailing readers are served from an in-memory cache in the serving layer (the Pulsar brokers) and only catch-up readers end up having to be served from the storage layer (Apache BookKeeper). This is a little different from DistributedLog, which always required going to BookKeeper for reads.
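
To make the tailing vs. catch-up distinction concrete, here's roughly what the two reader positions look like with the Pulsar Java client (the topic name is a placeholder):

    import org.apache.pulsar.client.api.MessageId;
    import org.apache.pulsar.client.api.PulsarClient;
    import org.apache.pulsar.client.api.Reader;

    public class ReaderPositions {
        public static void main(String[] args) throws Exception {
            PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

            // Tailing reader: starts at the end of the topic, so reads are
            // typically served from the broker's in-memory cache.
            Reader<byte[]> tailing = client.newReader()
                .topic("persistent://public/default/events")
                .startMessageId(MessageId.latest)
                .create();

            // Catch-up reader: starts at the beginning, so older entries have
            // to be fetched from the storage layer (BookKeeper).
            Reader<byte[]> catchUp = client.newReader()
                .topic("persistent://public/default/events")
                .startMessageId(MessageId.earliest)
                .create();

            // readNext() blocks until a message is available.
            System.out.println(catchUp.readNext().getMessageId());
            System.out.println(tailing.readNext().getMessageId());
        }
    }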

Apache BookKeeper can add additional latency to catch-up readers, on top of the extra hop, because the data of multiple topics are combined into each ledger. This means that we lose some performance from sequential reads. This is mitigated in BookKeeper by writing to disk in batches and sorting a ledger by topic so messages of the same topic are found together, but it still involves more jumping around on disk.

Also, BookKeeper allows a nice separation of disk I/O. The read and write paths are separate and can be served by different disks, so you can scale reads and writes independently to a certain extent.

For all those reasons, I would have loved to have seen Twitter look at Apache Pulsar and compare performance profiles with Apache Kafka.


Streamlio published their OpenMessaging benchmarks between Apache Kafka & Apache Pulsar here: https://streaml.io/pdf/Gigaom-Benchmarking-Streaming-Platfor...



