
There are a lot of words in this article (and the design doc: http://goo.gl/EsdTXo ), but it's missing a few very simple key points:

1. State diagram: what messages are exchanged between the producer and the consumer, and how many round trips does it take to confirm a message?

2. At what point does each side consider the message to have been delivered?

3. Has this been tested empirically (i.e., set up a producer and a consumer, then partition or kill each side at random to see if messages get lost)?

The one thing I don't understand is the following. The two parties communicate by message passing. At some point the message transitions to a new state (e.g., delivered). That transition cannot happen on both sides at the same time. So how do you handle a failure to send the last message? Do you stage messages until the timeout period has passed?




Points 1 & 2. There is no direct communication between a producer and a consumer. The producer writes to the broker, and the consumer reads from the broker. There is a detailed flow diagram of the producer-side operations in the article, and this deck has more detail on how both the producer and the consumer work: https://www.slideshare.net/apurva2/introducing-exactly-once-...

3. Yes, it has been tested empirically. Quoting from the article:

> We wrote over 15,000 LOC of tests, including distributed tests running with real failures under load and ran them every night for several weeks looking for problems. This uncovered all manner of issues, ranging from basic coding mistakes to esoteric NTP synchronization issues in our test harness. A subset of these were distributed chaos tests, where we bring up a full Kafka cluster with multiple transactional clients, produce message transactionally, read these messages concurrently, and hard kill clients and servers during the process to ensure that data is neither lost nor duplicated.
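
To make the answer to points 1 and 2 a bit more concrete, here is a toy sketch of the lost-ack case, not Kafka's actual code. As I understand KIP-98, the idempotent producer tags what it sends with a producer ID and a sequence number, and the broker discards a retry it has already appended. The names `ToyBroker` and `send_with_retry`, the single-partition log, and the `lost_acks` knob are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBroker:
    """Toy single-partition log (hypothetical; not Kafka's implementation)."""
    log: list = field(default_factory=list)
    # producer_id -> last sequence number appended for that producer
    last_seq: dict = field(default_factory=dict)

    def append(self, producer_id, seq, payload):
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            # Already in the log: acknowledge again without appending.
            return "duplicate"
        if seq != last + 1:
            # A gap means an earlier record is missing; refuse to append.
            return "out_of_order"
        self.log.append(payload)
        self.last_seq[producer_id] = seq
        return "ok"

    def fetch(self, offset):
        # The consumer only ever talks to the broker, never to the producer.
        return self.log[offset:]


def send_with_retry(broker, producer_id, seq, payload, lost_acks=0):
    """Resend the same (producer_id, seq) until an ack gets through.

    lost_acks is an assumed test knob: the number of acks the toy
    network drops after the broker has already handled the request.
    Returns the number of attempts made.
    """
    attempts = 0
    while True:
        attempts += 1
        result = broker.append(producer_id, seq, payload)
        if result == "out_of_order":
            raise RuntimeError("sequence gap")
        if lost_acks > 0:
            lost_acks -= 1  # the producer never sees this ack, so it retries
            continue
        return attempts


broker = ToyBroker()
for seq, payload in enumerate(["a", "b", "c"]):
    # Drop the first two acks for "b" to force retries of a written record.
    drops = 2 if payload == "b" else 0
    send_with_retry(broker, producer_id=7, seq=seq, payload=payload, lost_acks=drops)

consumed = broker.fetch(0)
print(consumed)
```

In this sketch the producer treats a record as delivered only once it sees an ack, while the broker treats it as delivered as soon as it is appended. The two sides never need to change state at the same moment, because resending the same sequence number is harmless.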


The blog assumes you already know a bit about the Kafka architecture & is more of a marketing announcement than documentation.

This has been under development in the open for a long time; you can track the feature's history here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+E...



