apurvamehta · on Nov 1, 2017

From the very same blog post:

> Is this Magical Pixie Dust I can sprinkle on my application?

> No, not quite. Exactly-once processing is an end-to-end guarantee and the application has to be designed to not violate the property as well. If you are using the consumer API, this means ensuring that you commit changes to your application state concordant with your offsets as described here.

I think that is a pretty clear statement that end-to-end exactly-once semantics don't come for free. It says that additional application logic is needed to achieve this, and it specifies what must be done.
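To make "commit changes to your application state concordant with your offsets" concrete, here is a minimal sketch of one way it could look. It is not Kafka client code: the SQLite store, the table names, and the helpers `make_store`, `process_batch`, and `current_total` are all hypothetical, and records are modeled as plain `(offset, amount)` tuples. The idea shown is that the state update and the next offset to read are written in the same transaction, so a redelivered record is recognized and skipped.

```python
import sqlite3

def make_store():
    # Hypothetical application store: one running total plus per-partition offsets.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE totals (id INTEGER PRIMARY KEY, total INTEGER)")
    conn.execute(
        "CREATE TABLE offsets (partition_id INTEGER PRIMARY KEY, next_offset INTEGER)"
    )
    conn.execute("INSERT INTO totals VALUES (1, 0)")
    conn.commit()
    return conn

def process_batch(conn, partition, records):
    """Apply records and advance the stored offset in a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        row = conn.execute(
            "SELECT next_offset FROM offsets WHERE partition_id = ?", (partition,)
        ).fetchone()
        next_offset = row[0] if row else 0
        for offset, amount in records:
            if offset < next_offset:
                continue  # already applied before a crash or redelivery
            conn.execute(
                "UPDATE totals SET total = total + ? WHERE id = 1", (amount,)
            )
            next_offset = offset + 1
        conn.execute(
            "INSERT OR REPLACE INTO offsets (partition_id, next_offset) VALUES (?, ?)",
            (partition, next_offset),
        )

def current_total(conn):
    return conn.execute("SELECT total FROM totals WHERE id = 1").fetchone()[0]

conn = make_store()
process_batch(conn, 0, [(0, 5), (1, 7)])
process_batch(conn, 0, [(1, 7), (2, 3)])  # offset 1 is delivered a second time
print(current_total(conn))
```

On restart, a consumer built this way would resume from the offset stored next to the state rather than from an offset committed separately, which is the application-side half of the guarantee.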

ploxiln · on Nov 1, 2017

Right. But that's at the end of a separate article, while the post which this HN discussion is about throws around the words "exactly once" a lot more casually. The argument is over the use of the words "exactly once". They should just refer to the feature as "transactions" or "idempotent producer".

apurvamehta · on June 30, 2017

Personally, I would welcome Aphyr trying to break the Kafka EOS guarantees. Like all software, there will be bugs, and the exposure will only make it stronger and more viable.

I write as one of the implementors of exactly once guarantees in Kafka.

apurvamehta · on June 30, 2017

Points 1 & 2. There is no direct communication between a producer and consumer. The producer writes to the broker, the consumer reads from the broker. There is a detailed flow diagram for the producer side operations in the article, and this deck has more of the details of how both the producer and consumer work: https://www.slideshare.net/apurva2/introducing-exactly-once-...

3. Yes, it has been tested empirically. Quoting from the article:

> We wrote over 15,000 LOC of tests, including distributed tests running with real failures under load and ran them every night for several weeks looking for problems. This uncovered all manner of issues, ranging from basic coding mistakes to esoteric NTP synchronization issues in our test harness. A subset of these were distributed chaos tests, where we bring up a full Kafka cluster with multiple transactional clients, produce message transactionally, read these messages concurrently, and hard kill clients and servers during the process to ensure that data is neither lost nor duplicated.

apurvamehta · on Oct 9, 2013

We did our experiments directly on hardware. I don't think that AWS VMs simulate multiple physical sockets. If they don't then this article will not apply to them.

apurvamehta · on Oct 9, 2013

That would be true if we were using C++. Unfortunately, all our code is in Scala and we use the Java NIO libraries to memory-map our files. AFAIK, they don't give us the option of using these POSIX calls.

pquerna · on Oct 9, 2013

Cassandra binds to posix_fadvise to do exactly this when writing out new SSTables:

https://github.com/apache/cassandra/blob/trunk/src/java/org/...
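For readers unfamiliar with the call itself, here is a small sketch of `posix_fadvise` used after writing a file. It is in Python only because its `os` module exposes the call directly on Unix; it says nothing about how Cassandra binds to it from the JVM. The assumption here is that the goal is to hint to the kernel that freshly written pages need not stay in the page cache. The helper name `write_without_caching` and the file name are hypothetical.

```python
import os
import tempfile

def write_without_caching(path, data):
    """Write data, flush it to disk, then hint that its pages can be evicted.

    Returns True if the advice was issued, False where posix_fadvise is absent.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        written = 0
        while written < len(view):
            written += os.write(fd, view[written:])
        os.fsync(fd)  # dirty pages cannot be dropped until they are written back
        advised = hasattr(os, "posix_fadvise")
        if advised:
            # offset 0 with length 0 means "from the start to the end of the file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return advised
    finally:
        os.close(fd)

path = os.path.join(tempfile.mkdtemp(), "example.bin")
print(write_without_caching(path, b"x" * 4096))
```

The call is advisory: the kernel is free to ignore it, so it is a hint for cache behavior rather than a guarantee.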

apurvamehta · on Oct 9, 2013

Wow.. that's great to know. We will definitely investigate this approach. Thanks for sharing! :)

apurvamehta · on Oct 8, 2013

Thanks, the 400% number is wrong. It was a last-minute edit; I should learn not to do that. I have updated the post to say that the error rates have dropped by 1/4th.

apurvamehta · on Oct 8, 2013

Yikes. I meant that they dropped TO 1/4th the original.

apurvamehta · on Oct 8, 2013

This is exactly right :)

caf · on Oct 8, 2013

Then it dropped by 75%.
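The "by" versus "to" distinction is easy to check with a few lines. The counts below are made up purely for illustration and are not the figures from the post; `describe_drop` is a hypothetical helper.

```python
def describe_drop(before, after):
    """Return (fraction remaining, fraction removed) for a before/after pair."""
    remaining = after / before   # "dropped TO" this fraction of the original
    reduction = 1 - remaining    # "dropped BY" this fraction
    return remaining, reduction

# Illustrative numbers only: 400 errors before, 100 after.
remaining, reduction = describe_drop(400, 100)
print(f"dropped to {remaining:.0%} of the original, i.e. by {reduction:.0%}")
```

Read the other way round, the old error rate was 4x (400% of) the new one, which is presumably where a figure like 400% can creep in.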

apurvamehta · on Oct 8, 2013

Yes. It was a blunder. The post has been updated to reflect this.

apurvamehta · on Oct 8, 2013

Hi, post author here.

> Also, is there a reason not to use large pages directly for the mmap'd sets if you know you're going to have them hot at all times? (I assume they read the entire file on start?)

We could use large pages directly. But, as I mentioned in the article, the performance gains would be negligible compared to the gains that come from having things in memory in the first place. These are not very large memory systems and the page table / TLB miss overhead doesn't seem to be biting us. We are just following the mantra 'premature optimization is the root of all evil' :)

erichocean · on Oct 8, 2013

In my experience, most people don't know they have TLB problems because, effectively, it's always bad.

It's only when you start getting to the metal to see what your hardware is actually capable of that the TLB stands out as a glaring source of inefficiency.

Put another way: yeah, the TLB is making your app slow, but it's doing so always, so you don't notice. Instead, you mistakenly think your hardware is just slower than it really is.

apurvamehta · on July 6, 2012

> $1,200/month or less gets you your own private room in a shared apartment in NYC, Boston, or just about anywhere else.

$1,400/month gets a one-bedroom apartment in Mountain View. I used to pay $700/month for a private room and bathroom in a two-bedroom apartment in Mountain View. Those prices seem overhyped.

apurvamehta · on March 28, 2012

I feel like this is overkill. I have been using RailsReady (https://github.com/joshfng/railsready) to set up several Mac and Ubuntu boxes and it has never been more than one click.

VeejayRampay · on March 28, 2012

To be fair, he said he didn't necessarily want to duplicate the effort of others. I'm sure he will garner any information he can from those projects to help make that app awesome.