Disruptor: High performance alternative to bounded queues (2011) [pdf] (lmax-exchange.github.io)
99 points by cjg on July 8, 2016 | 27 comments



See also Aeron[1] which is basically a networked superset of the Disruptor, featuring (IIRC) a just-as-fast machine-local implementation. Made by some of the same people as the LMAX Disruptor. It's not quite ready for general use AFAICT, but it's quickly getting there...

There's also a really informative presentation[2] by Martin Thompson on some of the ideas that they're using to get really low latency and high throughput.

[1] https://github.com/real-logic/Aeron

[2] https://www.infoq.com/presentations/aeron-messaging


Actually, I thought this might be of sufficient general interest that I submitted it as a story: https://news.ycombinator.com/item?id=12059565

(Not really sure about how the whole submit-a-thing works around here, so I just thought I'd mention it.)


And for the lazy, here is Disruptor:

http://lmax-exchange.github.io/disruptor/


I also really liked Trisha's presentation on this: https://www.infoq.com/presentations/Concurrent-Programming-U....

I'd also add https://projectreactor.io/, which uses most if not all of the aforementioned technologies too.


I really don't want to put this down. This looks very interesting and is nicely presented. But I have trouble recognizing the main idea(s) behind the "LMAX Disruptor". To me, this all boils down to:

"You get better performance if you implement your event queue via a pre-allocated ring buffer instead of (dynamically allocated) linked-lists or arrays."

Is it really just that? If so, this is nothing new, but quite common in game programming and related fields. For example, see: https://fgiesen.wordpress.com/2010/12/14/ring-buffers-and-qu...

What am I missing?


It's been many years since I saw the presentation, but I believe they made a point of mentioning that ring buffers are not a new technology. They mentioned how such data structures are used all the time in network devices, such as switches.

The Disruptor is an actual library based on ring buffers (others have already pointed out some of its other features).

In practical terms, this library has been immensely popular. Within a couple of years of the Disruptor's release, practically every (Java-based) trading firm was using it. I've even seen C++ implementations, only vaguely resembling this library, being referred to as 'disruptors.'

Beyond the library, the original (and subsequent) presentations by the authors popularized the concepts of "mechanical sympathy" (understand the abstraction layer beneath what you are currently working in) and introduced a whole new set of developers to modern CPUs, thinking about cache lines, etc.


You might also want to see Martin Thompson's old blog, Mechanical Sympathy, and in particular his post on 'false sharing':

http://mechanical-sympathy.blogspot.co.uk/2011/07/false-shar...
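For a rough sense of what false sharing looks like in practice, here is a toy Java sketch in the spirit of that post; the class and field names are made up, and the exact numbers depend heavily on the JVM and CPU. Two threads hammer adjacent counters in the same array; with the padding fields removed, the two hot values tend to land on one cache line and the run usually gets noticeably slower.

```java
// Hypothetical false-sharing demo (not taken verbatim from the blog post).
public class FalseSharingDemo {

    static final class PaddedLong {
        volatile long value;
        // Seven longs of padding push neighbouring 'value' fields onto separate
        // cache lines (assuming 64-byte lines). Remove them to see the slowdown.
        // Caveat: some JITs may reorder or eliminate unused fields.
        long p1, p2, p3, p4, p5, p6, p7;
    }

    static final PaddedLong[] counters = { new PaddedLong(), new PaddedLong() };

    public static void main(String[] args) throws InterruptedException {
        Runnable writer0 = () -> { for (long i = 0; i < 500_000_000L; i++) counters[0].value = i; };
        Runnable writer1 = () -> { for (long i = 0; i < 500_000_000L; i++) counters[1].value = i; };
        Thread t0 = new Thread(writer0), t1 = new Thread(writer1);
        long start = System.nanoTime();
        t0.start(); t1.start();
        t0.join();  t1.join();
        System.out.println((System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```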


Log4J is one such example outside the HFT realm.

https://www.grobmeier.de/log4j-2-performance-close-to-insane...

And Spring Reactor uses the Disruptor as its default async processor.

The Disruptor looks like a great fit for IoT and message-relay (think WhatsApp) workloads.


Ring buffers are also used in web servers, database servers, different kinds of device drivers, protocol stacks, constrained embedded devices, etc.


What you are missing is that the disruptor pattern lets you do this lock-free while keeping the order: the producer claims a sequence number before accessing a slot in the queue. This sequence number guarantees the order of consumption, so there will be no reordering of events on the consumer side. After writing the message, the producer commits this sequence number; meanwhile, the consumer waits for the next sequence number to become committed and then consumes it.

I have implemented a multiple-producer, single-consumer variant in C for logging and it works quite well. Loggers can write their messages concurrently to the ring buffer, but the order will be correct in the log file.
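To make the claim -> write -> commit protocol above concrete, here is a minimal single-producer/single-consumer sketch in Java. It is not the Disruptor's actual API and not the C variant described above; class and field names are illustrative, and the real library adds batching, wait strategies, cache-line padding and multi-producer support.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal SPSC ring: the producer claims the next sequence, writes the slot,
// then commits; the consumer waits for the commit before reading.
final class SpscRing<T> {
    private final Object[] buffer;
    private final int mask;                                   // size must be a power of two
    private final AtomicLong published = new AtomicLong(-1);  // last committed seqno
    private final AtomicLong consumed  = new AtomicLong(-1);  // last seqno read

    SpscRing(int sizePowerOfTwo) {
        buffer = new Object[sizePowerOfTwo];
        mask = sizePowerOfTwo - 1;
    }

    // Producer: claim the next sequence, wait for space, write, then commit.
    void publish(T event) {
        long seq = published.get() + 1;
        while (seq - consumed.get() > buffer.length) {
            Thread.onSpinWait();                  // buffer full, wait for the consumer
        }
        buffer[(int) (seq & mask)] = event;
        published.set(seq);                       // commit: volatile write publishes the slot
    }

    // Consumer: wait until the next sequence has been committed, then read.
    @SuppressWarnings("unchecked")
    T take() {
        long seq = consumed.get() + 1;
        while (published.get() < seq) {
            Thread.onSpinWait();                  // nothing committed yet
        }
        T event = (T) buffer[(int) (seq & mask)];
        consumed.set(seq);
        return event;
    }
}
```

The commit on `published` is what preserves ordering: a slot only becomes visible to the consumer once every earlier sequence number has already been published.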


Sounds exactly like what Concurrency Kit already does. http://concurrencykit.org/


"Already does"? They're both from 2011. Even if they both do the same thing, do you have a point?


Except Concurrency Kit only supports a single producer.


The paper linked says that the most common use of Disruptor only has one producer.


IIRC, the data structure that they used was not thread-safe. There eventually had to be an interlocked compare-exchange operation, which slowed throughput. Also, I couldn't imagine this working correctly on out-of-order cores without memory barriers at the very least.


Yes, you need to use CAS and barriers to make it thread-safe. I have also used a condvar for the consumer to avoid spinning while waiting for log messages.
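For the curious, here is a rough Java sketch of where that interlocked operation shows up in a multi-producer claim step. The names are illustrative, not the library's API: each producer reserves a distinct sequence number with a single atomic increment, and commits are forced back into order before the published cursor advances (a real implementation would use an availability buffer rather than spinning here).

```java
import java.util.concurrent.atomic.AtomicLong;

final class MultiProducerClaim {
    private final AtomicLong nextSeq   = new AtomicLong(-1); // claim counter
    private final AtomicLong published = new AtomicLong(-1); // highest committed seqno

    long claim() {
        return nextSeq.incrementAndGet();         // the one contended CAS/LOCK XADD
    }

    void commit(long seq) {
        // Keep commits in order: wait until every earlier claim has been
        // published, then publish ours (a volatile write acts as the barrier).
        while (published.get() != seq - 1) {
            Thread.onSpinWait();
        }
        published.set(seq);
    }
}
```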


The Disruptor is an example of solid engineering - the fact that there's nothing "new" is a point in its favor. Its creators acknowledge that similar approaches are common in network switches, etc. I would say these techniques aren't well known among Java programmers. Not new != widely known.


Mostly yes. Perhaps the thing to note is that the Disruptor moves the logic for manipulating the indices outside of the ring buffer itself, allowing multiple independent consumers to exist.

The idea behind the Disruptor is to move data between threads in a contention-free manner.
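A tiny sketch of what keeping the sequences outside the ring buffer buys you, with made-up names: each consumer owns its own cursor, and the producer only has to stay behind the slowest one, so consumers never write shared state that other consumers read.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: two independent consumers gate the producer.
final class GatingExample {
    final AtomicLong producerCursor = new AtomicLong(-1);
    final AtomicLong journallerSeq  = new AtomicLong(-1); // consumer 1's own cursor
    final AtomicLong replicatorSeq  = new AtomicLong(-1); // consumer 2's own cursor

    boolean hasCapacity(long nextSeq, int bufferSize) {
        long slowest = Math.min(journallerSeq.get(), replicatorSeq.get());
        return nextSeq - slowest <= bufferSize;   // wrap check against the slowest consumer
    }
}
```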


The ideas reminded me of a Fowler article, and in fact he was just describing LMAX's architecture: http://martinfowler.com/articles/lmax.html

Maybe that one is clearer for some.


That one explains the concepts in a much better way, thanks for sharing!


ArrayBlockingQueue [1] uses a ring buffer backed by a fixed-size array, too. It is really hard to say where the speed-up comes from without seeing the code, but it is certainly not from using a ring buffer, because both of the compared implementations do so. I guess they were just more careful when they designed the auxiliary state and the operations on it; for example, ArrayBlockingQueue explicitly tracks the number of elements in the queue with a separate field - instead of deriving it from the insert and remove pointers - and therefore this field is contended by readers and writers (a rough sketch of that contrast is below). Also, preallocating entries for all slots to work around Java's lack of support for arrays of value types certainly removes some overhead from the allocator.

EDIT: Just realized that the code is on GitHub - will have a look.

[1] http://grepcode.com/file/repository.grepcode.com/java/root/j...
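A rough sketch of the contrast described above, with illustrative field names not taken from either library: ArrayBlockingQueue updates a shared count under its lock, whereas a sequence-based design can derive the size from two cursors that each side writes exclusively.

```java
// Illustrative only: no shared counter to contend on.
final class DerivedSize {
    volatile long producerCursor = -1;  // written only by the producer
    volatile long consumerCursor = -1;  // written only by the consumer

    long size() {
        // The size falls out of the cursors instead of a separate 'count' field.
        return producerCursor - consumerCursor;
    }
}
```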


ArrayBlockingQueue is compared in the performance test results

https://github.com/LMAX-Exchange/disruptor/wiki/Performance-...


That is why I mentioned it in the first place. The paper reads very much like they think using a ring buffer provides a performance advantage over other queue implementations, but then they compare two implementations that both use ring buffers. So their improved performance definitely does not stem from using a ring buffer but from implementation details like the synchronization mechanisms used or preallocating and reusing queue entries.


That is an old presentation. The most up-to-date info can be found at https://github.com/LMAX-Exchange/disruptor/wiki/Introduction


There is a podcast at [1] where the author of the disruptor is interviewed.

1: http://www.se-radio.net/2016/04/se-radio-episode-254-mike-ba...



It's been a while, but didn't the performance turn out to be similar to other data structures in multi-socket scenarios because it requires interlocked-type instructions?



