
One word: atomics.

A simple example would be concurrently writing into a shared queue. To make it really simple, let's assume that this queue's buffer can only be written once, and then the program has to exit.

If we have a buffer in this queue that can hold 20 messages, and an atomic integer that represents the current index, then we could have two (or however many) threads writing into it at the same time by just doing "index = myAtomic.fetch_add(1)", which atomically adds one to the index and returns the previous value. Atomics are supported at the hardware level, so they're generally quite efficient, and there is no lock like a mutex involved here. In the end, both threads are able to write into shared memory without conflicting with each other.

Using one or two more atomic integers, we could support one or more readers reading off the queue at the same time that we're writing to it, and eventually get to the point where we can reuse the buffer in a circular fashion.
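A minimal sketch of that first part, assuming C++ std::atomic, a 20-slot int buffer, and two writer threads (all of those are illustrative choices, not from any particular codebase):

    #include <atomic>
    #include <cstdio>
    #include <thread>
    #include <vector>

    constexpr int kCapacity = 20;

    int buffer[kCapacity];           // the shared, write-once buffer
    std::atomic<int> next_index{0};  // the atomic index into it

    void writer(int thread_id) {
        for (;;) {
            // fetch_add atomically bumps the index and returns the old
            // value, so each thread claims a unique slot with no mutex.
            int slot = next_index.fetch_add(1);
            if (slot >= kCapacity) return;  // buffer is full; exit
            buffer[slot] = thread_id;       // we own this slot exclusively
        }
    }

    int main() {
        std::vector<std::thread> threads;
        for (int t = 0; t < 2; ++t) threads.emplace_back(writer, t);
        for (auto& th : threads) th.join();
        for (int i = 0; i < kCapacity; ++i) std::printf("%d ", buffer[i]);
        std::printf("\n");
    }

No two threads ever receive the same return value from fetch_add, so the plain writes to buffer[slot] never race.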




It's when you want to synchronize multiple atomics that it gets complicated.

With a mutex you can lock for an entire "transaction" where you fetch a unique index and write your value to the queue. With just atomics you can get a unique index into the queue concurrently with other threads, sure, but how do you mark the slot as ready to be dequeued once you are done writing your data?

That's where lock-free data structures come into play. They have solved this intricate synchronization between multiple atomics so you don't have to.
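One common answer, sketched here as an assumption rather than any particular library's design, is to pair each slot with its own atomic "ready" flag: the writer publishes with release ordering, the reader waits for the flag with acquire ordering, and the data write is then guaranteed to be visible before the flag is:

    #include <atomic>
    #include <cstdio>
    #include <thread>
    #include <vector>

    constexpr int kCapacity = 20;

    int data[kCapacity];
    std::atomic<bool> ready[kCapacity];  // static storage: all false
    std::atomic<int> next_index{0};

    void writer(int value) {
        int slot = next_index.fetch_add(1);
        if (slot >= kCapacity) return;
        data[slot] = value;
        // Release store: the write to data[slot] above is visible to
        // any reader that sees ready[slot] == true via an acquire load.
        ready[slot].store(true, std::memory_order_release);
    }

    void reader() {
        for (int i = 0; i < kCapacity; ++i) {
            // Spin until the writer marks this slot ready to dequeue.
            while (!ready[i].load(std::memory_order_acquire)) {}
            std::printf("slot %d = %d\n", i, data[i]);
        }
    }

    int main() {
        std::thread r(reader);
        std::vector<std::thread> writers;
        for (int v = 0; v < kCapacity; ++v) writers.emplace_back(writer, v);
        for (auto& w : writers) w.join();
        r.join();
    }

Real lock-free queues also have to handle wrap-around, full/empty detection, and ABA problems, which is exactly the intricate part the parent comment is referring to.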


While what you said is all true, it's not turtles all the way down.

Eventually, when it comes down to the hardware level, the processor will have to assert #LOCK (or whatever mechanism it uses). So arguably even atomics aren't "lock" free; somewhere along the line, some piece of hardware has to block all other hardware to do its thing. DRAM can only read one thing at a time (and has to write it back afterwards).


The hardware guarantees forward progress at the very least, and possibly some amount of fairness. So the underlying hardware is at least lock-free and possibly wait-free. Usually a CPU cannot hold a cache line indefinitely; another CPU can always sneak in and steal it.


Yes, you're right. It's turtles all the way down until you hit the RAM's locking mechanisms.

RAM does the locking inside itself, and exposes it as primitives like "compare-and-set" and "compare-and-swap". Programming languages that use those primitives usually call the result "atomic data structures". The thing is that RAM is much faster in that regard than locking at the user/kernel level, so to the program it looks like there are no locks. But atomics do slow down your program, if only a little, because "compare-and-set" is still slower than a plain "set".
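For reference (and setting aside where exactly the locking happens, which the reply below gets into), this is what compare-and-swap looks like from the program's side; a CAS loop retries whenever another thread got there first, which is where the slowdown relative to a plain "set" comes from. A minimal C++ sketch:

    #include <atomic>
    #include <cstdio>
    #include <thread>
    #include <vector>

    std::atomic<int> counter{0};

    void increment() {
        int expected = counter.load();
        // compare_exchange_weak swaps in expected + 1 only if counter
        // still holds expected; on failure it reloads expected and the
        // loop retries. The retry is the cost of CAS vs. a plain store.
        while (!counter.compare_exchange_weak(expected, expected + 1)) {
            // expected now holds the value another thread wrote; retry.
        }
    }

    int main() {
        std::vector<std::thread> threads;
        for (int t = 0; t < 4; ++t)
            threads.emplace_back([] {
                for (int i = 0; i < 1000; ++i) increment();
            });
        for (auto& th : threads) th.join();
        std::printf("%d\n", counter.load());  // always prints 4000
    }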


The RAM does not know anything about locking; it's the cache coherence protocol. The CPU will request the cache line in exclusive state, perform the operation on that memory, and ensure the line stayed exclusive to that core for the duration. Afterwards, all the other cores will observe the change (and, depending on the memory model and ordering, the operations that happened before and after it).



