> I regret that this is generally false. Once you have async, you have race conditions and reentrancy issues
Can you get more specific please?
My experience is 100% from Ruby where I've worked heavily with threads in the past, and with Async Ruby for the past 18 months.
From what I can tell, threads require a great deal of locks (mutexes) and are frustratingly hard to get right - because of the language-level race conditions.
Async Ruby has been refreshingly easy to write. I have yet to encounter an example of a race condition.
If it helps to bridge the language gap here: from what I know Async Ruby is similar to Go's goroutine model.
Task 1:
global.a = 1
await async {} // Await a block that does nothing, essentially yielding back to the reactor.
print(global.a) // If you're unlucky, global.a has changed.
Task 2:
global.a = 2
await async {} // Await a block that does nothing, essentially yielding back to the reactor.
print(global.a)
Now enqueue both tasks.
Depending on scheduling, you could end up printing (1, 1), (1, 2), (2, 1) or (2, 2). This can become much worse if you're awaiting in the middle of filling a data structure.
Feel free to replace `await async {}` with querying your database or a remote API and `global.a` with any variable or data structure that can be shared between the tasks.
This example is, of course, a toy example, but I've had to debug sophisticated code that broke non-deterministically because of such behaviors. The source code of the front-end of Firefox (written in JavaScript) is full of hand-rolled kinda-Mutex implementations to avoid these race conditions.
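The pseudocode above maps directly onto Python's asyncio (used here only as a runnable stand-in, since the thread is about Async Ruby and JavaScript; `asyncio.sleep(0)` plays the role of `await async {}`):

```python
import asyncio

state = {"a": 0}   # shared mutable state, like `global.a` above
printed = []

async def task1():
    state["a"] = 1
    await asyncio.sleep(0)       # yield back to the event loop, like `await async {}`
    printed.append(state["a"])   # if you're unlucky, it has changed

async def task2():
    state["a"] = 2
    await asyncio.sleep(0)
    printed.append(state["a"])

async def main():
    await asyncio.gather(task1(), task2())

asyncio.run(main())
```

With CPython's default loop this particular interleaving is deterministic: task1 set `a` to 1 but prints 2, i.e. the (2, 2) outcome above. A real i/o call in place of `sleep(0)` would make the outcome depend on response timing.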
Thank you for the clarification. You are right, these types of race conditions are possible with Async Ruby.
I find these races relatively easy to spot: yes, global state CAN change when you yield back to the reactor.
IMO thread race conditions are much, much worse. Global state can change AT ANY POINT, because the thread scheduler preemptively switches threads. Here's an example:
global.a = 0
thread {
global.a = 1
print(global.a) // a is 1 or 2
}
thread {
global.a = 2
print(global.a) // a is 1 or 2
}
print(global.a) // a is 0, 1 or 2
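The same sketch as runnable Python, using stdlib `threading` as a stand-in for the pseudocode. Because preemption can happen anywhere, each thread's read can see either write, and the main thread's read races with both:

```python
import threading

state = {"a": 0}
seen = []

def writer(value):
    state["a"] = value       # the other thread may overwrite this at any point
    seen.append(state["a"])  # so this is 1 or 2, depending on preemption

t1 = threading.Thread(target=writer, args=(1,))
t2 = threading.Thread(target=writer, args=(2,))
t1.start()
t2.start()
snapshot = state["a"]  # 0, 1 or 2: the main thread races with both writers
t1.join()
t2.join()
```

Unlike the async version, there is no marker in the source telling you where the interleaving can happen.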
> I find these races relatively easy to spot: yes, global state CAN change when you yield back to the reactor.
It's good that you can find them easily. In my experience, these changes can creep into your code stealthily and only hit you from behind, months later, when the winds shift.
Some of my traumas include:
- global state that ends up captured by a closure while you didn't realize it was (perhaps because the code you wrote and the code the other dev wrote accidentally shared state);
- mutating hashtables/dictionaries while walking them, without anything in your code looking suspicious;
- and of course accidentally mixing concurrency between two reactors, but luckily, that's not something that happens every day.
> IMO thread race conditions are much, much worse.
IMO, thread race conditions are a case of "it's bad but you knew what you'd get when you signed up for it", while async race conditions are a case of "don't worry, everything is going to be fine (terms and conditions apply, please contact your lawyer if you need clarifications)".
This isn't really an issue with threads specifically, though; the exact same issue is present in green thread/fiber implementations. It just so happens that in Async Ruby the GIL saves you from this specific problem by making variable accesses atomic (as I understand it, anyway; I'm not super familiar with the Ruby VM).
In general, green threads/fibers are vulnerable to the exact same shared-memory issues as threads. Their benefit is that, for a certain class of problems, they are a more efficient concurrency primitive than threads, because they avoid context switches out of userspace; in many cases they also let you plug in your own scheduler, so you can optimize scheduling for your own workload.
If you remove the sleep there's no race because the fiber scheduler never hits a blocking call anywhere, but the sleep stands in for some kind of i/o. And then there's the terrible global variable.
(Conversely, this does show how badly you have to write code to get race conditions with async. The sleep there is important to generating the race, which would not be necessary with threads.)
And I think if you remove the sleep it doesn't race due to the GIL and single-threadedness of ruby for pure userspace computations?
You avoid race conditions from writing to the same variable, but you still can't avoid race conditions where the critical section is longer than one statement.
e.g. in JS:
foo = await fetch()
bar = await fetch()
doSomething(foo, bar)
You can't be certain foo isn't modified while the second fetch is executing (assuming foo is globally scoped)
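A runnable illustration of that same two-await window, sketched in Python asyncio with a stubbed `fake_fetch` (a hypothetical name standing in for a real network call):

```python
import asyncio

shared = {"foo": None}  # globally scoped, like `foo` in the JS sketch

async def fake_fetch(value):
    await asyncio.sleep(0)  # stand-in for real network i/o
    return value

async def worker():
    shared["foo"] = await fake_fetch("A")
    bar = await fake_fetch("B")      # while this await is pending...
    return shared["foo"], bar        # ...shared["foo"] may have changed

async def interloper():
    await asyncio.sleep(0)           # let worker finish its first fetch...
    shared["foo"] = "clobbered"      # ...then overwrite it mid-critical-section

async def main():
    results = await asyncio.gather(worker(), interloper())
    return results[0]

foo, bar = asyncio.run(main())
```

Under CPython's default loop, `worker` returns `('clobbered', 'B')`: it wrote "A" to `shared["foo"]`, but the interloper ran during the second await.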
There’s still an important difference in that you’re yielding control explicitly by using await, instead of being preemptable everywhere. That’s good enough in most cases.
The problem with function coloring is that it tends to proliferate throughout APIs (since you need to be async to call async), so as things evolve you find yourself needing to await the majority of function calls… this makes it kind of tough to avoid awaiting during critical paths.
That said I still agree that it’s easier to manage than Threads, but something tells me we’re still going to want structured-concurrency versions of the common primitives like rw locks and mutexes, even in async environments.
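As a sketch of that last point: asyncio already ships an async-aware mutex, and holding it across the whole multi-await section restores the invariant (`fake_fetch` is again a hypothetical stub for real i/o):

```python
import asyncio

shared = {"foo": None}

async def fake_fetch(value):
    await asyncio.sleep(0)  # stand-in for real i/o
    return value

async def worker(lock, value):
    async with lock:  # the whole multi-await section is now exclusive
        shared["foo"] = await fake_fetch(value)
        bar = await fake_fetch(value.lower())
        return shared["foo"], bar  # still `value`: no other holder interleaved

async def main():
    lock = asyncio.Lock()  # created inside the running loop
    return await asyncio.gather(worker(lock, "A"), worker(lock, "B"))

results = asyncio.run(main())
```

The lock doesn't remove the yield points; it just guarantees that no other task holding the same lock can touch `shared` between them.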
In any language, without dedicated primitives or special conditions, avoiding race conditions or unexpected value changes is largely contingent on preventing thoughtless non-read usage of "shared" memory.
The problem is that it's often relatively easy to make unsafe writes without realizing, especially since the guarantees of "safe" primitives can be misunderstood. And of course many people don't realize that async code can have race conditions because they don't really understand the details of how async even works, or that some languages make extra guarantees that they're unknowingly depending on.
Having candidates explain the difference between parallel and asynchronous has been a relatively effective first level interview screener for me, especially with more junior roles.
Goroutines are just (lightweight) threads - they can run in parallel. You need locks whenever you access shared data.
AFAIK Ruby’s fibers are cooperatively scheduled and can only yield at function calls. Code that doesn’t make a function call (e.g. incrementing an integer) is safe to run on multiple fibers without locks.
For comparison, Python’s stackless coroutines can safely do anything except `await` without requiring locks.
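That guarantee is easy to demonstrate in Python (a toy sketch, not Async Ruby): an increment with no `await` in it never loses updates across tasks, while splitting the read and the write around an `await` does:

```python
import asyncio

counter = 0

async def incr_atomic(n):
    global counter
    for _ in range(n):
        counter += 1  # no await: no other coroutine can interleave here

async def incr_racy(n):
    global counter
    for _ in range(n):
        tmp = counter
        await asyncio.sleep(0)  # another task can run here...
        counter = tmp + 1       # ...so this write can lose its update

async def run_pair(fn):
    global counter
    counter = 0
    await asyncio.gather(fn(100), fn(100))
    return counter

atomic_total = asyncio.run(run_pair(incr_atomic))  # 200: nothing lost
racy_total = asyncio.run(run_pair(incr_racy))      # < 200: lost updates
```

The racy variant is contrived, but it is exactly the read-modify-write-around-an-await shape the earlier `fetch` example warns about.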
1. Check for existence of row by a key in a Postgres table
2. If not present, create one
You can have a race condition between steps 1 and 2. You could do the insertion itself as the check to avoid this issue. But regardless, this is a race condition that you need to think about in async environments.
True, and this is why database libraries/ORMs can be dangerous: the "normal" db way is to wrap steps 1 and 2 in a transaction. Unfortunately, while transactions are often available, many db interfaces lure the unwary programmer away from using them, because reading/writing the db "looks like" reading/writing a variable/object.
The same thing could happen with any other form of concurrency. In this case you could wrap the two statements in a transaction with the appropriate isolation level. Not sure what it is called — read consistent? Snapshot?
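A minimal sketch of the two patterns, using stdlib `sqlite3` as a stand-in for Postgres (the `ON CONFLICT ... DO NOTHING` syntax matches Postgres's; SQLite needs version 3.24+ for it):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (key TEXT PRIMARY KEY)")

def racy_ensure(key):
    # Steps 1 and 2 from above: another task can insert between them.
    row = conn.execute("SELECT 1 FROM users WHERE key = ?", (key,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO users (key) VALUES (?)", (key,))

def atomic_ensure(key):
    # The insert itself is the check: the unique constraint arbitrates.
    conn.execute(
        "INSERT INTO users (key) VALUES (?) ON CONFLICT(key) DO NOTHING", (key,)
    )

atomic_ensure("alice")
atomic_ensure("alice")  # second call is a no-op, not an error
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The racy version needs a transaction with an appropriate isolation level to be safe; the atomic version pushes the check into the database's own constraint machinery, so there is no window between check and insert.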