> I regret that this is generally false. Once you have async, you have race conditions and reentrancy issues
Can you get more specific please?
My experience is 100% from Ruby where I've worked heavily with threads in the past, and with Async Ruby for the past 18 months.
From what I can tell, threads require a great deal of locks (mutexes) and are frustratingly hard to get right - because of the language-level race conditions.
Async Ruby has been refreshingly easy to write. I have yet to encounter an example of a race condition.
If it helps to bridge the language gap here: from what I know Async Ruby is similar to Go's goroutine model.
Task 1:
global.a = 1
await async {} // Await a block that does nothing, essentially yielding back to the reactor.
print(global.a) // If you're unlucky, global.a has changed.
Task 2:
global.a = 2
await async {} // Await a block that does nothing, essentially yielding back to the reactor.
print(global.a)
Now enqueue both tasks.
Depending on scheduling, you could end up printing (1, 1), (1, 2), (2, 1) or (2, 2). This can become much worse if you're awaiting in the middle of filling a data structure.
Feel free to replace `await async {}` with querying your database or a remote API and `global.a` with any variable or data structure that can be shared between the tasks.
This example is, of course, a toy example, but I've had to debug sophisticated code that broke non-deterministically because of such behaviors. The source code of the front-end of Firefox (written in JavaScript) is full of hand-rolled kinda-Mutex implementations to avoid these race conditions.
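The pseudocode above maps directly onto Python's asyncio (used here only as a runnable stand-in, since the thread is about Async Ruby and JavaScript; `asyncio.sleep(0)` plays the role of `await async {}`):

```python
import asyncio

state = {"a": 0}   # shared mutable state, like `global.a` above
printed = []

async def task1():
    state["a"] = 1
    await asyncio.sleep(0)       # yield back to the event loop, like `await async {}`
    printed.append(state["a"])   # if you're unlucky, it has changed

async def task2():
    state["a"] = 2
    await asyncio.sleep(0)
    printed.append(state["a"])

async def main():
    await asyncio.gather(task1(), task2())

asyncio.run(main())
```

With CPython's default loop this particular interleaving is deterministic: task1 set `a` to 1 but prints 2, i.e. the (2, 2) outcome above. A real i/o call in place of `sleep(0)` would make the outcome depend on response timing.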
Thank you for the clarification. You are right, these types of race conditions are possible with Async Ruby.
I find these races relatively easy to spot: yes, global state CAN change when you yield back to the reactor.
IMO thread race conditions are much, much worse. Global state can change AT ANY POINT, because the thread scheduler preemptively switches threads. Here's an example:
global.a = 0
thread {
global.a = 1
print(global.a) // a is 1 or 2
}
thread {
global.a = 2
print(global.a) // a is 1 or 2
}
print(global.a) // a is 0, 1 or 2
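The same sketch as runnable Python, using stdlib `threading` as a stand-in for the pseudocode. Because preemption can happen anywhere, each thread's read can see either write, and the main thread's read races with both:

```python
import threading

state = {"a": 0}
seen = []

def writer(value):
    state["a"] = value       # the other thread may overwrite this at any point
    seen.append(state["a"])  # so this is 1 or 2, depending on preemption

t1 = threading.Thread(target=writer, args=(1,))
t2 = threading.Thread(target=writer, args=(2,))
t1.start()
t2.start()
snapshot = state["a"]  # 0, 1 or 2: the main thread races with both writers
t1.join()
t2.join()
```

Unlike the async version, there is no marker in the source telling you where the interleaving can happen.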
> I find these races relatively easy to spot: yes, global state CAN change when you yield back to the reactor.
It's good that you can find them easily. In my experience, these changes can creep into your code stealthily and only hit you from behind, months later, when the winds shift.
Some of my traumas include:
- global state that ends up captured by a closure while you didn't realize it was (perhaps because the code you wrote and the code the other dev wrote accidentally shared state);
- mutating hashtables/dictionaries while walking them, without anything in your code looking suspicious;
- and of course accidentally mixing concurrency between two reactors, but luckily, that's not something that happens every day.
> IMO thread race conditions are much, much worse.
IMO, thread race conditions are a case of "it's bad but you knew what you'd get when you signed up for it", while async race conditions are a case of "don't worry, everything is going to be fine (terms and conditions apply, please contact your lawyer if you need clarifications)".
This isn't really an issue with threads specifically, though; the exact same issue is present in green thread/fiber implementations. It just so happens that in Async Ruby the GIL saves you from this specific problem by making variable accesses atomic (as I understand it, anyway; I'm not super familiar with the Ruby VM).
In general, green threads/fibers are vulnerable to the exact same shared-memory issues as threads. Their benefit is that, for a certain class of problems, they are a more efficient concurrency primitive than threads, because they avoid context switches out of userspace; in many cases they also let you plug in your own scheduler, so you can optimize scheduling for your own workload.
If you remove the sleep there's no race because the fiber scheduler never hits a blocking call anywhere, but the sleep stands in for some kind of i/o. And then there's the terrible global variable.
(Conversely, this does show how badly you have to write code to get race conditions with async. The sleep there is important to generating the race, which would not be necessary with threads.)
And I think if you remove the sleep it doesn't race due to the GIL and single-threadedness of ruby for pure userspace computations?
You avoid race conditions from writing to the same variable, but you still can't avoid race conditions where the critical section is longer than one statement.
e.g. in JS:
foo = await fetch()
bar = await fetch()
doSomething(foo, bar)
You can't be certain foo isn't modified while the second fetch is executing (assuming foo is globally scoped)
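A runnable illustration of that same two-await window, sketched in Python asyncio with a stubbed `fake_fetch` (a hypothetical name standing in for a real network call):

```python
import asyncio

shared = {"foo": None}  # globally scoped, like `foo` in the JS sketch

async def fake_fetch(value):
    await asyncio.sleep(0)  # stand-in for real network i/o
    return value

async def worker():
    shared["foo"] = await fake_fetch("A")
    bar = await fake_fetch("B")      # while this await is pending...
    return shared["foo"], bar        # ...shared["foo"] may have changed

async def interloper():
    await asyncio.sleep(0)           # let worker finish its first fetch...
    shared["foo"] = "clobbered"      # ...then overwrite it mid-critical-section

async def main():
    results = await asyncio.gather(worker(), interloper())
    return results[0]

foo, bar = asyncio.run(main())
```

Under CPython's default loop, `worker` returns `('clobbered', 'B')`: it wrote "A" to `shared["foo"]`, but the interloper ran during the second await.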
There’s still an important difference in that you’re yielding control explicitly by using await, instead of being preemptable everywhere. That’s good enough in most cases.
The problem with function coloring is that it tends to proliferate throughout APIs (since you need to be async to call async), so as things evolve you find yourself needing to await the majority of function calls… this makes it kind of tough to avoid awaiting during critical paths.
That said I still agree that it’s easier to manage than Threads, but something tells me we’re still going to want structured-concurrency versions of the common primitives like rw locks and mutexes, even in async environments.
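As a sketch of that last point: asyncio already ships an async-aware mutex, and holding it across the whole multi-await section restores the invariant (`fake_fetch` is again a hypothetical stub for real i/o):

```python
import asyncio

shared = {"foo": None}

async def fake_fetch(value):
    await asyncio.sleep(0)  # stand-in for real i/o
    return value

async def worker(lock, value):
    async with lock:  # the whole multi-await section is now exclusive
        shared["foo"] = await fake_fetch(value)
        bar = await fake_fetch(value.lower())
        return shared["foo"], bar  # still `value`: no other holder interleaved

async def main():
    lock = asyncio.Lock()  # created inside the running loop
    return await asyncio.gather(worker(lock, "A"), worker(lock, "B"))

results = asyncio.run(main())
```

The lock doesn't remove the yield points; it just guarantees that no other task holding the same lock can touch `shared` between them.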
In any language, without dedicated primitives or special conditions, avoiding race conditions or unexpected value changes is largely contingent on preventing thoughtless non-read usage of "shared" memory.
The problem is that it's often relatively easy to make unsafe writes without realizing, especially since the guarantees of "safe" primitives can be misunderstood. And of course many people don't realize that async code can have race conditions because they don't really understand the details of how async even works, or that some languages make extra guarantees that they're unknowingly depending on.
Having candidates explain the difference between parallel and asynchronous has been a relatively effective first level interview screener for me, especially with more junior roles.
Goroutines are just (lightweight) threads - they can run in parallel. You need locks whenever you access shared data.
AFAIK Ruby’s fibers are cooperatively scheduled and can only yield at function calls. Code that doesn’t make a function call (e.g. incrementing an integer) is safe to run on multiple fibers without locks.
For comparison, Python’s stackless coroutines can safely do anything except `await` without requiring locks.
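That guarantee is easy to demonstrate in Python (a toy sketch, not Async Ruby): an increment with no `await` in it never loses updates across tasks, while splitting the read and the write around an `await` does:

```python
import asyncio

counter = 0

async def incr_atomic(n):
    global counter
    for _ in range(n):
        counter += 1  # no await: no other coroutine can interleave here

async def incr_racy(n):
    global counter
    for _ in range(n):
        tmp = counter
        await asyncio.sleep(0)  # another task can run here...
        counter = tmp + 1       # ...so this write can lose its update

async def run_pair(fn):
    global counter
    counter = 0
    await asyncio.gather(fn(100), fn(100))
    return counter

atomic_total = asyncio.run(run_pair(incr_atomic))  # 200: nothing lost
racy_total = asyncio.run(run_pair(incr_racy))      # < 200: lost updates
```

The racy variant is contrived, but it is exactly the read-modify-write-around-an-await shape the earlier `fetch` example warns about.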
1. Check for existence of row by a key in a Postgres table
2. If not present, create one
You can have a race condition between steps 1 and 2. You could do the insertion itself as the check to avoid this issue. But regardless, this is a race condition that you need to think about in async environments.
True, and this is why database libraries/ORMs can be dangerous: the "normal" db way is to wrap steps 1 and 2 in a transaction. Unfortunately, while transactions are often available, many db interfaces lure the unwary programmer away from using them, because reading/writing the db "looks like" reading/writing a variable/object.
The same thing could happen with any other form of concurrency. In this case you could wrap the two statements in a transaction with the appropriate isolation level. Not sure what it is called — read consistent? Snapshot?
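A minimal sketch of the two patterns, using stdlib `sqlite3` as a stand-in for Postgres (the `ON CONFLICT ... DO NOTHING` syntax matches Postgres's; SQLite needs version 3.24+ for it):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (key TEXT PRIMARY KEY)")

def racy_ensure(key):
    # Steps 1 and 2 from above: another task can insert between them.
    row = conn.execute("SELECT 1 FROM users WHERE key = ?", (key,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO users (key) VALUES (?)", (key,))

def atomic_ensure(key):
    # The insert itself is the check: the unique constraint arbitrates.
    conn.execute(
        "INSERT INTO users (key) VALUES (?) ON CONFLICT(key) DO NOTHING", (key,)
    )

atomic_ensure("alice")
atomic_ensure("alice")  # second call is a no-op, not an error
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The racy version needs a transaction with an appropriate isolation level to be safe; the atomic version pushes the check into the database's own constraint machinery, so there is no window between check and insert.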