
There may be some huge piece I'm missing, but how exactly is the "starting multiple async tasks wrapped in blocks and waiting for them to finish at the end of the main Async block" different from "starting multiple threads wrapped in blocks and manually collecting them at some point"? I thought Ruby did release the GIL when a thread is blocked (waiting for IO etc).



The difference is just that fiber overhead is lower so you can run more fibers than threads on a given system. Even though fibers have been around a while, people rarely used them because they are cooperative rather than preemptive, so you had to manually write the scheduling logic. Much easier to just use threads.
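For context, plain fibers look like this (core Ruby, no gems); you resume and yield them by hand:

    fiber = Fiber.new do
      puts "step 1"
      Fiber.yield      # hand control back to whoever resumed us
      puts "step 2"
    end

    fiber.resume  # prints "step 1", returns at the yield
    fiber.resume  # prints "step 2"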

I think the big breakthrough for Ruby Async is that the fiber scheduler in Ruby 3.0 makes it possible for the runtime to manage fibers in a less manual way, so you now get the lightweight option more easily. The Async gem seems to wrap all that up in a very nice interface, so you can write simple code and get good concurrency without much effort.
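A hedged sketch of what that hook looks like (assuming async 2.x, where the gem's scheduler class is Async::Scheduler; earlier versions expose it differently):

    require 'async'

    # Install a fiber scheduler for the current thread; Ruby 3.0+ API.
    Fiber.set_scheduler(Async::Scheduler.new)

    Fiber.schedule do
      sleep 1  # with a scheduler installed, this parks the fiber, not the thread
      puts "woke up without blocking the thread"
    end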


- Async Ruby is much more performant than threads. There are fewer context switches, enabled by the event reactor. The performance benefits are visible in simple scenarios like making a thousand HTTP requests (see the sketch after this list).

- Async is more scalable and can handle millions of concurrent tasks (like HTTP connections). There can only be a couple thousand threads at the same time.

- Threads are really hard to work with - race conditions everywhere. Async doesn't have this.
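As a hedged illustration of that first point, concurrent HTTP requests with the async and async-http gems look roughly like this (method names per async-http's documented Internet interface):

    require 'async'
    require 'async/http/internet'

    urls = Array.new(1000) { |i| "https://example.com/?page=#{i}" }

    Async do
      internet = Async::HTTP::Internet.new
      tasks = urls.map do |url|
        Async { internet.get(url).read }  # each request runs in its own fiber
      end
      bodies = tasks.map(&:wait)          # wait for all responses
    ensure
      internet&.close
    end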


> - Threads are really hard to work with - race conditions everywhere. Async doesn't have this.

Having worked a lot with various flavours of async (I was one of the many people in the loop for the design of DOM Promise and JavaScript async), I regret that, while many developers believe it, *this is generally false*.

In languages with a GIL or run-to-completion semantics, of course, you get some degree of atomicity, which is a nice property. However, regardless of the language, once you have async, you have race conditions and reentrancy issues, often without the benefit of standard tools (e.g. Mutex, RwLock) to solve them [1].

Ruby's async syntax and semantics look neat, and I'm happy to see this feature, but as far as I can tell from the examples in the OP, they're going to have these exact same issues.

[1] Rust is kind of an exception, as its type system already forces you to guard anything that could suffer from data races behind &mut/Mutex/RwLock/... (or mark the code as unsafe), even in async code. But even that exists because of the race conditions and reentrancy issues mentioned above.


> I regret that this is generally false. Once you have async, you have race conditions and reentrancy issues

Can you be more specific, please?

My experience is 100% from Ruby where I've worked heavily with threads in the past, and with Async Ruby for the past 18 months.

From what I can tell, threads require a great many locks (mutexes) and are frustratingly hard to get right, because of the language-level race conditions.

Async Ruby has been refreshingly easy to write. I have yet to encounter an example of a race condition.

If it helps to bridge the language gap here: from what I know Async Ruby is similar to Go's goroutine model.


Here's a short toy example in pseudo-syntax.

Task 1:

    global.a = 1
    await async {} // Await a block that does nothing, essentially yielding back to the reactor.
    print(global.a) // If you're unlucky, global.a has changed.

Task 2:

    global.a = 2
    await async {} // Await a block that does nothing, essentially yielding back to the reactor.
    print(global.a)

Now enqueue both tasks.

Depending on scheduling, you could end up printing (1, 1), (1, 2), (2, 1) or (2, 2). This can become much worse if you're awaiting in the middle of filling a data structure.

Feel free to replace `await async {}` with querying your database or a remote API and `global.a` with any variable or data structure that can be shared between the tasks.
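For what it's worth, here is a rough translation into Async Ruby (hedged; I haven't verified it against the gem's exact semantics, and a short sleep stands in for the do-nothing await):

    require 'async'

    $a = 0  # stand-in for any state shared between tasks

    Async do
      Async do          # Task 1
        $a = 1
        sleep 0.001     # stand-in for i/o; yields back to the reactor
        puts $a         # Task 2's write has already landed: prints 2
      end

      Async do          # Task 2
        $a = 2
        sleep 0.001
        puts $a         # prints 2
      end
    end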

This example is, of course, a toy example, but I've had to debug sophisticated code that broke non-deterministically because of such behaviors. The source code of the front-end of Firefox (written in JavaScript) is full of hand-rolled kinda-Mutex implementations to avoid these race conditions.


Thank you for the clarification. You are right, these types of race conditions are possible with Async Ruby.

I find these races relatively easy to spot: yes, global state CAN change when you yield back to the reactor.

IMO thread race conditions are much, much worse. Global state can change AT ANY POINT, because the thread scheduler preemptively switches threads. Here's an example:

    global.a = 0

    thread {
      global.a = 1
      print(global.a) // a is 1 or 2
    }

    thread {
      global.a = 2
      print(global.a) // a is 1 or 2
    }

    print(global.a) // a is 0, 1 or 2


> I find these races relatively easy to spot: yes, global state CAN change when you yield back to the reactor.

It's good that you can find them easily. In my experience, these changes can creep into your code stealthily and only hit you from behind, months later, when the winds shift.

Some of my traumas include:

- global state that ends up captured by a closure while you didn't realize it was (perhaps because the code you wrote and the code the other dev wrote accidentally shared state);

- mutating hashtables/dictionaries while walking them, without anything in your code looking suspicious;

- and of course accidentally mixing concurrency between two reactors, but luckily, that's not something that happens every day.

> IMO thread race conditions are much, much worse.

IMO, thread race conditions are a case of "it's bad but you knew what you'd get when you signed up for it", while async race conditions are a case of "don't worry, everything is going to be fine (terms and conditions apply, please contact your lawyer if you need clarifications)".


This isn't really an issue specific to threads, though; the exact same issue is present in green thread/fiber implementations. It just so happens that in Async Ruby the GIL saves you from this specific problem by making variable accesses atomic (as I understand it anyway, I'm not super familiar with the Ruby VM).

In general, green threads/fibers are vulnerable to exactly the same shared-memory issues as threads. Their only benefits are that, for a certain class of problems, they are a more efficient concurrency primitive than threads (they avoid context switches out of userspace), and that they often let you plug in your own scheduler, allowing you to optimize scheduling for your own workload.


Thank you for sharing your perspective in a non-conflicting way, Yoric.

It seems you have more experience than me with async-related work. I'll keep an eye on this kind of scenario you brought up.


Here's a working race condition with async:

https://gist.github.com/39409838626b22433f8bf1878276b3c1

Prints 1 instead of 1000
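The shape of the code is roughly this (a reconstruction; the gist has the real version):

    require 'async'

    $count = 0

    Async do
      1000.times do
        Async do
          value = $count
          sleep 0.01          # stands in for i/o; yields to the reactor
          $count = value + 1  # every task read 0 before any wrote back
        end
      end
    end

    puts $count  # => 1, not 1000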

If you remove the sleep there's no race because the fiber scheduler never hits a blocking call anywhere, but the sleep stands in for some kind of i/o. And then there's the terrible global variable.

(Conversely, this does show how badly you have to write code to get race conditions with async; the sleep is important to generating the race, and it would not be necessary with threads.)

And I think if you remove the sleep it doesn't race, due to the GIL and the single-threadedness of Ruby for pure userspace computations?


If it did not print 1 then you'd have a problem.

Not really a race condition, just the semantics of async. You'll get the same with goroutines, javascript async/await, and other constructs.


You avoid race conditions from writing to the same variable, but you still can't avoid race conditions where the critical section is longer than one statement.

e.g. in JS:

    foo = await fetch(urlA)
    bar = await fetch(urlB)
    doSomething(foo, bar)

You can't be certain foo isn't modified while the second fetch is executing (assuming foo is globally scoped).


There’s still an important difference in that you’re yielding control explicitly by using await, instead of being preemptable everywhere. That’s good enough in most cases.


The problem with function coloring is that it tends to proliferate throughout APIs (since you need to be async to call async), so as things evolve you find yourself needing to await the majority of function calls… this makes it kind of tough to avoid awaiting inside critical sections.

That said I still agree that it’s easier to manage than Threads, but something tells me we’re still going to want structured-concurrency versions of the common primitives like rw locks and mutexes, even in async environments.


In any language, avoiding race conditions or unexpected value changes without dedicated primitives or special conditions is largely contingent on preventing thoughtless non-read usage of "shared" memory.

The problem is that it's often relatively easy to make unsafe writes without realizing, especially since the guarantees of "safe" primitives can be misunderstood. And of course many people don't realize that async code can have race conditions because they don't really understand the details of how async even works, or that some languages make extra guarantees that they're unknowingly depending on.

Having candidates explain the difference between parallel and asynchronous has been a relatively effective first level interview screener for me, especially with more junior roles.


> Async Ruby is similar to Go's goroutine model.

Goroutines are just (lightweight) threads - they can run in parallel. You need locks whenever you access shared data.

AFAIK Ruby’s fibers are cooperatively scheduled and can only yield at function calls. Code that doesn’t make a function call (e.g. incrementing an integer) is safe to run on multiple fibers without locks.

For comparison, Python’s stackless coroutines can safely do anything except `await` without requiring locks.


This is an issue I've had with NodeJS recently.

    1. Check for existence of row by a key in a Postgres table
    2. If not present, create one
You can have a race getting to step 2: two requests can both pass the check before either inserts. You could do the insertion itself as the check to avoid this issue (see the sketch below). But regardless, this is a race condition that you need to think about in async environments.
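For example, a hedged sketch with the pg gem and a hypothetical users(email) table; doing the insert as the check pushes the atomicity into Postgres:

    require 'pg'

    conn = PG.connect(dbname: "app")  # hypothetical connection details

    # The unique index on email makes check-and-insert a single atomic step.
    conn.exec_params(
      "INSERT INTO users (email) VALUES ($1) ON CONFLICT (email) DO NOTHING",
      ["alice@example.com"]
    )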


True, this is why database libraries/ORMs can be dangerous: the "normal" db way is to wrap steps 1 and 2 in a transaction. Unfortunately, while transactions are often available, many db interfaces lure the unwary programmer away from using them, because reading/writing the db "looks like" reading/writing a variable/object.


You have to consider that in multiprocess scenarios as well, though. All forms of concurrency introduce that problem.


The same thing could happen with any other form of concurrency. In this case you could wrap the two statements in a transaction with the appropriate isolation level. Not sure what it is called — read consistent? Snapshot?


Exactly. At some point you realize that explicit callbacks, async, coroutines, threads are all related.


> I regret that, while many developers believe it, this is generally false.

The strongest evidence of a golden hammer is a lot of red thumbs.


Can you please make or link to one example? I'd be interested to learn more.



> There can only be a couple thousand threads at the same time.

You can easily have tens of thousands of threads on Linux. Beyond 50 000 or so you may need to adjust some settings using `sysctl`, but after that you should be able to push things much further.
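If it helps, the knobs involved are usually along these lines (hypothetical values; check your kernel and distro defaults):

    # Hypothetical values; tune for your workload.
    sysctl -w kernel.threads-max=200000   # system-wide cap on threads
    sysctl -w kernel.pid_max=4194304      # thread IDs are allocated from the PID space
    sysctl -w vm.max_map_count=400000     # each thread stack adds memory mappings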

Task counts themselves are also a pretty useless metric. Sure, you can have many fibers doing nothing. But once they start doing something, that may no longer be the case (this of course depends on the workload).

> Threads are really hard to work with - race conditions everywhere. Async doesn't have this.

You can still have race conditions in async code, as race conditions aren't limited to just parallel operations.


> Task counts themselves are also a pretty useless metric. Sure, you can have many fibers doing nothing.

Sorry if I wasn't explicit. I was talking about tasks/fibers performing actual work, like handling HTTP connections.

It's practically possible for Async Ruby programs to do work with hundreds of thousands of Async tasks (concurrent fibers).

Some users have worked with millions of Async tasks, but I'm not sure if that was practical work or a proof of concept.

> You can easily have tens of thousands of threads on Linux.

Thank you for the correction about threads on Linux. I'm not sure if you're talking about threads in general or threads in Ruby?

I've only lightly tested increasing the number of threads in Ruby to about a thousand, and my test code was very slow.

I think that has to do with thread switching overhead. Threads have a relatively high switching overhead, so I don't think it's advisable to run more than 100 threads - in Ruby at least.


> I think that has to do with thread switching overhead. Threads have a relatively high switching overhead, so I don't think it's advisable to run more than 100 threads - in Ruby at least.

If you run many threads, they all contend for the GIL in order to advance through the bytecode VM.


While true, the same would apply to code using Async blocks. Async does not attempt (or intend) to enable parallel execution of VM byte code.

If the parent commenter saw a speed up in their tests, it’s reasonable to assume that is because of the difference in overhead of switching threads vs fibers.


Fibers don't contend on the GIL.


> race conditions everywhere. Async doesn't have this

What does Async (Fibers underneath) do differently than normal Threads? Using threads to handle concurrent work doesn't immediately bring race conditions, unless the programmer explicitly creates them (accessing the same stuff from different threads).

Fibers themselves AFAIK don't stop you from accessing the same stuff, aside from the obvious guarantee that fiber code runs "atomically" until it hits the next yield (a guarantee that non-blocking Fibers weaken anyway).


> What does Async (Fibers underneath) do differently than normal Threads?

There's a lot to say on this topic.

- Threads implement "preemptive scheduling". A scheduler switches control from one thread to another every 10ms or so. A thread running the code may be ready for the switch or not. The ensuing race conditions are nasty.

- Async + Fibers implement "cooperative scheduling". The currently running fiber (voluntarily) yields control to another fiber when it's ready. The result is there are no race conditions.

There's so much to say about this; I'll blog about it in the future.


> The currently running fiber (voluntarily) yields control to another fiber when it's ready. The result is there are no race conditions.

This is generally untrue, as it assumes the developer knows the ramifications of which methods hit scheduling points and understands what state the global set of potentially pending work might modify while suspended.

You might not have to worry so much about concurrent threads modifying the same memory simultaneously, but you absolutely could have a value change unpredictably during a write-sleep-read.

Note there are environments which try to make it more obvious which methods might result in a suspension, and have the developer acknowledge that so that the code remains understandable/maintainable. The keywords used for this are typically 'async' and 'await'.


Yep… this is why the “thank goodness these functions aren’t colored!” people confuse me. Colored functions are a very good thing, they make it explicit where context switches happen and make understanding async interactions easy.

People who don’t like “colored functions” IMO are similar to folks who don’t like types. They want to be able to change something to be async without a compiler yelling at them to go through all of its call stacks and ensure they can handle the asynchronicity, similar to changing a function to sometimes return “null” and not wanting a compiler to make them verify that all calling code can handle a null.

That being said, I do wish all functions could be colored. The most painful async migrations are when calling code happens in a constructor, which in JS cannot be made async.


There’s value in being able to write code without the ceremony of types or declarations or compulsory exception declarations and the like, just like there’s value in having tools like a REPL.

The problem comes in that it takes a lot of discipline and understanding of both your code and dependencies to successfully manage projects without those safety rails. They are also extraordinarily difficult to add-on later.

I’ve written small reverse proxies in say Node.js which saved me days of time over doing so in something like Rust. I’ve also hit errors in Node.js code which have made me want to give up on technology and live in a cave.


Yes, the sweet spot IMO is gradually typed languages. I love how I can have TS code that 100% does not type check and the compiler will yell at me all it wants but still produce the compiled JS without a problem. It's liberating to say "yes, thanks for letting me know that at the moment this will absolutely not work in X edge case or when called from Y context with Z data, but I don't care about that right now just let me run it as-is to make sure my general approach is correct".

Even better, in large projects when there is a constant drift of dependencies, I often will pull in the latest main for dev work and see that certain modules aren't found or have been updated and are now being called in a way that isn't compatible with my version. I can look at those errors and decide if the relevant changes will impact my area, if so I go through the full rebuild, otherwise I just ignore them and let TS yell at me.

I guess the overall theme is that I like it when the compiler doesn't think it's smarter than me. Compilers that say "no, you can not build this, you must resolve these issues before I will let you proceed" are much more painful to work with than ones that say "hey watch out, this particular area will probably not work as you expect. Feel free to try out the build, but you really ought to fix that before committing"

Edit: looking back on this and my original comment, I see how they're somewhat in opposition! It would seem I like the compiler to forbid me from shooting myself in the foot with concurrency, but not from shooting myself in the foot with types/dependencies.


Cooperative scheduling means that every function call is potentially a scheduling point, so it isn't really a significant change from threads. Sure, if you are careful and only call functions that are not guaranteed to preempt then you are safe, but critical sections make this explicit.

And if you are running code on multiple cores then lack of preemption doesn't help anyway.


I have the same question as parent and I'd love to read more about this distinction. Please do blog about it.


Is the scheduler only using a single OS thread? It's not M:N scheduling?


I assume that unless you have a GIL, most operations on collections are not thread-safe, but they are async-safe, since context switching can only happen where you await.


That's true for stackless coroutines, where you can only yield from the top, but I understand that Ruby has full coroutines/fibers.


Please, someone, correct me if I've misunderstood.

The big difference appears to be that async Ruby does not merely give you an easy sugar to perform the sync-over-async antipattern you have described. The real innovation is that, as far as the user is concerned, Ruby is magically turning blocking methods into non-blocking ones.


That's basically how I'm thinking of things as well. To illustrate a bit further, consider the following:

Given a blocking method call `foo(x)`, I can make it non-blocking by wrapping it in a "thunk" as `λx.foo(x)`.

Where things start to get interesting is when I add another method call `foo(x) + bar(x)`. Now to keep things "async" I need to transform the abstraction into something more like `λx.foo(x) + λx.bar(x)`, and have the `+` call dispatch both fibers and wait for them before performing its operation.
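With the async gem, that manual transformation might look roughly like this (a sketch; foo/bar are hypothetical stand-ins for the blocking calls above):

    require 'async'

    def foo(x)
      sleep 0.1  # stand-in for blocking i/o; non-blocking inside Async
      x + 1
    end

    def bar(x)
      sleep 0.1
      x * 2
    end

    Async do
      foo_task = Async { foo(3) }  # both tasks start and overlap while sleeping
      bar_task = Async { bar(3) }
      puts foo_task.wait + bar_task.wait  # => 10, in ~0.1s rather than ~0.2s
    end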

Doing this automatically seems pretty cool, I'll have to think about this a bit more sometime.


That's pretty accurate.


The difference only shows itself in the real world, when you do a bit more per thread/coroutine and end up mutating shared state. This is where threads can lead to race conditions, whereas coroutines will not (unless you basically ask for it).


It looks like, from a brief glimpse into the source code, that the library is using Ruby Fibers under the hood instead of Threads.

From my limited understanding, the programmer has to be explicit about starting and resuming the work within a Fiber, as opposed to the Ruby VM doing it.


Yes, async is using fibers as a concurrency primitive.

The Async gem starts fibers automatically. Fiber pausing and resuming is handled by the event reactor (also called the "event loop") from the 'nio4r' gem.

io_uring support with asynchronous File IO is also on the way.


So the Async gem allows the programmer to fire "tasks" and wait for them to finish (the same way we can fire threads). But instead of OS-level threads (which is what MRI uses for Threads), it uses a new kind of Fiber, the "non-blocking Fiber". These are lightweight and don't use OS threads, like normal Fibers, but unlike normal Fibers they yield automatically to the scheduler when blocked (sort of like threads).

Is this a correct-ish way to describe the current state of affairs?


I think you're generally on the right track.

Small correction: fibers (blocking or non-blocking) never used threads in any way. Fibers are "stackful coroutines", another concurrency primitive.

They're pretty hard to understand, and unfortunately I can't find a good explanation of "Ruby Fibers" that I can link to.


Awesome! Thanks for the article, can't wait to see this library progress!


Ruby's Global Interpreter Lock (GIL) still applies; it is not "sidestepped" by Async.


Yeah, it's unclear to me whether you'd have to "join" to get the result from an operation, or if you can use some form of "await" expression.


Apparently the tasks are "joined" automatically before the enclosing "Async do" block is allowed to finish.
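You can also wait on an individual task to get its value; as far as I can tell, the async gem's Task#wait returns the block's result (sketch):

    require 'async'

    Async do
      task = Async { 1 + 1 }
      puts task.wait  # => 2; waits for the task and returns its result
    end  # any remaining child tasks are implicitly joined here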


Threads use much more memory than coroutines. Spawning, e.g., 3 threads for each request at 1000 requests per second would probably eat a ton of memory.


Stackful coroutines (often used to implement colorless async) are the same, except you can specify the stack size. You can specify thread stack size yourself as well, though no one does that.

OTOH, a growable stack is useful, as Go demonstrated.



