To be entirely fair, there's arguably little overlap between devs that stick to ...

fulafel · on Oct 6, 2023

Yes, especially since the use case for virtual threads is so niche.

weatherlight · on Oct 6, 2023

Is this sarcasm, or just generally true in the java community?

bcrosby95 · on Oct 6, 2023

I can't think of a project I've worked on in the past 10 or so years that couldn't benefit from virtual threads. Right-sizing your thread pools in an environment with a lot of IO is/was a pain in the ass.

Many of the core concurrency constructs in Clojure have separate functions for when you are doing something that is blocking vs not blocking. If it were all virtual threads this distinction generally wouldn't matter.

Alupis · on Oct 6, 2023

Not just that - the kicker is everything that already exists (libraries, JDBC drivers, etc) is now just automagically non-blocking too if you seed the thread pool with a virtual thread factory.

I can't remember the last time such a transparent, drop-in-able change was so impactful for just about everything.

While it doesn't remove the need for the event/reactive/async paradigm in all cases, it does remove the need for async code just for the sake of being non-blocking.

Code can now be written much more clearly, and still be non-blocking. That's huge.
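
A minimal sketch of what seeding a thread pool with a virtual thread factory might look like from Clojure, assuming JDK 21+. The names `blocking-call` and `run-on-virtual-threads` are hypothetical and not from any library; the sleep is only a stand-in for a blocking JDBC or HTTP call.

```clojure
(import '(java.util.concurrent Executors Future))

(defn blocking-call
  "Hypothetical stand-in for a blocking JDBC or HTTP call."
  [id]
  (Thread/sleep 100)
  id)

(defn run-on-virtual-threads
  "Runs n blocking calls, one virtual thread per task, and returns
  their results in submission order. Assumes JDK 21+."
  [n]
  (let [factory  (.factory (Thread/ofVirtual))
        executor (Executors/newThreadPerTaskExecutor factory)]
    (try
      ;; Clojure fns implement Callable, so they can be submitted as-is.
      (let [tasks   (mapv (fn [i] (fn [] (blocking-call i))) (range n))
            futures (.invokeAll executor tasks)]
        (mapv (fn [^Future f] (.get f)) futures))
      (finally
        (.shutdown executor)))))
```

The calling code stays plain blocking style; only the executor's thread factory changes.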

fulafel · on Oct 6, 2023

Neither for me (I'm in the Clojure camp), it just seems to me there aren't many situations where you'd need millions of threads. Plus we've got core.async to reach for to multiplex many control flows on one OS thread - though I feel it's more needed when targeting JS (ClojureScript) and the single-threaded world there.

weatherlight · on Oct 6, 2023

I'm coming at this from an Erlang/Elixir background, where we basically have "virtual threads", but they aren't at the OS level. It's easy to use, and making things concurrent (or parallel) is often trivial. Whether it's 1 extra virtual thread or millions of virtual threads, the code (and the horizontal scaling of that code) is the same.

raspasov · on Oct 7, 2023

I love core.async (been using it since... 2014 I think... both Clojure and ClojureScript), and yes, core.async/go basically was/is solving (some) of the same problems. But there's a number of gotchas around (go ...) blocks...

Without being an expert on the core.async internals, I believe core.async can potentially benefit tremendously from this, by virtue of being a powerful and super-elegant API for communicating between different parts of an application (typically within the same JVM process). Now it can continue to do that, while (likely) being free from most macro gotchas...

raspasov · on Oct 7, 2023

It's niche if you don't have much traffic/request volume/etc.

Once you have even an arguably relatively small amount (100s to 1000s of requests per second) it can be a game changer in terms of efficiency.

fulafel · on Oct 7, 2023

Are there published case studies or other empirical evidence available with numbers about this?

OS threads in Linux are fast and you can have a lot of them. Eg https://news.ycombinator.com/item?id=37621887

raspasov · on Oct 7, 2023

Just try starting ~100,000 threads that each sleeps for ~60 seconds in your JVM and check if you can succeed! :)

```
(run! (fn [x] (future (Thread/sleep 60000))) (range 100000))
```

(hint: likely not...)

TL;DR: Efficiency difference: you can think of virtual threads as being about 1,000 times (3 orders of magnitude) less expensive, in the general case. Exception: if you are doing CPU-only work, regular threads will be better (that's not how most web servers/services operate).

But if you're waiting 100s of ms for your database (or any network) to respond and you have many of those (blocking) method/function calls in flight... virtual threads are the way to go in terms of efficiency.

Great video explaining all of this (at timestamp, all 30 min is worth watching):

17:05 Making threads less expensive: by how much? https://www.youtube.com/watch?v=5E0LU85EnTI&t=1025s

fulafel · on Oct 7, 2023

On a wimpy old desktop running lots of other stuff I got 75000 threads with that snippet (I had to increase the max_map_count tunable first with sysctl -w vm.max_map_count=500000, a knob well documented for bigger thread counts). In a real-world use case with that much concurrency (such as the "100k threads frequently waiting for 100ms db queries" scenario), I'd be using a bigger machine, and there'd also be actual application context data and TCP connection state dwarfing the thread memory requirements in those blocked contexts. So I'd still say virtual threads solve quite rare use cases, pending stronger evidence.

I don't doubt virtual threads are efficient when microbenchmarked against OS threads, but there's only so much to be gained when optimizing a part of the system that wasn't a bottleneck to begin with.

(Also I hope in the scenario we have a beefy DB that can exploit the concurrency available in this many pending concurrent queries per client, and isn't just putting them in a queue! I guess it could, maybe it has 1000 read replica servers, each with 100 attached SSDs or 100 cores serving from in-memory data.)

raspasov · on Oct 8, 2023

Right... but using virtual threads I got 1,000,000 (in a second or two...) and likely for a fraction of the memory. And I can even get many millions without a problem.
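
For comparison with the `future`-based snippet earlier in the thread, here is a rough sketch of a virtual-thread version, assuming JDK 21+. The function name `sleep-on-virtual-threads` is made up for illustration, and this is not necessarily the exact code used to get the number above.

```clojure
(import '(java.util.concurrent.atomic AtomicLong))

(defn sleep-on-virtual-threads
  "Starts n virtual threads that each sleep for ms milliseconds,
  waits for all of them, and returns how many completed.
  Assumes JDK 21+."
  [n ms]
  (let [done    (AtomicLong.)
        threads (mapv (fn [_]
                        ;; Clojure fns implement Runnable.
                        (Thread/startVirtualThread
                          (fn []
                            (Thread/sleep (long ms))
                            (.incrementAndGet done))))
                      (range n))]
    ;; Block until every virtual thread has finished.
    (run! (fn [^Thread t] (.join t)) threads)
    (.get done)))
```

Something like `(sleep-on-virtual-threads 1000000 60000)` would be the analogue of the experiment above; actual time and memory use depend on the machine and heap settings.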

I agree that for a low volume typical CRUD-only server that never bursts above a few dozen requests per second that's not an issue.

Messaging is one special "rare" case.

Imagine a messaging channel in a Slack/Discord/Telegram/WhatsApp-type app, where there are thousands of participants (say 10,000) in one channel. 1 person posts a message...

9,999 messages are generated (one for each of the other participants). One might say, that can be handled with a few machines...

Now imagine a person posts something super interesting to the channel where 100 people respond with an emoji reaction almost instantly.

Now you have 100 * 9999 = 999,900 (!) messages, all within a few seconds! And this is for just one message _reaction_. Each one of those messages likely involves multiple steps of IO (sending/receiving data over the socket, writing to one or multiple databases...). So a thread might be blocked for hundreds of milliseconds at various times, which would ultimately likely require 100-1000x more servers to operate as scale increases.

The thing is that you _can_ write a program efficiently using regular threads by using asynchronous techniques. Another name for such a style of programming is _callback hell_ :) . And when something does go wrong, it is a lot harder to debug.