Hacker News new | past | comments | ask | show | jobs | submit login

> Synchronous I/O by default means you're wasting your CPU cores

When a process becomes blocked (like due to synchronously waiting for IO to complete), the kernel will context switch away to another runnable process. The process that becomes blocked essentially goes to sleep, and then "wakes up" and becomes runnable again once the operation completes. The CPU is free during this time to run other processes.

If you need to do multiple different of these things concurrently, then you can run multiple processes. Writing a single process with async code won't make that process faster. To do more things at the same time you can run multiple processes. Context switching between different processes is what the kernel scheduler is designed to do, and it does so very efficiently. There isn't much overhead per thread. If I recall correctly, Linux kernel stacks per thread are 8 kilobytes (with efforts under way to reduce that further [4] - also discussed in [1]), and the user stack space is something the application can control and tune. The memory use per thread needn't be much.

Using all available cores to perform useful work is the most important thing to achieve in high-throughput code, and both async and sync can achieve it. Async doesn't become necessary for high performance unless you're considering very high performance which is beyond the reach of NodeJS anyway [2]. Asynchronous techniques win on top performance benchmarks, but typical multithreaded blocking synchronous Java can still handily beat async NodeJS, since Java will use all available CPU cores while Node's performance is blocked on one (unless you use special limited techniques). There's some good discussion about this in the article and thread about "Zero-cost futures in Rust" [1]. The article includes a benchmark which compares Java, Go, and NodeJS performance. These benchmarks suggest that the other tested platforms provide 10-20x better throughput than Node (they're also asynchronous, so this benchmark isn't about sync/async).

Folks might also be interested in the TechEmpower web framework benchmark [2]. The top Java entry ("rapidoid") is #2 on the benchmark and achieves 99.9% of the performance of the #1 entry (ulib in C++). These frameworks both achieve about 6.9 million requests per second. The top Java Netty server (widely deployed async/NIO server) is about 50% of that, while the Java Jetty server, which is regular synchronous socket code, clocks in at 10% of the best or 710,000 R/s. NodeJS manages 320,000 R/s which is 4.6% of the best. In other words, performance-focused async Java achieves 20x, regular asynchronous Java achieves 10x, and boring old synchronous Jetty is still 2x better than NodeJS throughput. NodeJS does a pretty good job given that it's interpreted while Java is compiled, though Lua with an Nginx frontend can manage about 4x more.

I agree that asynchronous execution can provide an advantage, but it's not the only factor to consider while evaluating performance. If throughput is someone's goal, then NodeJS is not the best platform due to its concurrency and interpreter handicap. If you value performance then you'll chose another platform that offers magnitude better requests-per-second throughput, such as C++, Java, Rust, or Go according to [1] and [2]. Asynchronous execution also does not necessarily require asynchronous programming. Other languages have good or better async support -- for example, see C#'s `await` keyword. [3] explores async in JavaScript, await in C#, as well as Go, and makes the case that Go handles async most elegantly of those options. Java has Quasar, which allows you to write regular code that runs as fibers [5]. The code is completely normal blocking code, but the Quasar runtime handles running it asynchronously with high concurrency and M:N threading. Plus these fibers can interoperate with regular threads. Pretty gnarly stuff (but requires bytecode tampering). If async is your preference over Quasar's sync, then Akka might be up your alley instead [6].

> By using the 'co' library and generators you can write async code almost as if it were sync.

For an interesting and humorous take on the difficulties of NodeJS's approach to async, and where that breaks down in the author's opinion, see "What color is your function?" [3].

[1] https://news.ycombinator.com/item?id=12268988

[2] https://www.techempower.com/benchmarks/#section=data-r11&hw=...

[3] http://journal.stuffwithstuff.com/2015/02/01/what-color-is-y...

[4] https://lwn.net/Articles/692208/

[5] http://docs.paralleluniverse.co/quasar/ [6] http://akka.io/




Node isn't interpreted, it runs on V8 which is a fairly advanced JIT compiling VM.

The main reason Node is so slow is that, as you point out, it doesn't do threading at all, and JavaScript is just inherently difficult to optimise into machine code, even though V8 makes a good effort.


The benchmark code uses multiple processes to utilize all CPU cores (standard way in Node.js land).

I would blame the highly dynamic nature of the language for (relative) slowness – the price you pay for flexibility / quickly-writable code. That being said, it's still fast for dynamically typed language.


What do you think a VM is?

It does not target your hardware architecture. Just because your code is in binary instead of text, it does not become less of an interpreter.


https://developers.google.com/v8/design

C-f for "Dynamic Machine Code Generation"

V8 compiles it to native machine code for execution. It's not an interpreter or a VM with its own bytecode.


Cool. There is no VM, just a runtime not much different from the ones from other compiled languages. It compiles your code piece by piece, and optimizes during that compilation.

This is a very interesting solution for JavaScript, but I think it's constrained by the language and could become better if it were fully compiled.


Perhaps for the use-case of node, being fully compiled could be a bit better. But I'm not convinced. Java has demonstrated (over two decades now) that initial compilation to machine code isn't essential for performant software (ok, less than two decades for hotspot JITs in Java). The cost for node is in startup time, but this is, like with Java, amortized over the life of the program. For a single-run executable, it's perhaps too costly (but we use truly interpreted languages for this as well, so depends on the task), but for a server, it's potentially pretty good.


Yes, that was for the specific case of Node. For web use, full compilation is a clear loss.

Java can make great use of JIT exactly because it's interpreted. When you make partial compilations, you lose a lot of JIT opportunities, and a lot of static optimization opportunities too. Yet, it looks like V8 attenuated this somehow, and got good performance anyway. Must have been a feat.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: