The article talks about IO, but then benchmarks sha256. The node program doesn't work well because he's synchronously doing this:
for (var i = 0; i < n; i++) { sha256(data); }
He should be using the built-in crypto module and its async APIs. This benchmark is really just measuring the speed of each platform's sha256 implementation, which I would guess is roughly equal on all 4 platforms if each one is done correctly.
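For what it's worth, here is a rough sketch of the difference, not the article's actual code. The helper names `hashOnce` and `hashManyAsync` are made up for illustration, and the promise-based variant assumes a Node version that exposes `crypto.webcrypto` (15 or later, as far as I know).

```javascript
const crypto = require('crypto');

// Native sha256 via the built-in crypto module. This call is synchronous,
// so running it in a tight loop still occupies the event-loop thread.
function hashOnce(data) {
  return crypto.createHash('sha256').update(data).digest('hex');
}

// Promise-based variant using Web Crypto (assumption: crypto.webcrypto exists).
// Each digest is awaited, so the loop is not one long synchronous block.
async function hashManyAsync(data, n) {
  let last = null;
  for (let i = 0; i < n; i++) {
    const buf = await crypto.webcrypto.subtle.digest('SHA-256', data);
    last = Buffer.from(buf).toString('hex');
  }
  return last;
}
```

Either way the hashing itself is CPU work, so a benchmark built around it mostly measures the hash implementation rather than the I/O model.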
Pet peeve: tech articles/blogs which do not prominently display when the article was written. This is the single most important fact I search for before reading time-sensitive tech info.
I can't tell when this was written. Am I missing something?
Yeah, super annoying. They likely have some content marketing campaign that shares old posts at various intervals.
Looks like it was published 6 months ago - 'Thu, 11 May 2017 09:14:20 -0400' to be precise.
Source: The /rss feed for the blog, which shows the following for this post.
<item>
<title>Server-side I/O Performance: Node vs. PHP vs. Java vs. Go</title>
<description>Understanding the Input/Output (I/O) model of your application can mean the difference between an application that deals with the load it is subjected to, and one that crumples in the face of real-world uses cases. Perhaps while your application is small and does not serve high loads, it may matter far less. But as your application’s traffic load increases, working with the wrong I/O model can get you into a world of hurt.</description>
<pubDate>Thu, 11 May 2017 09:14:20 -0400</pubDate>
<link>https://www.toptal.com/back-end/server-side-io-performance-node-php-java-go</link>
<guid isPermaLink="false">server-side-io-performance-node-php-java-go</guid>
<dc:creator>BRAD PEABODY, DEVELOPER @ TOPTAL</dc:creator>
<media:content url="https://uploads.toptal.io/blog/post_image/120386/02-facebook-1200x627-494cb5ae75d16fa39251cb8b7d6ae877.jpg" medium="image"/>
</item>
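If anyone wants to do the same check on another post, a small sketch like this pulls the date out of an RSS item. `extractPubDate` is a hypothetical helper, the regex is a shortcut rather than a real XML parser, and it assumes the feed uses the usual RFC 822 style date that Node's `Date` can parse.

```javascript
// Extract the <pubDate> value from an RSS <item> string and parse it.
// Returns a Date, or null when no <pubDate> element is present.
function extractPubDate(itemXml) {
  const match = /<pubDate>([^<]+)<\/pubDate>/.exec(itemXml);
  return match ? new Date(match[1]) : null;
}

const item = '<item><pubDate>Thu, 11 May 2017 09:14:20 -0400</pubDate></item>';
const published = extractPubDate(item);
console.log(published.toISOString()); // prints 2017-05-11T13:14:20.000Z
```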
Nice sleuthing! I'm inclined to agree with your assessment about content marketing campaigns being regurgitated over time. Seems they are explicitly choosing to not show the article date in an effort to get more traffic.
Same happened to me. And there was no way to close it. I did a hard refresh and it loaded again without the popup - I'm guessing it's on a time delay? Either way, super annoying.
Disable javascript and you won't see any ads. Sometimes you don't see anything at all, true, but in those cases the close- or back-button quickly solves your problems.
I'm curious, which Java servers are creating a new Thread per request?
Tomcat uses acceptor threads and a request executor pool. And, if available on your platform (which it probably is), it defaults to using non-blocking IO instead of polling.
EDIT: It looks like he does acknowledge the executor threads are pooled. His main criticism is that "too many blocked threads are bad for a scheduler". But if Tomcat is using the NIO connector this doesn't apply, because your executor threads won't block on IO. And typically the pool size is limited to something manageable by a modern CPU (200 or so).
Yeah, that perplexed me as well. There's no reason you can't get a modern Java app server to do async IO coupled with thread pools and achieve performance close to Go. Go's goroutines versus Java's native threads may give Go a slight advantage, but it shouldn't be a great difference for sanely designed applications.
Ummmm when you care about I/O performance and you're building anything of substance (something more than what nginx can do on its own), why would you choose PHP? The security, fragility and un-maintainability alone are enough reasons to avoid it.
I'm not promoting PHP's use, just pointing out how benchmarking against a years-old version of anything is pretty useless. Which is another indicator the article was of low value and interest, providing no real, meaningful insight into those different languages.
I visited the post fully expecting the author would benchmark the latest PHP 7.2. To my surprise, he wasn't even benchmarking 7.0, but the old 5.6. I can't take this test seriously.
As for the article itself: pretty graphs and a good explanation at first, but extremely flawed in how it runs the actual tests against these technologies. Wouldn't recommend.
The example of async I/O in Node.js is with a filesystem operation.
How does that work? AFAIK Linux, unlike Windows, doesn't have a proper async API for filesystem I/O.
POSIX AIO is implemented using threads in libc and Linux AIO (io_submit, etc) only works with O_DIRECT (no kernel caching) and there are other issues depending on underlying filesystem driver.
> The libuv filesystem operations are different from socket operations. Socket operations use the non-blocking operations provided by the operating system. Filesystem operations use blocking functions internally, but invoke these functions in a thread pool and notify watchers registered with the event loop when application interaction is required.
I'm not familiar with libuv, but is there a dedicated "watcher" thread in a thread pool that the kernel wakes up when the disk driver has completed the I/O that it was sleeping on? If so, is there a dedicated "watcher" thread for each non-blocking I/O request handled by libuv?
There is no watcher: threads get parked by the kernel as "not runnable" while they perform the I/O operation.
When the operation is finished, then they become runnable again so the kernel scheduler runs them--that is, it runs the code that comes after the I/O operation.
This code (from libuv) notifies the requester that the operation is complete.
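Seen from the Node side, that flow can be sketched roughly like this. `readOrderDemo` is a hypothetical helper written for this comment, and the temp file name is arbitrary. The point is only that `fs.readFile` returns immediately and the callback fires later, once the blocking read has finished on a pool thread.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Records the order of events around an async filesystem read.
// Resolves with the list of events once the read callback has run.
function readOrderDemo() {
  const file = path.join(os.tmpdir(), `libuv-demo-${process.pid}.txt`);
  fs.writeFileSync(file, 'hello');
  const events = [];
  return new Promise((resolve, reject) => {
    // The read is handed off; the calling thread does not wait for it.
    fs.readFile(file, 'utf8', (err, text) => {
      fs.unlinkSync(file); // clean up the temp file
      if (err) return reject(err);
      events.push(`callback:${text}`);
      resolve(events);
    });
    // Runs before the callback, because readFile returned right away.
    events.push('submitted');
  });
}
```

Calling `readOrderDemo()` should resolve to `['submitted', 'callback:hello']`. The pool defaults to 4 threads and can be resized with the `UV_THREADPOOL_SIZE` environment variable.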
Thanks. So is the libuv package implemented as two distinct components then consisting of:
1) The main event loop thread.
2) A thread pool of "workers" which make the actual I/O requests and sleep while waiting for the I/O. Then, when a "worker" thread is set to runnable again by the kernel (because its I/O request has been completed), it notifies the main event loop thread via a signal?
While still suffering from the problem of benchmarks not reflecting real world performance, The Computer Language Benchmarks Game is an excellent resource.
He's using the slowest and oldest framework on Java which still is preferable to use if you want things easy to read and maintain. But if you're going for raw I/O performance and already taking the readability and maintenance challenges with the other platforms, then he really should be using Netty, Vert.x or Akka for Java.
No, it's not the only thing that matters. Java's compiled too and was included here, but Go still beat it out.
It's likely that C or Rust or C++ would be much faster than Java or Go, since Java and Go both carry large runtime overheads; it turns out that having a runtime, and how heavy it is, matters too, not just whether the language is compiled.
There are also cases where non-compiled languages can beat out compiled ones. For example, Lua with LuaJIT is incredibly fast, and beats out compiled languages with heavy runtimes (like Java/C#) in quite a few microbenchmarks.
Java is compiled to bytecode, which is then interpreted by a virtual machine. It's not compiled to native code, which is why you cannot distribute Java programs as executables like Go.
Is there a way to bundle the JVM (or the parts you need) with the Java app you'd like to run? Erlang (and Elixir) do this with the Erlang VM (BEAM), allowing you to bundle VM+app as a release. The Erlang release must be compiled for the platforms on which that release will be run. It's not quite as elegant as a Go binary, but it's still pretty handy.
Java has a specification and has lots of implementations.
The majority of commercial implementations of Java have native code compilers, and Oracle is in the process of improving the AOT compiler introduced in Java 9.