Server-Side I/O Performance: Node vs. PHP vs. Java vs. Go (toptal.com)
69 points by vgallur on Dec 13, 2017 | 54 comments



The article talks about IO, but then benchmarks sha256. The node program doesn't work well because he's synchronously doing this:

  for (var i = 0; i < n; i++) { sha256(data); }
He should be using crypto and the async versions. This benchmark is actually just measuring the speed of each platform's sha256 implementation, which I would guess is roughly equal across all 4 platforms if you use them correctly.
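
Something along these lines (just a sketch, assuming the payload comes from a file) would at least drive the hashing from async I/O instead of a blocking loop:

    const crypto = require('crypto');
    const fs = require('fs');

    const hash = crypto.createHash('sha256');
    const input = fs.createReadStream('payload.bin'); // hypothetical input file

    // The reads happen asynchronously, so the event loop stays free between chunks.
    input.on('data', (chunk) => hash.update(chunk));
    input.on('end', () => console.log(hash.digest('hex')));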


Where did you even find the source code?



Pet peeve: tech articles/blogs which do not prominently display when the article was written. This is the single most important fact I search for before reading time-sensitive tech info.

I can't tell when this was written, am I missing something?


Yeah, super annoying. They likely have some content marketing campaign that shares old posts at various intervals.

Looks like it was published 6 months ago - 'Thu, 11 May 2017 09:14:20 -0400' to be precise.

Source: The /rss feed for the blog, which shows the following for this post.

    <item>
      <title>Server-side I/O Performance: Node vs. PHP vs. Java vs. Go</title>
      <description>Understanding the Input/Output (I/O) model of your application can mean the difference between an application that deals with the load it is subjected to, and one that crumples in the face of real-world uses cases. Perhaps while your application is small and does not serve high loads, it may matter far less. But as your application’s traffic load increases, working with the wrong I/O model can get you into a world of hurt.</description>
      <pubDate>Thu, 11 May 2017 09:14:20 -0400</pubDate>
      <link>https://www.toptal.com/back-end/server-side-io-performance-node-php-java-go</link>
      <guid isPermaLink="false">server-side-io-performance-node-php-java-go</guid>
      <dc:creator>BRAD PEABODY, DEVELOPER @ TOPTAL</dc:creator>
      <media:content url="https://uploads.toptal.io/blog/post_image/120386/02-facebook-1200x627-494cb5ae75d16fa39251cb8b7d6ae877.jpg" medium="image"/>
    </item>
Source: https://www.toptal.com/developers/blog.rss


Nice sleuthing! I'm inclined to agree with your assessment about content marketing campaigns being regurgitated over time. It seems they are explicitly choosing not to show the article date in an effort to get more traffic.


Thanks. Yes, that would be my guess too. They are aggressively marketing their platform here on HN, and even in some of the HN-based newsletters.


I have closed articles before because there is no date. Not about to waste my time with something that could potentially be years old.


My best guess would be 7 months ago (~ May 11th 2017), since most comments were made at that time.


I was reading the article until a giant fullscreen ad popped up and I had to close the tab. Sorry, no thanks.


Same happened to me. And there was no way to close it. I did a hard refresh and it loaded again without the popup - I'm guessing it's on a time delay? Either way, super annoying.


Disable JavaScript and you won't see any ads. Sometimes you don't see anything at all, true, but in those cases the close or back button quickly solves your problem.


That was technically an ad inside an ad. The article is clearly an example of content marketing.


I'm curious, which Java servers are creating a new Thread per request?

Tomcat uses acceptor threads and a request executor pool. And, if available on your platform (which it probably is), it defaults to using non-blocking I/O instead of the old blocking connector.

EDIT: It looks like he does acknowledge that the executor threads are pooled. His main criticism is that "too many blocked threads are bad for a scheduler". But if Tomcat is using the NIO connector this doesn't apply, because your executor threads won't block on I/O. And typically the pool size is limited to something manageable by a modern CPU (200 or so).
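
For reference, a typical NIO connector entry in server.xml looks something like this (values here are just the illustrative defaults, not anything taken from the article):

    <Connector port="8080"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               maxThreads="200"
               acceptCount="100"
               connectionTimeout="20000" />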


It does seem the author has badly outdated information, or set up the benchmark to tilt the balance in a certain way.

A more comprehensive benchmark is available here: https://www.techempower.com/benchmarks/#section=data-r14&hw=...


Yeah, that perplexed me as well. There's no reason you can't get a modern Java app server to do async I/O coupled with thread pools and achieve performance close to Go. Go's goroutines vs. Java's native threads may give Go a little advantage, but it shouldn't be a great difference for sanely designed applications.


Netty would be one possible option, but he selectively left out everything that could jeopardize Go's victory.


Yeah, the Netty model is more in line with Go's programming model.


So he acknowledges that he has not benchmarked the best options for Java and then selects Go as the winner. What a joke.


Reminds me of a "real world" web language shootout I once read, which included accessing a database populated with a large dataset.

The author used a query along the lines of:

    SELECT * FROM table ORDER BY RAND() LIMIT 1
Except for the language they wanted to win, which was basically written like:

    SELECT * FROM table WHERE id = RAND()
Which, of course, resulted in their favorite being the clear winner.


Same with PHP: he used really old versions of it. This article is a waste of time; nothing is learned from it apart from the author's bias.


Ummmm, when you care about I/O performance and you're building anything of substance (something more than what nginx can do on its own), why would you choose PHP? The security, fragility, and unmaintainability alone are enough reasons to avoid it.


I'm not promoting PHP's use, just pointing out how benchmarking against a years-old version of anything is pretty useless. Which is another indicator that the article was of low value and interest, providing no real, meaningful insight into those different languages.


The ease of use chart is a joke:

    Java: Requires callbacks
    nodejs: Requires Callbacks
Node has 'await', and there are other options as well beyond plain callback hell.
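
For example, with util.promisify (Node 8+) the callbacks mostly disappear; a rough sketch (the file name is made up):

    const util = require('util');
    const fs = require('fs');
    const readFile = util.promisify(fs.readFile);

    async function readConfig() {
      // await suspends this function, not the event loop
      const data = await readFile('config.json', 'utf8');
      return JSON.parse(data);
    }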

Java has pretty much whatever you want. Akka, futures/promises, etc.


>Before I get into the section for Go, it’s appropriate for me to disclose that I am a Go fanboy

Yes, it shows...


I visited the post fully expecting the author would benchmark the latest PHP 7.2. To my surprise, he wasn't even benchmarking 7.0, but the old 5.6. I can't take this test seriously.


Same applies to his Java related choices.


In terms of the article: pretty graphs and a good explanation at first, but the actual tests against these technologies are extremely flawed. Wouldn't recommend.


Agreed.


PHP 5.6.x and an old version of Node too - this article is too old to be of serious use.


The example of async I/O in Node.js is with a filesystem operation.

How does that work? AFAIK Linux, unlike Windows, doesn't have a proper async API for filesystem I/O.

POSIX AIO is implemented using threads in libc, and Linux AIO (io_submit, etc.) only works with O_DIRECT (no kernel caching), and there are other issues depending on the underlying filesystem driver.

Is there any solution?


https://nikhilm.github.io/uvbook/filesystem.html

> The libuv filesystem operations are different from socket operations. Socket operations use the non-blocking operations provided by the operating system. Filesystem operations use blocking functions internally, but invoke these functions in a thread pool and notify watchers registered with the event loop when application interaction is required.
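
You can see the pool from plain JS, too. With the default UV_THREADPOOL_SIZE of 4, only four of these reads run on worker threads at once and the rest queue behind them (rough illustration):

    const fs = require('fs');

    for (let i = 0; i < 8; i++) {
      // Each readFile is dispatched to libuv's worker pool; the callback
      // runs back on the event loop thread once the worker finishes.
      fs.readFile('/etc/hosts', (err, data) => {
        if (err) throw err;
        console.log(`read ${i} finished, ${data.length} bytes`);
      });
    }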


I'm not familiar with libuv, but is there a dedicated "watcher" thread in a thread pool that the kernel wakes up when the disk driver has completed the I/O it was sleeping on? If so, is there a dedicated "watcher" thread for each non-blocking I/O request handled by libuv?


There is no watcher: threads get parked by the kernel as "not runnable" while they perform the I/O operation.

When the operation is finished, they become runnable again and the kernel scheduler runs them--that is, it runs the code that comes after the I/O operation.

This code (from libuv) notifies the requester that the operation is complete.


Thanks. So is the libuv package then implemented as two distinct components, consisting of:

1) The main event loop thread.

2) A thread pool of "workers" which make the actual I/O requests and sleep while waiting for the I/O. Then, when a "worker" thread is set to runnable again by the kernel (because its I/O request has completed), it notifies the main event loop thread via a signal?

Is that correct?


That's correct, except the notification back is probably via some queue (think of futures or Go channels) rather than signals.


I see, that makes sense. Cheers.


I'm not familiar with it at that level.


Yes, a user-space thread pool, which is what Node.js does on top of libuv.


While it still suffers from the problem of benchmarks not reflecting real-world performance, The Computer Language Benchmarks Game is an excellent resource.

http://benchmarksgame.alioth.debian.org


> … not reflecting real world performance.

Can you show that this is true?


He's using the slowest and oldest framework for Java, which is still preferable if you want things easy to read and maintain. But if you're going for raw I/O performance and are already accepting the readability and maintenance challenges of the other platforms, then he really should be using Netty, Vert.x, or Akka for Java.


This post seems to be from about 7 months ago. Do the 'turbofan' changes of v8/v9 make any difference to the benchmarks?

https://www.nearform.com/blog/node-js-is-getting-a-new-v8-wi...


What a lousy article.

"Most Java web servers work by starting a new thread of execution for each request that comes in "

Nobody who has a performance critical application starts a new thread for every request.

That would be beyond stupid.


He's using Apache in prefork mode with mod_php, which is a bit "10 years ago". The Worker or Event MPM with php-fpm is standard now.


Only Java can share memory between threads.


Line for line, Go will be faster because it is compiled.

That is the only thing that matters, line for line.


Line for line benchmarks are what-the-fuck-are-you-even-trying-to-do level cringe.


No, it's not the only thing that matters. Java's compiled too and was included here, but Go still beat it out.

It's likely that C or Rust or C++ would be much faster than Java or Go, since those two have large runtime overheads; it turns out that having a runtime, and how heavy it is, matters too, not just whether it's compiled.

There are also cases where non-compiled languages can beat out compiled ones. For example, Lua with LuaJIT is incredibly fast, and beats out compiled languages with heavy runtimes (like Java/C#) in quite a few microbenchmarks.


Java is compiled to bytecode, which is then interpreted by a virtual machine; it's not compiled to native code. That's why you cannot distribute Java programs as an executable like Go.


Is there a way to bundle the JVM (or the parts you need) with the Java app you'd like to run? Erlang (and Elixir) does this with the Erlang VM (BEAM), allowing you to bundle VM+app as a release. The Erlang release must be compiled for the platforms on which that release will be run. It's not quite as elegant as a Go binary, but it's still pretty handy.


Yes, there is javapackager in Java 7 and 8, but with Jigsaw, which comes with Java 9, it becomes easier: https://steveperkins.com/using-java-9-modularization-to-ship...
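
Roughly like this with jlink (module and path names made up for illustration):

    jlink --module-path $JAVA_HOME/jmods:target/mods \
          --add-modules com.example.app \
          --launcher myapp=com.example.app/com.example.Main \
          --output target/myapp-image

The output directory contains a trimmed-down runtime plus a launcher script, so the end user doesn't need a JVM installed.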


Sure you can, https://www.excelsiorjet.com/

Java has a specification and lots of implementations.

The majority of commercial implementations of Java have native code compilers, and Oracle is in the process of improving the AOT compiler introduced in Java 9.


Java has a JIT that compiles to machine code at run time.



