Given any long-running task, there are two natural things to do:
1. Do something while the long-running task is running
2. Do something after the long-running task has completed
Node.js' callback function convention simply makes it stupendously easy for you to write code for both cases, and leans towards making case (1) as natural as possible. Case (2) is naturally easy in Javascript (and even easier in Coffeescript) because of the way Javascript supports closures.
var application = function() {
  // do stuff
  database_call(options, function(err, result) { // javascript closure
    // case (2) logic: runs later, once the long-running task has completed
  });
  // case (1) logic: runs right away, while the long-running task is still in flight
};
How does Tora fare in this regard? Node.js is about parallelism through forking closures, not about solving long-running, CPU-intensive tasks.
The weak point of Node.js, in my opinion (and please correct me if I'm wrong; that would make me happy), is having to write a lot of boilerplate code for error and exception handling. Exception handling is particularly onerous because, depending on the function call, you'll never know which stack the exception will travel up. In the example above, you (probably) can't catch an exception from case (2) by wrapping the database_call in a try-catch block. Does anyone have a good solution to this?
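That is, something like the following (a sketch reusing the database_call example from above) doesn't behave the way the try-catch suggests:

try {
  database_call(options, function(err, result) {
    if (err) throw err;   // thrown later, from the event loop's stack...
  });
} catch (e) {
  // ...so it never lands here; exceptions thrown inside the callback
  // escape to the top level instead
  console.error('never reached for callback errors:', e);
}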
All in all, I like that Node.js tries to stay pure by enforcing the above convention. Noders who stay in "userland" can't shoot themselves in the foot by making an asynchronous procedure synchronous, which may lead to faster code overall.
To me, it seems the real weakness of node.js is having to manage asynchronous control flow yourself, at all. It grows exponentially more ugly as the system increases in size. I really like continuation-passing-style as a technique (especially as a compiler IR), but doing that stuff by hand is so 70s.
I got pretty far along writing a similar system in Lua (because, hey, event-driven systems are pretty nice), and unlike Javascript, Lua has well-implemented coroutines - they singlehandedly eliminate a LOT of the ugly control flow management. Still, most libraries are blocking*. I eventually decided that, for small systems, an event-loop framework wasn't necessary in Lua (it's easy enough to just do it from scratch!), and for larger systems, I was better off using Erlang, which has the ideal foundation for that sort of system. In particular, I'm really not clear how much error-handling node.js does; that was what convinced me that Erlang's approach is refreshingly sane. When I started writing process-supervision and hot update code in Lua, I realized I was just re-implementing what Erlang does best.
* Which is a funny objection, like being mad that most libraries default to using decimal numbers rather than octal. It must be a conspiracy!
> Still, most libraries are blocking*
> * Which is a funny objection, like being mad that most libraries default to using decimal numbers rather than octal. It must be a conspiracy!
Actually, there _is_ a distinction here: It is ridiculously easy to implement blocking semantics on top of non-blocking semantics - in some pseudo code:
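Something along these lines (a rough sketch; suspend(), wake() and current_task() stand in for whatever coroutine/fiber primitives the runtime provides):

function blocking_read(socket) {
  var result;
  var task = current_task();                  // the coroutine/fiber making this call
  nonblocking_read(socket, function(data) {   // the non-blocking primitive we already have
    result = data;
    wake(task);                               // resume the suspended caller
  });
  suspend(task);                              // park here until the callback has fired
  return result;                              // to the caller this looks like an ordinary blocking call
}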
Whereas implementing non-blocking semantics using an underlying blocking implementation is quite, but not entirely, like banging your head against the wall. Repeatedly. Essentially, you cannot avoid threads, shared mutable state, and a lot of other problems.
I'm looking for a language that can do everything Node.js does but also solves this control flow problem, is fast, and has support for OpenCL and beefy math libraries.
I don't find Erlang to be performant in this regard, so I'm looking at Scala but I'm constantly getting pissed.at.java.packages.and.conventions. I think it also suffers from the same problems as Node.js and doesn't support the async/control-flow magic that Haskell supposedly has.
Perhaps I should consider Haskell after all. It's that or C++.
I have some early FFI bindings for OpenCL (and OpenGL, glfw) with some simple demos. Not updated recently, but it's done using stock LuaJIT (take the latest) without writing any binding code at all (just FFI definitions, which LuaJIT understands).
You can't do callbacks (yet). That is, you can't rely on a C function to call you back in Lua land. That's why I chose glfw instead of glut or others. There is a way to set up your main loop without callbacks.
Perhaps you could try Python, numpy, Twisted, inline callbacks, and one of the Python OpenCL packages.
Inline callbacks are syntactic sugar included in recent versions of Twisted that makes your code look synchronous while still running on the event loop.
With Python you get to use a wealth of libraries, but you still need to be mindful of what is blocking.
When something you need to call is blocking, you can try deferToThread.
Don't worry, you won't need to think too much about race conditions and other concurrency issues with threading, since there is the GIL in Python. Also, Twisted makes it a cinch.
Haskell (or rather GHC) offers a blocking programming model (e.g. spawn a thread per connection and make blocking reads/writes on the socket) but uses asynchronous I/O in its implementation (one thread uses epoll/kqueue/poll to do the I/O and the CPU bound threads are scheduled on a thread pool).
Maybe, in a couple of years, Javascript (or any language like CoffeeScript that compiles to Javascript) will fit your needs, especially regarding the control flow problem.
> Having an officially guaranteed tail call mechanism makes it possible to compile control constructs like continuations, coroutines, threads, and actors. And, of course, it's useful for compiling source languages with tail calls!
Of course, there is a long way to go before one can use these features. So, for now, you are better off looking somewhere else for your specific needs.
We developed a coroutine-based event-driven networking lib in Lua (not open sourced yet, but it looks like that might change someday). One of the neat things is that we're able to run it on tiny ARM-based embedded hardware.
I think there are people who did similar stuff for classic PC hardware, and already open-sourced their work; you might want to look at nmap, among others.
The external API is a superset of Luasocket's, so the event loop can remain completely transparent for users if they choose so. Proper coroutine support really rocks!
Several commenters point out that long-running computations can be performed outside the main request thread without the server having to do anything special.
It's also possible to divide up long-running computations oneself. This can lead to very interesting designs.
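For instance, one simple shape is to do a bounded slice of work and then hand control back to the event loop before continuing (a sketch; setTimeout with a zero delay is just the most portable way to yield back to the loop):

function sumInChunks(numbers, done) {
  var total = 0, i = 0;
  (function chunk() {
    var end = Math.min(i + 10000, numbers.length);
    for (; i < end; i++) total += numbers[i];   // a bounded slice of CPU work
    if (i < numbers.length) {
      setTimeout(chunk, 0);                     // yield so queued I/O callbacks can run
    } else {
      done(total);                              // all slices finished
    }
  })();
}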
"Several commenters point out that long-running computations can be performed outside the main request thread without the server having to do anything special."
Then why am I using Node.js? Any language has been able to do that for over a decade now, without the other hoops Node.js's style forces you to jump through. As evil and bad as shared-state multithreading truly is, this is a (or possibly "the") task it can manage without blowing up.
This is all but an admission that Node.js actually has no concurrency story at all. Hardly surprising, since it doesn't, unless "We force the programmer to do all the concurrency work" counts.
"This can lead to very interesting designs."
Yes, in the "may you live in interesting times" sense of interesting, absolutely. If you're going to reduce yourself to manually scheduling everything yourself why boot an OS at all? (Yes, that's a bit of an exaggeration, but seriously, think about it for a bit, there's truth there. Runtimes/VMs/languages ought to be adding to the OS, not fundamentally subtracting from it.)
No one's saying that IPC is unique to Node. The OP's criticism was: async i/o is fine, but what if you have some CPU-intensive work to do? Isn't it bad to let that block the whole server? Of course it is. The OP's answer is a different server architecture; the Nodians' answer is to just forward that work to a different process and let Node keep doing the one thing it does well.
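As I understand it, that hand-off looks roughly like this (a sketch; worker.js and the message shape are made up, and a real version would correlate requests or use a pool of workers):

var http = require('http');
var fork = require('child_process').fork;

var worker = fork(__dirname + '/worker.js');   // the CPU-heavy work lives in this process

http.createServer(function(req, res) {
  worker.send({ n: 40 });                      // hand the job off
  worker.once('message', function(result) {    // the event loop stays free meanwhile
    res.end(JSON.stringify(result));
  });
}).listen(8000);

// worker.js, roughly (fib stands in for whatever CPU-bound function you have):
//   process.on('message', function(job) {
//     process.send(fib(job.n));               // blocking here only blocks the worker
//   });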
Why use Node? Isn't the reason that it lets you write server apps that don't block on i/o, in a high-level language?
From various comments over the last year or so, I gather that you're saying Erlang beats Node hands down at this. That may be true. Still, not everyone's going to use Erlang. What other alternatives are there? (Twisted, EventMachine, ...?)
I can see the attraction of Node's approach. First, it's conceptually simple. Second, yes, it shoves a bunch of things in your face and makes you deal with them - but they are precisely the things that make your program slow. Perhaps you want to deal with them. I can understand why someone would say: I want to manage my program's control flow explicitly so I know it won't block when it shouldn't; it makes some things annoying, but other things easier (at least I don't have to worry about other code interrupting mine); just don't make me write everything in C.
(At risk of being tedious, I'll add that I'm not being polemical. You, silentb and others know a lot more about it than I do. I'd like to get clearer about what the issues are. Also... I'm tempted to reply to the second part -- about writing programs that know how to divide up their computations, and whether this is greenspunning the OS -- but this is long. Maybe we should defer it.)
"What other alternatives are there? (Twisted, EventMachine, ...?)"
Lots, depending on what you're doing: Scala actors, Akka, STM in Haskell and Clojure, the GHC (lightweight) thread manager, F# asyncs and MailboxProcessor, Apple's GCD, Microsoft Message Queuing (MSMQ), I/O completion ports, ZeroMQ, RabbitMQ/AMQP.
(but you should spend some time looking at erlang)
"What other alternatives are there? (Twisted, EventMachine, ...?)"
For every major high-level language there is at least one Node-like library, and sometimes more than one. (Perl has POE and Event::Lib; in my experience the raw glib wrapper isn't half bad either, albeit perhaps not the fastest, and it gives you good access to anything else based on glib. In fact Perl has so many that there's an Any::Event wrapper to remove your dependence on the underlying event library!) My point isn't that Node.js is bad. I actually don't think it is.
My point is that the hype is bad. It's wrong to think it's bringing anything unique to the table, because it simply isn't doing anything that has not been done literally dozens of times, except that it's doing it in Javascript. If you want it done in Javascript instead of Python, more power to you. I'm particularly incensed by the idea that Node.js' approach to asynchronicity is the only way to do it, and by the number of people it has anti-educated into thinking Haskell and Erlang and all kinds of other languages can't possibly be asynchronous because you can't see the manually-chopped-up event handlers in the code. I'm not guessing. I've met these people online. You may know better, but a lot of people don't; whether or not it was intended, the hype is actually lying to people about the state of the programming world, comparing itself to the world of 1995.
I am also trying to speed up the education cycle that all of those other dozens of attempts have been through in which manually-compiled event-based programming inevitably explodes into unmaintainable complexity, and none of the dynamic languages, including Javascript, have the necessary constructs to truly contain it. Some of the dynamic languages are even more powerful than today's Javascript, such as Python with its generators (though ECMAScript is supposed to be getting those, I don't know if any browser has them yet) and it's still not enough. The structure of event-based programming demands such an explosion. Been here, done this.
You can see it already starting to poke out from under the hype, if you're watching carefully. This is going to get worse, not better (because there isn't a solution, just a variety of hacks long since tried and found to only slightly improve things at significant complexity cost themselves), and I'm actually trying to do the community a favor by deflating the balloon so it doesn't pop so hard.
(If you know Haskell, and you look at the implicit type signatures being put on things like callbacks, it becomes easy to see the problem. The clearest place to see it is a function that takes a callback for something, until one day you need to pass in a callback that itself has to go do something that requires a callback, and suddenly you've got a big problem. The usual callback in Node.js is actually just a relatively pure function; it is not in IO, which is handled behind the scenes for you. Then when you need to do something else, you've got some real problems. Solvable, yes, but at a fairly significant complexity cost, partially because any given issue can be addressed but you can't really address all of them simultaneously (simplicity, exception correctness, dealing with control flow across callbacks, etc.). Every time you write a callback or choose where to break a function up into a callback you're actually laying down far more restrictions on the code than you can easily see, but I don't go to this explanation very often because by the time you can understand it, it is also borderline obvious.)
What other alternatives are there? One, use Node.js with awareness of the issues. There are places where it is fine. I just would incredibly strongly anti-recommend it if you know you're going to be continuously developing whatever you're building in it, especially your core product, rather than writing "a proxy socket server for web sockets to conventional sockets" and being done at some point. The other alternative is to actually work in a language/runtime where you don't have to manually perform all this tedious work. There's a number of them coming out and one of them is probably going to go mainstream at some point; of the current lot Go would be my best guess. Google isn't pushing it, but it's still got Google's name on it, and I don't know of anything else right now with the equivalent name power. History suggests name power is necessary for a language to crack the mainstream in anything less than 15 years. It probably requires the least adaptation to a new style of the bunch, the other advantage it has from the mainstream point of view.
I friendlily (!) request less anti-hype and more explanation of the technical issues, preferably with illustrations in code. For example, in the above post there is one point at which you come close to being specific and then back off, saying it's borderline obvious. It wasn't obvious to me.
There is one point at which I somewhat follow you. You say that the complexity introduced by callback management grows nonlinearly with program complexity. I understand this to mean that logic organized into async callbacks isn't composable (you have to write new logic to implement the composition, as opposed to just applying some operator to combine them) and isn't orthogonal (if you want to call some code that's written this way, your code also has to be written this way, and each new layer gets harder to add). It's easy to see how this could rapidly get out of control. But I'm not convinced that it must. There may be designs that nip this complexity in the bud. For example, the work done by callbacks themselves could be kept to a minimum (and preferably be standardized, i.e. when i/o is received, store the result in some standard place). In this way the callback chains always return as quickly as possible. Of course then you need some parallel strategy for managing the control flow of the program itself - some sort of state machine, perhaps.
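Concretely, I'm imagining something like this (a rough sketch, error handling omitted; db.get, render and the step names are all made up): each callback only records its result and kicks the machine, so the control flow lives in one place.

// db.get and render are placeholders for your own async calls and output
var state = { step: 'start', user: null, posts: null };

function advance() {
  switch (state.step) {
    case 'start':
      db.get('user:1', function(err, user) {   // callback does the minimum:
        state.user = user;                     // store the result...
        state.step = 'user-loaded';            // ...record where we are...
        advance();                             // ...and return to the machine
      });
      break;
    case 'user-loaded':
      db.get('posts:' + state.user.id, function(err, posts) {
        state.posts = posts;
        state.step = 'posts-loaded';
        advance();
      });
      break;
    case 'posts-loaded':
      render(state.posts);                     // the actual work happens here
      break;
  }
}

advance();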
Perhaps this is greenspunning Erlang but if so I'd like to know how.
From my experience with Node.js, your interpretation is correct. I also share your optimism for finding a general solution to this problem, but that's exactly what jerf said isn't worth the complexity once you implement it.
Even if you do manage the callback complexity issue, there is still the issue of exception handling, which jerf also explains here: http://news.ycombinator.com/item?id=2150800
That said, jerf hasn't proven that trying is not a rite of passage. Sorry for the double negative.
Not a general solution. An app-specific solution. That is, I don't want a framework; I just want a consistent simple design for an individual app written in this style. That's a big difference.
That's actually quite a painful way to write async code because the control flow is now inside the functions. That makes it very hard to perform refactoring.
Might I suggest you take a look at my preferred solution for this,
I've actually been put off Node a little because I've not seen any examples using this style of code. I had (perhaps stupidly) assumed that the scope of a closure was necessary for a lot of Node functionality; generally speaking, are all the parameters required for interacting with Node passed via the function parameters?
Can anyone suggest a better architecture than that of node.js for a server side Javascript application engine?
Perhaps still using Google V8 but maybe being more intelligent about threading/multicore, maybe using something like gearman (gearman.org) to distribute tasks, and addressing some of the criticisms of node.js but still maintaining good performance.
Also, CPS is not the only option for dumping callbacks in browsers / node.js. Another would be the Functional Reactive style. See Flapjax: http://www.flapjax-lang.org/
It's got Joose under the hood and I'm generalizing all the library functions for n-ary EventStreams and Behaviors (Reactive concepts). It's very much a work in progress and the test coverage is non-existent atm, but that's owing to the fact that I'm working from an existing, working code base. As soon as I have all the core estream and behavior facilities in place, I'm planning to write some exhaustive tests that use JooseX.CPS together with the Joose3 author's Test.Run library: https://github.com/SamuraiJack/test.run
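To give a flavor of the reactive style in general (hypothetical names, not Flapjax's API or mine: a bare-bones EventStream you derive new streams from instead of nesting callbacks):

function EventStream() { this.listeners = []; }
EventStream.prototype.subscribe = function(fn) { this.listeners.push(fn); };
EventStream.prototype.emit = function(value) {
  this.listeners.forEach(function(fn) { fn(value); });
};
EventStream.prototype.map = function(f) {
  var out = new EventStream();
  this.subscribe(function(value) { out.emit(f(value)); });
  return out;
};

// derive a stream of URLs from a stream of requests
var requests = new EventStream();
requests.map(function(req) { return req.url; })
        .subscribe(function(url) { console.log('requested:', url); });

require('http').createServer(function(req, res) {
  requests.emit(req);   // feed the stream from an ordinary Node callback
  res.end('ok');
}).listen(8000);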
This reminds me of a very neat trick from Windows I/O completion ports. When you mix blocking and non-blocking tasks, you want more threads than the number of cores, so that the CPU is fully used even if a thread blocks. But when no thread blocks, overbooking introduces context-switch overhead. Windows' trick is to track the threads that got work from an I/O completion port and, when one blocks, wake up another thread waiting on the port. This way you can have 6 threads working an I/O completion port on your quad-core CPU and adapt both to mostly CPU-intensive work and to blocking work.
It looks like the compute situation in Node.js could be helped with a server-side web workers implementation, with the workers doing no I/O and only computation. Any thoughts?
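Something like this, say (a sketch assuming a hypothetical server-side Worker with the browser's postMessage/onmessage API; Node doesn't ship one today, and count-primes.js and countPrimes are made up):

var worker = new Worker('count-primes.js');   // compute only, no I/O inside

worker.onmessage = function(event) {          // the result arrives back on the event loop
  console.log('primes below 10^7:', event.data);
};
worker.postMessage(10000000);                 // kick off the CPU-bound job

// count-primes.js, roughly:
//   onmessage = function(event) { postMessage(countPrimes(event.data)); };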