An Easy Way to Build Scalable Network Programs

_3u10 · on Oct 5, 2011

I'm completely unsure what quality of javascript makes it suitable for writing high performance systems.

Is it single threading? Is it the weird typeof crap you have to do to check if a variable is defined? Is it the lack of integers? Is it the prototyping system?

Node.js to me looks like a slightly better syntax than the horribly ugly C# async calls. (Not the new async/wait system). Javascript completely pales in comparison to F# or Haskell in terms of readability of async code.

If you prefer non-functional languages it would seem that Go would be a much better place to start for performance than Javascript. Or clojure or scala.

Sure node.js outperforms rails, but rails isn't designed around being the fastest webserver ever.

awj · on Oct 4, 2011

...it's not the recent blog posts that are the problem, those are a response to hype huffers in the community spouting scalability nonsense. Having vague wording in the front matter for the project's main page doesn't help.

Many of the people currently "bashing" node are more or less aware of its capabilities. They aren't mad at the technology itself, just that it's being sold as much more than it is.

baudehlo · on Oct 5, 2011

The only confusion over the wording on the front page seems to be over: "Almost no function in Node directly performs I/O, so the process never blocks". Which can be correctly read as: "Almost no function in Node directly performs I/O, so the process never blocks on I/O". But my English teachers would have told me that's too much repetition and not necessary.

awj · on Oct 5, 2011

If your goal is to appeal to "less-than-expert" programmers, this is exactly the kind of point where you want to be crystal clear. Blocking on CPU is the kryptonite of evented servers. If someone is new to them you probably cannot reiterate that point often enough.

baudehlo · on Oct 5, 2011

On the front page? Really? I'd expect that on a more detailed tutorial, but to get people interested in a technology, I completely disagree.

You don't see Apple plastering "ONLY AVAILABLE IN THE USA" on their pages about iCloud music storage for exactly this reason - you want to get people to say "Hmm, that sounds cool" before introducing them to the caveats.

ismarc · on Oct 4, 2011

I've had to write a few network based applications that each had their own unique performance requirements. Node.js would not have been a fit for any of them. I'd really like to give it a go, but it seems that every scenario where it would be useful is better served by a more specific environment. Granted, I'm pretty lousy at javascript, so getting to use javascript on the server doesn't count, what is Node.js' ideally suited for?

kylemathews · on Oct 5, 2011

Your comment would be a lot more interesting if you add what those network-based applications are that you think Node.js wouldn't be a good fit for.

ismarc · on Oct 5, 2011

Well, my comment got eaten, so here's the horribly summarized variant.

The captive portal, DHCP, DNS, firewall/access management, VLAN tag/trunk management, route management, VPN tunnel management, SNMP Poller, SNMP server, network device controller services running on the internet access gateway for sports stadiums on commodity hardware running Linux. Or those same services running on a resource constrained system with significantly lower traffic.

This is obviously a scenario where Node.js is infeasible (web based portions had to have C bindings for anything outside of presentation layer and all parts of the systems required significant profiling and optimizations to meet speed and resource usage requirements). However, I've had several applications recently where Node.js could have been a possibility, but the documentation and overviews for it are either so beginner that you can't determine what it's capable of or is exploring areas where there isn't precedent for what it can do. I couldn't find a straightforward list of supported features and the ideal use case for it.

After expecting a more detailed article covering ideal usage of Node.js, and that not being it, I figure I would finally break down and ask "What is Node.js' ideal usage? Under what scenario is it an ideal tool?" It's not a knock against Node.js at all, maybe a slight one against the documentation, but not the technology or its use.

gord · on Oct 5, 2011

I only know networking basics, so not sure about your other use cases, but I was pleasantly surprised how useful something like 'http-proxy' for node is.

I use it to serve up several hosts on the same ip, or split sections of my url namespace into several node.js servers.

Maybe look at NodeJitsu blog as the articles seem readable, and those guys seem to use node heavily for running their hosting platform.

baudehlo · on Oct 5, 2011

And yet you describe something which sounds ideally suited to Node (or any evented system) and call it infeasible. Node actually has pretty nice bindings to C (well, really C++), and I see nothing in your post that it would be incapable of.

If it was because you don't know (and didn't want to learn) Javascript I could understand, but in terms of your comment above it seems like you just didn't look hard enough.

catch23 · on Oct 5, 2011

not sure why you're being downvoted here, but I do agree on the c++ binding infrastructure. I think the things he listed are quite possible in node -- I can't see any of those things he listed that might be cpu bound.

A fairly popular small DNS system was written in node for the purposes of development. Anyone who uses "pow" knows what I'm talking about: http://pow.cx

IsaacSchlueter · on Oct 5, 2011

Agreed. If you have a network program that isn't IO bound, then in what meaningful way is it a "network program"?

sausagefeet · on Oct 5, 2011

Scalable web apps, maybe, but "network programs", I disagree. Just spawning a bunch of Node instances is insufficient to really scale in many network apps, you also need a good way to communicate between them. Preferably one that hides the fact that you are communicating between separate machines. For the most part, web apps can get by with pushing this to the DB but I think it's a bit much to say this is acceptable for all network programs.

CyruzDraxs · on Oct 5, 2011

https://github.com/hookio/hook.io

Transparent enough?

sausagefeet · on Oct 5, 2011

Not quite what I had in mind, but this is probably sufficient enough for me to eat my words.

kqueue · on Oct 4, 2011

No that's not the proper way of doing it. You create worker threads in separate process that receive data from node.js, encodes them and send them back. you don't fork on every request.

mikeryan · on Oct 5, 2011

Where is he suggesting forking on every request? My read sounds like he's suggeting the main process handles web requests and a single other process to handle encoding?

The suggested approach is to separate the I/O bound task of receiving uploads and serving downloads from the compute bound task of video encoding.

I'm assuming by using something like child_process.fork to create a video encode queue separate from the main event loop.

http://nodejs.org/docs/v0.5.4/api/child_processes.html#child...

_3u10 · on Oct 5, 2011

Once your doing that why not just use threads and get rid of the IPC?

That's kind of the point of the Node.JS bashing, once you work around all it's pitfalls you're right back where you started except your now writing your app in a language unsuited for the purpose.

Node solves the problem of needing to write evented servers in javascript. Beyond that I can't see much advantage in it vs. existing languages. If I wrote something called "Node.NET" which was a JScript wrapper around completion ports and went around telling everyone that this was the future of webdev... what do you think the reaction would be?

mjijackson · on Oct 5, 2011

He's not suggesting forking. He's suggesting spinning up a new process entirely (ffmpeg in this case).

wmf · on Oct 5, 2011

fork+exec is even more expensive than fork. (Although in the case of video encoding the overhead is negligible. Problems with Node.js are more likely to be seen when an event sometime takes 100-1000 ms; it's slow enough to hurt response times of other requests but maybe not worth farming out to another process.)

vogonj · on Oct 5, 2011

now writing a high-throughput server is simply a matter of writing a high-throughput worker to generate fibonacci numbers and connecting it reliably through interprocess communication to your node.js shim.

thank god node.js was there to save me all that work.

exogen · on Oct 4, 2011

I think a lot of people would find this comment more helpful if you explained why.

kqueue · on Oct 4, 2011

1. Process creation and termination are heavy operations considering that it's being done in a tight loop.

2. You don't want to fork on every request, this is very vulnerable to fork bombs.

3. In worker threads model, you already have the work threads spawned and ready for crunching, which will reduce the system load because you aren't forking on every request.

exogen · on Oct 4, 2011

1 & 3 might be conventional wisdom, but are you absolutely certain they're still true on modern platforms? http://blog.extracheese.org/2008/05/processes-spawn-faster-t...

(I'm not saying they aren't or that I know the answer.)

tmurray · on Oct 5, 2011

From an OS POV both of those tests are doing almost exactly the same thing: make a system call, have the kernel spawn a new unit of execution (whether it's a thread or a process), wait for the child to be scheduled, have it terminate, wait for the operating system to notify the parent process of termination. There's a little extra bookkeeping in this example for the child process, but not much at all thanks to copy on write. If you were to do something dumb from the child work like memsetting a 1MB buffer to 1, I assume the child process would be far slower due to page faults.

There are certainly advantages and disadvantages to both threads and processes, but it's not really a fair comparison to claim that processes are as fast or faster than threads because you can spawn them at a certain rate. The performance cost of separate processes is something you pay gradually, every time you have to take a page fault and copy 4KB.

jconley · on Oct 5, 2011

The post you link to spawns a new thread and tests that speed. When you have a pool it takes roughly zero ms to tell it to do something. All you have is whatever locking is required to synchronize between the enqueueing (request processor) and worker threads, which may even just be a volatile.

Then again, as someone else said, the amount of time it takes to spawn a process to encode a video relative to the amount of time it takes to do the encode is probably trivial, and you would benefit from having the process isolation in case something goes wonky.

You would probably write some sort of "process gate". Though in a distributed architecture I'd do this with some sort of distributed work queue.... I did this in .NET many years ago for a similar service: http://blog.jdconley.com/2007/09/asyncify-your-code.html

scott_s · on Oct 5, 2011

I suspect those numbers have nothing to do with the cost of the system creating a thread or a process, but instead are artifacts of how Python handles threads.

When Python forks a process using the multiprocess module, that process can execute concurrently with the parent process. On a mutlicore machine, it can be simultaneous.

When Python spawns a thread, the thread and the parent process cannot execute concurrently. They need to grab the Global Interpreter Lock (GIL). Whoever has it can execute. Whoever does not must wait.

So, I suspect that what we are seeing is that even though the new processes/threads have very little work, the processes can exit faster because they don't have to wait for the parent process to give up the GIL. This is a misguided experiment.

exogen · on Oct 5, 2011

Yes, I suspect that you're right; I also thought it might be due to the GIL.

I reran this test with Python 2.7 and it no longer appears to be true:

  Spawning 100 children with Thread took 0.03s
  Spawning 100 children with Process took 0.28s

I'm not sure to what extent the GIL was improved in 2.7, but it's possible that it was never the cause to begin with.

Regardless, I don't think it's a misguided experiment – it was an objective observation. It shows that things aren't so black and white depending on your toolchain.

scott_s · on Oct 5, 2011

I think the experiment is misguided for several reasons. One, process/thread creation time is negligible. The general approach is to create worker threads/processes that live for the lifetime of the program. Then you farm work out to them as needed. This separates the concept of "doing work" from their actual execution.

Two, threads don't buy you parallelism in Python, unless the majority of the work is being done in C modules.

Finally, this test is really just testing the multiprocess and thread packages provided by Python. I say this is misguided because the way the author talks about it, I don't think he understands that the difference between those abstractions and OS threads and processes. (Which, of course, are an abstraction as well.) I suspect the Python overhead will be more than the difference in cost between forking OS-level threads and processes.

humbledrone · on Oct 4, 2011

It seems to me that given the task of video encoding, the overhead of a fork is probably going to be trivial. I do agree with point #2, though.

maratd · on Oct 4, 2011

In coming releases we’ll make it even easier: just pass --balance on the command line and Node will manage the cluster of processes.

Internet trolls do improve software!

CPlatypus · on Oct 5, 2011

I'm going to repeat what I said on Twitter when this first came up - dozba's computationally-intensive-task example doesn't really illustrate the problem. Even if the computation is buried somewhere in a library, you can more or less predict when it's going to happen and make sure it happens in a separate thread/process. The real hurt comes when your single-threaded server takes a page fault. That's nowhere near so predictable or easily solved, and it still results in your entire application stalling. Requests on other connections, which never needed to get anywhere near the page that caused the fault and which could have continued in a better design, will get caught in the stall. That's just as true and just as lame as it was almost a decade when I (e.g. http://pl.atyp.us/wordpress/?page_id=1277) and plenty of others were writing about exactly these issues. Single-threaded servers are only appropriate for workloads where requests are trivially partitionable. In other cases you can still use events and asynchrony but you should do it in a framework that is inherently multi-threaded to take advantage of multiple processors/cores.

wmf · on Oct 5, 2011

Some of us never tire of this topic. :-)

To work properly, it seems like an event-driven program really needs to use mlockall() and hopefully get memory pressure feedback from the kernel.

baudehlo · on Oct 5, 2011

It's true, a seg/page/fault would take down the whole server, and any requests executing on that process would die too. However in fairness, this is one of the advantages of using a dynamic language, rather than one where you're dealing with memory allocation all the time.

cdavid · on Oct 5, 2011

I think you are confused about what a page fault is: that's what happens when the memory manager of your OS needs to load a page from the disk ("swap"). Dynamic language does not help you at all for that. It does not help you with segfault either BTW, because typing does not have much to do with memory management. Many statically typed languages cannot have segfaults either, from java to haskell.