Building A Node.js Server That Won't Melt (hacks.mozilla.org)
200 points by lloydhilaiel on Jan 15, 2013 | 48 comments



This is definitely neat; it's a simple, clever way to fail somewhat-gracefully.

Interestingly, this is one piece of a broader strategy for building scalable web applications, addressed in Matt Welsh's thesis (SEDA):

http://www.eecs.harvard.edu/~mdw/proj/seda/

If you haven't taken a look at the ideas in SEDA, it's definitely worth the time. Most modern web apps incorporate at least some pieces of it to great effect.


That's a very good technique! I did not even know about the event loop lag until now... It feels good to get a daily education from Hacker News!


  // check if we're toobusy() - note, this call is extremely fast, and returns
  // state that is calculated asynchronously.  
  if (toobusy()) res.send(503, "I'm busy right now, sorry.");
Saying it is calculated "asynchronously" seems pretty confusing in this context. Maybe "is cached at a fixed interval"?


Side note: caching at a fixed interval is a great way to save CPU cycles. If you're serving 100req/s and you had to grab the current time within 1sec accuracy for them, you could have 1/100th the amount of Date calls by getting the time each second instead of each request.
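
A minimal sketch of that idea in Node (the names and the 1-second interval here are just illustrative):

  // refresh a cached timestamp once per second instead of calling
  // Date.now() on every request
  var cachedNow = Date.now();
  setInterval(function () {
    cachedNow = Date.now();
  }, 1000);

  // in a hot request handler, read cachedNow instead of calling Date.now()
Of course the cached value itself goes stale if the event loop lags, but that's within the 1-second accuracy you signed up for.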


I agree. /me fixes


Nope, it is literally calculated asynchronously. Look at https://github.com/lloyd/node-toobusy/blob/master/toobusy.cc... and lines 91/92 which sets up the background timer.
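
For anyone curious, the gist in plain JS (toobusy does this in native code with a libuv timer; the interval, threshold, and names below are illustrative rather than the module's actual values):

  var INTERVAL = 500;   // how often the timer is expected to fire (ms)
  var MAX_LAG = 70;     // lag beyond which we consider ourselves too busy (ms)
  var currentLag = 0;

  var last = Date.now();
  setInterval(function () {
    var now = Date.now();
    // if the event loop is backed up, this callback fires late;
    // the extra delay beyond INTERVAL is the event loop lag
    currentLag = Math.max(0, now - last - INTERVAL);
    last = now;
  }, INTERVAL);

  function tooBusyIsh() {
    return currentLag > MAX_LAG;
  }
The toobusy() call itself just reads that precomputed state, which is why it's so cheap.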


Why would you do this, instead of putting your node server behind a more battle-tested proxy, like Varnish or HAProxy, and limiting the number of simultaneous connections?


Do both: each layer and service in an application should be able to manage its own load. Helps prevent cascading failures too.

Read up on Netflix's architecture, where they've talked about this a lot. And the book "Release It!: Design and Deploy Production-Ready Software" (Nygard), though Java-oriented in its examples, covers the concepts well.


Our problem is that the number of simultaneous connections we can support isn't static; it varies depending on the type of traffic burst (e.g. new vs. returning users).

But I agree that a higher-level proxy is useful - that way your overloaded application doesn't even need to deal with the traffic (we hope to use toobusy to have applications instruct the routing layer to temporarily block / reroute traffic in times of load).


The reason I asked is that I feel most applications end up needing something at that layer eventually anyway, but not knowing exactly what level of traffic results in too much load is a good reason to do this.


It seems to me that letting the computer calculate when it's in trouble is preferable to having a human guess.


I think this would simplify implementing a load-balancing proxy in some ways. If a server responds with a 503 generated by node-toobusy, the load balancer will know immediately to reroute the request to another server, rather than having to wait for a timeout or some other threshold.


So this load balancer is balancing between A & B. Now A gives back a few 503s, and you're going to send everything over to B?

[edit] I do get your point, I was just saying: don't start implementing a naive load-balancer :-)


You'd do it because it's

  npm install toobusy 
then somewhere in your project

  require("toobusy");
If you only need one server, that's going to be up and running a lot faster than putting anything in front of it.
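
To be fair, you also have to wire it into your request path; the usual pattern is Connect/Express middleware, roughly like this (the res.send(status, body) form matches the snippet quoted upthread):

  var toobusy = require('toobusy'),
      express = require('express');

  var app = express();

  // shed load before doing any real work when the event loop is lagging
  app.use(function (req, res, next) {
    if (toobusy()) {
      res.send(503, "I'm busy right now, sorry.");
    } else {
      next();
    }
  });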


If you only need one server, that's going to be up and running a lot faster than putting anything in front of it.

A single server setup might be simpler, but it won't be faster. Varnish serving a cached page from memory is going to be faster than 5 asynchronous calls that take 5ms of CPU time (and filesystem I/O in the case of the database and template given in the example). With varnish, even with a 1 second TTL (and 1 second grace), your first request will take the 5ms hit, but the next 199 for that second will be served from memory.

Now with Varnish serving 199 out of 200 requests from memory, if your backend is still toobusy, by all means serve a 503, and Varnish can cache that too.


You have a problem, and so you try to solve it with caching. Now you have two problems.

I think that's how that quote goes.


I think of caching for apps like clothes for people...while you could survive naked, it's more comfortable with clothes, and you're protected from heat, cold, etc. Of course there is the problem of being over/under dressed, but you have to wear something.


That's super unnecessary for a single server; you can literally just use a variable outside of your request handling as a cache.

  var cache;

  module.exports = function (request, response) {
    if (cache) {
      return response.end(cache);
    }

    // get my data from wherever (placeholder value here)
    var the_data = 'my data';
    cache = the_data;

    return response.end(cache);
  };
Now 199 out of 200 requests are from memory, there are zero extra moving parts, and you're using a cool part of the language instead of a 3rd party tool you have to select and configure.


How do you selectively serve the cached response to some users based on cookies, header, etc., and expire it after a set amount of time? How would you gather stats about how many cache hits vs misses you have? How do you serve the cached response when your server is pinned? These are problems 3rd party tools have solved.


You can expire things with setInterval and a counter. You serve a cached response whenever you have one, or, if you're using it as a fallback, you combine it with something like toobusy so you have a functional (if not fresh) 'under load' page.

Where it gets really fun with Node.js is that you can do all of your data work outside of the requests: pull all of your content out at the start and refresh it on an interval, independently of the users, which can eliminate some or all of their trips to the database if you're lucky and it all fits in RAM.

The fewer moving parts that need to cooperate to serve your site, the better - most things don't warrant a deep stack of technology to serve HTML and run CRUD operations.
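
A rough sketch of that refresh-on-an-interval approach (fetchContent is a stand-in for whatever actually loads or renders your content):

  var cache = null;

  // refresh the cached content once a second, independently of any request
  function refresh() {
    fetchContent(function (err, data) {
      if (!err) cache = data;   // on error, keep serving the stale copy
    });
  }
  refresh();
  setInterval(refresh, 1000);

  module.exports = function (request, response) {
    if (cache) return response.end(cache);
    // nothing cached yet (e.g. right after startup)
    response.writeHead(503);
    response.end('warming up, try again shortly');
  };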


And what if your whole site is dynamic? Varnish doesn't do anything then.


Very few sites are so dynamic that something can't be cached for at least a second. Even a heavily dynamic site like HN with people commenting all the time could at least cache for logged out users.


I'd prefer that solution too, especially since such proxies can mitigate some types of DoS attacks (e.g. known patterns in the "Referer" header, or from known IP address ranges) and load will generally be much lower if some of the popular content is cacheable.


Because that is a lot of work, and this is a drop-in solution for any node application?


If you are writing server software, I recommend Michael Nygard's book "Release It!: Design and Deploy Production-Ready Software". Measuring event loop lag sounds like Nygard's "Circuit Breaker" pattern to avoid cascading failures.

The book's examples and text are all Java, but the lessons are applicable anywhere. He offers many scalability patterns (resource pools) and anti-patterns (runaway log files) with interesting stories from his experience debugging real systems. I especially liked his story about debugging a crash in an Oracle DB driver that caused unexpected Java exceptions to be thrown from java.sql.Statement.close(), which quickly blocked a DB connection pool.

http://michaelnygard.com/
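
For the curious, the circuit breaker idea in miniature (names and thresholds here are made up, and Nygard's version also tracks open/half-open/closed states and puts timeouts on the protected call):

  // stop calling a failing dependency for a while instead of letting
  // every request queue up behind it
  function circuitBreaker(fn, maxFailures, resetMs) {
    var failures = 0;
    var lastFailure = 0;

    return function (callback) {
      var open = failures >= maxFailures && (Date.now() - lastFailure) < resetMs;
      if (open) return callback(new Error('circuit open, failing fast'));

      fn(function (err, result) {
        if (err) {
          failures += 1;
          lastFailure = Date.now();
        } else {
          failures = 0;   // a success closes the circuit again
        }
        callback(err, result);
      });
    };
  }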


This is great to see! IMO it's something that should be configurable and built into http in node.js.

IIS [1] and other mature web servers like nginx [2] or apache [3] do this kind of thing for you and provide simple configuration on it.

[1] http://support.microsoft.com/kb/943891 [2] http://serverfault.com/questions/412323/nginx-503-error-in-h... [3] http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequ...


Apache's MaxRequestsPerChild directive is not at all about limiting load; it says after serving N requests (or N connections in a keep-alive setting), the worker kills itself and another may be spawned in its place (subject to spare server/thread config). This mostly helps keep slow memory (or other resource) leaks in check by starting fresh every so often.

Did you mean to link to MaxClients[1]? MaxClients sets the maximum number of simultaneous connections; any additional connections will be queued by the OS socket api (subject to listen backlog, etc).

I think waiting to accept sockets that you can't handle is a better solution than either accepting a socket just to return an error message, or (much worse) accepting a socket that overloads your system. Unfortunately, it can be hard to set MaxClients to the right value: not so big that you get reduced throughput, and not so small that you don't use all your resources. (One thing that helps you get to the right number is to set MinSpareServers to the same value as MaxClients; that way you avoid the situation where MaxClients is too big and you start swapping during high load, but you don't notice it because things are fine with a small number of servers.)

Sorry, that was way too much information.

[1] http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxclie...


Node itself is a bit low level for this functionality, but it could be built into frameworks like Express.


This is a neat little technique, and the fact that the developers did this puts node.js higher in my stack of "technologies to consider".

A harder problem is solving the same "melt-out" issue for the entire gamut of servers usually found in large serving stacks (various web servers, relational and NoSQL DB servers, etc.). The users are usually not the developers of these pieces of technology, and there's never enough dev bandwidth available to actually implement self-tuning techniques like the one mentioned here.

For such situations I once came up with a "little" trick to limit the damage in sudden overload situations. It allows the admins of said servers to tune the thread pool and request queue lengths in an intelligent way (with the help of historical performance data that most shops should have). I mentioned it in this discussion thread here: http://www.linkedin.com/groups/Whats-generally-used-methodol...


Forgive my ignorance, but I thought the whole point of Node.js was not to melt down.

Are there some recommended resources where I could learn more about out-of-the-box scaling vs. the tactics and strategies that benefit growth? I love learning how this problem is tackled anywhere. :)


There's only so much load any piece of software running on a given and finite set of resources can handle properly... Once limits are reached, things will misbehave.


I agree completely; I have lots of experience in complex hosting.

All software has its limits (assuming hardware is relative), and there's a point at which you need to start tweaking it, and a question of which direction to tweak in. I was hoping to read more about that.


I've been looking for a solution for this; every time I used 'nodemon' it couldn't handle the load, even when it was quite insignificant... Perhaps I had misconfigured something, but I'm pleased to find this. Good timing!


Instead of returning a 503 page, it could attach a ticket (via cookies) and say: "Your request will be handled in 55 seconds, be patient." The time estimate would be based on statistics from the last 10 minutes or 1 hour...


The problem is not the lag before the request is handled; it's the lag before the user finds out that the request will not be handled at all. Underlying this is the fact that the server simply does not have enough capacity to handle all requests, so some must fail.

The trick of using event loop lag is pretty neat, but the general strategy is IMHO a must-have for any service. By aborting early and somewhat gracefully, the server lets the user know ASAP, returns a controlled message suggesting she not retry immediately, and avoids doing part of the work for a request that is never going to complete.


This is pretty cool. It seems like 10 seconds is still a pretty long time to wait for a response when connections over the max limit are being dropped. Is there a reason for this or some way to tune it?


Using event loop lag is pretty clever. I think I first learned about event loop lag the hard way when trying to do DOM animations that relied on setTimeout intervals without checking the current time.
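
The usual fix is to derive progress from the clock rather than from counting ticks, so a lagging loop drops frames instead of stretching the animation out. A generic sketch (element here is a stand-in for whatever DOM node you're animating):

  var DURATION = 1000;              // animation length in ms
  var start = Date.now();

  var timer = setInterval(function () {
    // compute progress from elapsed wall-clock time, not from how many
    // times this callback has fired, so late callbacks don't slow things down
    var progress = Math.min(1, (Date.now() - start) / DURATION);
    element.style.left = (progress * 200) + 'px';
    if (progress === 1) clearInterval(timer);
  }, 16);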


It'd be interesting to experiment with pairing this with a proof-of-work system (e.g. Hashcash) to smooth the transition between toobusy and !toobusy. With an additional signalling mechanism, it could help with the surge against peers in a cluster when one server begins to struggle.

Though you'd need a custom client, so maybe for appcache'd webapps or mobile apps, but not typical webapps. Or browser vendor participation (not serious (ok, half serious)).


And what happens when there are too many requests to handle even this way?


Great tip! Can you explain how you figured out the 200 request/sec limit?


That was specifically for the example server - if each request uses 5ms of processor time, (1000 ms / s) / (5 ms / req) == (200 req / s)

https://gist.github.com/4532177#file-application_server-js-L...


But will it blend?



A joke, I believe: melt vs. blend


i thought node.js was fast as hell bad ass rock star tech that would never melt.

http://www.youtube.com/watch?v=bzkRVzciAZg


well it's still pretty bad ass and rock star that you can measure server load by looking at "event loop lag", that's actually really neat... how do you do that with threads?


"how do you do that with threads?"

how should I know? :)


by using system load averages



