Hacker News

Is the main advantage of Tornado that when you have a site open in the browser, it is less expensive for the site to push updates to the browser?



Tornado is a high-performance, non-blocking web server. Because it's non-blocking, it can handle thousands of simultaneous connections (see the C10k problem[1]) and is well suited to long-polling style applications (in fact, it includes a chat application[2] in the demos).

[1] http://en.wikipedia.org/wiki/C10k_problem [2] https://github.com/facebook/tornado/blob/master/demos/chat/c...


Cool. I think what threw me a little was the term "Non-blocking".

After some basic research, it seems that means communication between client and server happens asynchronously? Apache and other blocking web servers must see every file or stream through from start to finish, and have to create a new instance (or fork the process?) to serve another client in the meantime. Tornado does not require a connection with one client to complete in order to connect with a new client.
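The "blocking" model described above can be sketched in a few lines. This is illustrative, not Apache's actual internals: each client gets its own thread, because recv() blocks whichever thread calls it.

```python
import socket
import threading

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()

def handle(conn):
    data = conn.recv(4096)            # blocks this thread until data arrives
    conn.sendall(b"ok: " + data)
    conn.close()

def serve_n(n):
    for _ in range(n):
        conn, _ = listener.accept()   # blocks until the next client connects
        threading.Thread(target=handle, args=(conn,)).start()

server = threading.Thread(target=serve_n, args=(2,))
server.start()

# Two clients; neither blocks the other, but only because each costs a thread.
a = socket.create_connection(listener.getsockname())
b = socket.create_connection(listener.getsockname())
a.sendall(b"1")
b.sendall(b"2")
reply_a = a.recv(4096)
reply_b = b.recv(4096)
server.join()
```

The per-connection cost here is a whole thread (or, in the prefork model, a whole process), which is exactly what the non-blocking approach avoids.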


Traditional web servers like Apache 2 create processes to handle web applications developed in, say, Ruby or Python (e.g. Phusion Passenger, mod_wsgi). From what I understand (correct me if I'm wrong), these processes inherit large overhead from having to load large numbers of libraries, so memory usage is quite wasteful.


Shared libraries are just that: shared. They get memory-mapped, and the code is shared between processes. When a program has many threads open, something similar happens: the code is shared between the threads, but they get different stacks. Those stacks are a problem if you're spawning tens of thousands of threads, one for each connection you have open concurrently. You may find yourself running out of memory, or at least using way too much.

What Tornado does instead is serve a large number of connections from a single thread, using epoll to block on a group of file descriptors until one or more are ready for I/O. This has much lower memory overhead and avoids context switching among threads.

This kind of I/O model is at its best when you have a huge number of connections, idle most of the time, and there are very few CPU-intensive operations. Long polling usually fits that description.
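The single-threaded model described above can be sketched with Python's standard selectors module (which uses epoll on Linux). All names here are illustrative, not Tornado's API: one thread, one selector, many sockets.

```python
import selectors
import socket

sel = selectors.DefaultSelector()     # epoll on Linux

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ)

def pump(want_echoes):
    """Single-threaded loop: block on the whole set of sockets at once,
    then handle only the ones that are ready."""
    echoed = 0
    while echoed < want_echoes:
        for key, _ in sel.select(timeout=1):
            sock = key.fileobj
            if sock is listener:
                conn, _ = sock.accept()       # a new client is ready
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                data = sock.recv(4096)        # ready, so this won't block
                if data:
                    sock.sendall(b"echo: " + data)
                    echoed += 1
                else:
                    sel.unregister(sock)
                    sock.close()
    return echoed

# Two clients multiplexed by the one event-loop "thread" above.
a = socket.create_connection(listener.getsockname())
b = socket.create_connection(listener.getsockname())
a.sendall(b"hi")
b.sendall(b"yo")
pump(2)
reply_a = a.recv(4096)
reply_b = b.recv(4096)
```

An idle connection here costs only a registered file descriptor and a small amount of bookkeeping, not a stack, which is why this scales to tens of thousands of mostly-idle long-polling clients.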


IIRC, another thing that keeps processes tied up is database access.

I'm not sure, but I think that with Apache + Django you get something like: Apache gets a request and hands it to Django; Django asks MySQL for a value and waits, waits, waits (taking up lots of memory but not much CPU, and your host charges for RAM), waits a little more, gets the value from MySQL, puts it in a template, and hands the response back to Apache, which then sends it to the client.
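A hypothetical sketch of why the "waits, waits, waits" is so costly in the one-worker-per-request model, and how an event loop sidesteps it. Here asyncio.sleep stands in for waiting on MySQL, and the numbers are made up:

```python
import asyncio
import time

async def fake_query(delay):
    # While this "query" waits, the event loop is free to run other requests;
    # in the blocking model, a whole worker would sit idle (but resident) here.
    await asyncio.sleep(delay)
    return "row"

async def handle_requests(n):
    # One thread serves n requests; their database waits overlap.
    return await asyncio.gather(*(fake_query(0.2) for _ in range(n)))

start = time.monotonic()
rows = asyncio.run(handle_requests(10))
elapsed = time.monotonic() - start    # ~0.2s for all 10, not 10 * 0.2s
```

The total wall-clock time is roughly one query's latency rather than the sum of all of them, because the waits run concurrently on a single thread.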





