Blazing fast node.js: performance tips from LinkedIn (linkedin.com)
95 points by diwank on June 24, 2012 | 55 comments



Meh, this post was pretty unsatisfying.

Should anyone be shocked that you shouldn't do synchronous calls in an asynchronous framework? Or that you should use nginx to serve your static content and not Node? Requesting remote services in parallel is a no brainer. What does blazing fast really mean here? Using your framework correctly just sounds like "normal fast".


A good checklist of important items for optimizing performance should list the important items. That's what she did. Each reader can then skip over the items he already knows. It wouldn't be a better checklist if it left out important items on the grounds that some readers might already know some of them.


Well, she failed to mention "make sure computer is plugged in". Please, it's 2012; if you are using an asynchronous framework, doing things asynchronously should hardly be a surprise. And serving static content via a CDN or Nginx has knowledge order than dirt. This information would be fine if I were reading a college student's blog, but this is LinkedIn; I expect something a little bit more. Maybe the next blog post will be on how to store salted passwords?


What does, "knowledge order than dirt" mean, exactly?

Are these types of errors a symptom of touch screen input, or apathy? I have noticed the occurrence has increased greatly over the last few months. I feel like a pedantic asshole, but it greatly impairs my ability to read comments fast, and it gives me the impression that code written by the same people would confuse me as a beginner/ hobbyist.


It's a typo for "older".


It's just a typo. These are known to happen and predate touch screens.


> Maybe the next blog post will be on how to store salted passwords?

Given the number of hacked password databases that weren't hashed and salted, this might actually be useful.


Yes, and LinkedIn was one of those databases. Hence my joke.


Node is currently v0.6.19--it's in its infancy and people will find some of this information useful. If you don't find it useful, don't upvote it. Comment if you disagree with her points.

For what it's worth I'm working on a project that uses Node.js and took something away from this article.


> If you don't find it useful, don't upvote it

I didn't

> Comment if you disagree with her points

That would be what you're replying to now...


"We worked hard to make our mobile apps fast. Try them out and let us know how we did: we have an iPhone app, Android app and an HTML5 mobile version."

Yay, they made loading fast, apparently. But any user of the app will notice that the LinkedIn app feels slow and clunky. Great, shit loaded. But it still feels slow. For example, take an iPad and go to Flipboard. Rotate the device. Notice how fast and smooth it is. Now go to the news feed in the iOS LinkedIn app and rotate. It's slow. It feels like a web browser. It feels like they spent so much time optimizing the loading of resources and spent ZERO time actually making the client feel nice and native.


Compared with the web-based apps we had in the past, the LinkedIn one is a great step forward. The main point, in my view, is not that web view apps are slow; things will surely improve, and web view performance will increase over time, reducing the performance impact in most apps. The main point that should be discussed is why these guys insist on using HTML and JS. People seem to forget where HTML comes from: it was a way to lay out a page on screen. HTML acquired "desktop application like" capabilities in order to provide some desktop features to internet-based apps; in the end the clear winner was HTML5+JS+CSS, while Java applets failed and Flash is in decline. Looking at the efforts of the LinkedIn engineers, you can appreciate the hard work they did and the interesting way they faced several challenges, but what surprises me is that they didn't follow the Obj-C approach, which would have let them reach better results in a more natural way. In my view HTML should be used only where it is superior to Obj-C: publishing. It offers many more capabilities than PDF at lower cost. All other usages are wrong. Someone may answer that LinkedIn preferred the HTML-based approach to leverage their investment in the web site, but I would object that: 1. HTML5 is the only choice for the web on computers, which are not mobile devices, but it's a poor choice for mobile devices that have a native development environment. 2. The effort they spent fixing all the web view issues could have gone into a better API for native apps. And yes, if you compare their app to Flipboard you notice the difference.


I like how the 10th tip is completely opposite of the 1st one.

2 is just a bad tip. Socket pooling is a useful feature. Opening up more connections than you (or the server) can handle will lead to poor performance.
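To make that concrete, here is a minimal sketch of capping the pool instead of switching it off, using Node's built-in `http.Agent`. The limit of 10 and the names `pooledAgent` and `requestOptions` are illustrative assumptions, not values from the article, and the `keepAlive` option belongs to Node releases later than the 0.6 line discussed here.

```javascript
const http = require('http');

// Reuse connections and cap concurrent sockets per host instead of disabling pooling.
// The limit of 10 is an arbitrary illustrative value; pick one from measurements.
const pooledAgent = new http.Agent({ keepAlive: true, maxSockets: 10 });

// Build request options that route through the shared agent.
function requestOptions(host, path) {
  return { host, path, agent: pooledAgent };
}
```

Passing the result of `requestOptions('example.com', '/')` to `http.get` would queue requests beyond the cap rather than opening more sockets than the remote server can handle.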

3 and 5 aren't Node.js-specific and you're better off reading https://developers.google.com/speed/ to learn the fine details.

Instead of following 1, 8 and 9 you should learn how to profile your application so you can find the true bottlenecks of your app.

4 and 7 are just weird. Going session-free or switching to client-side architecture are both big tasks; I'd love to hear more about the real challenges, not just "this is fast, zomg!!"

I wish this post had more meat. How did they end up with these tips? How did they profile? How did they benchmark? Where can I read more about the other tips? Without any good links to learn more about the details (seriously, linking to gzip.org was the best you could do?) I'd say these tips can almost do more harm than good…


I'm not sure how the 1st and 10th tips are opposing. The first is to avoid blocking code, and the tenth is to keep the codebase skinny. What have I missed?

I'm not sure how #4 is weird. It can make sense in certain circumstances, as template rendering can be quite expensive. Making this change, though, without data from profiling is probably unnecessary.

Although you didn't touch on #6, they probably should have explained this in a little more detail.

It does sound like they made the decision to use binary modules as a result of profiling, and the fact binary modules exist might be useful info to a node n00b.

This article should probably have been presented more as food for thought, rather than Node's pseudo ten commandments for performance, but nevertheless, it's a useful article for the most part. Developers (well, seasoned ones at least) know to apply critical thinking to all advice anyway.


Practice what you preach. Passive-aggressively sniping at asynchronous code by implying it can never be small or light is not a valid argument. There is no "meat" to such a subjective position.


Callback-based code will always be more complex than synchronous code (and I believe that's an objective observation), but please let's not go there.

My remark was against their "Keep your code small and light". Making your code lighter doesn't always make it faster. If you can get your code to do less (like skipping memory-based sessions), that will make it faster. If all you're doing is avoiding complexity, there's no guarantee it will perform better.

The proper way is to benchmark and profile. You can refactor your code based on developer happiness, but only try to increase its performance based on objective data.
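As a rough illustration of "measure first", here is a crude timing sketch. It is not a profiler, and the helper `meanMillis` and the two candidate functions are hypothetical names made up for this example; `process.hrtime.bigint()` is also newer than the Node version discussed in this thread.

```javascript
// Time a synchronous function over many iterations and return the mean in milliseconds.
// A crude measurement sketch, not a substitute for a real profiler.
function meanMillis(fn, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs) / 1e6 / iterations;
}

// Two hypothetical implementations that produce the same string.
function concatWithPlus() {
  let s = '';
  for (let i = 0; i < 1000; i++) s += i;
  return s;
}

function concatWithJoin() {
  const parts = [];
  for (let i = 0; i < 1000; i++) parts.push(i);
  return parts.join('');
}
```

Comparing `meanMillis(concatWithPlus, 1000)` against `meanMillis(concatWithJoin, 1000)` gives numbers to argue from, though results vary by machine and engine version, so no expected figures are shown.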

EDIT: I should probably clarify this (although it seems silly to have to do this on HN): I consider callback-based code yet another tool for solving problems, not something you preach for/against. There is a time and place for callback-based code, just as there is a time and place where it's not a good idea.


Inside of an asynchronous framework where every call blocks all others (eg Node.js), callback-based code is almost always going to be the right choice. If you can't immediately produce the answer, don't wait. And if you can immediately produce the answer but it will take a while, break it up into multiple smaller computations so that you do not block.

If you are in a different environment, the arguments for/against become much different.
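A minimal sketch of "break it up into multiple smaller computations", assuming a large array to sum. The names `sumSlice` and `sumInChunks` and the chunk size are hypothetical. `setImmediate` is used to yield between chunks; it did not exist in Node 0.6, where `process.nextTick` or a zero-delay timer was the usual stand-in.

```javascript
// Sum one slice of the array; small enough to finish quickly.
function sumSlice(values, start, end) {
  let total = 0;
  for (let i = start; i < end && i < values.length; i++) total += values[i];
  return total;
}

// Sum a large array in chunks, yielding to the event loop between chunks
// so other callbacks can run. chunkSize is an illustrative default.
function sumInChunks(values, callback, chunkSize = 10000) {
  let total = 0;
  let index = 0;
  function step() {
    total += sumSlice(values, index, index + chunkSize);
    index += chunkSize;
    if (index < values.length) {
      setImmediate(step); // let pending I/O callbacks run before the next chunk
    } else {
      callback(total); // note: runs synchronously when the input fits in one chunk
    }
  }
  step();
}
```

Each call to `step` does a bounded amount of work, so a request arriving mid-computation waits for at most one chunk rather than the whole sum.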


> I consider callback-based code yet another tool for solving problems, not something you preach for/against. There is a time and place for callback-based code, just as there is a time and place where it's not a good idea.

Node doesn't really give you much of a choice. If something is going to take a while then you will be using callbacks; it's that simple.


It's interesting to note that while Twitter has moved away from client-side rendering of HTML templates (and is a strong advocate of sending HTML directly down the wire[1]), LinkedIn's mobile app seems to be rendering the template on the client.

[1]: http://engineering.twitter.com/2012/05/improving-performance...


Just because client-side rendering did not work well for Twitter for a variety of reasons, does not mean it is subpar to server-side rendering in all cases. Performance-wise, if server_json_generate + json_download + client_render_json_to_html < server_html_generate + html_download + client_render_html, you can certainly go with client-side rendering.

In Twitter's case, client-side rendering was slow while server-side wasn't. In LinkedIn's case, html_download was slow while client_render_json_to_html wasn't. Performance aside, there are innumerable reasons to go either way, all of which vary from project to project.
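The inequality above can be written out as a small comparison. This is only a sketch: the field names mirror the terms in the comment, and the sample timings are made up for illustration, not measurements from Twitter or LinkedIn.

```javascript
// Total time for each strategy, in milliseconds. Field names are hypothetical.
function clientSideTotal(t) {
  return t.serverJsonGenerate + t.jsonDownload + t.clientRenderJsonToHtml;
}

function serverSideTotal(t) {
  return t.serverHtmlGenerate + t.htmlDownload + t.clientRenderHtml;
}

// Pick whichever strategy has the smaller total.
function fasterStrategy(t) {
  return clientSideTotal(t) < serverSideTotal(t) ? 'client' : 'server';
}

// Made-up timings where a slow HTML download dominates the server-side path.
const sampleTimings = {
  serverJsonGenerate: 20, jsonDownload: 30, clientRenderJsonToHtml: 40,
  serverHtmlGenerate: 50, htmlDownload: 120, clientRenderHtml: 10,
};
```

With these invented numbers the client-side path totals 90 ms against 180 ms, so `fasterStrategy(sampleTimings)` picks `'client'`; real inputs would have to come from profiling.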

Having worked with CoffeeScript + Twitter Bootstrap + BackboneJS/Underscore + LessCSS for some time now, I know I will not willingly go back to the old server-side ways unless there is a major reason to. Working fully client-side to make UI and relying on a REST API / JSON is a wonderful experience, especially when the app feels so responsive.


What about SEO? I've noticed a lot of these sites that are rendered client-side start with very little content in the HTML. Doesn't this mean the search engines can't read the content?


According to this other LinkedIn blog post, http://engineering.linkedin.com/frontend/leaving-jsps-dust-m... (look for SEO in the comments), they detect cases where the client can't or won't do JS and perform the rendering server-side.


How do they do that? I can't really figure out anything apart from checking the user agent (for web crawlers?)


You can simply check the Accept header.
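A minimal sketch of what such a check might look like, assuming the server falls back to rendering HTML itself unless the client explicitly asks for JSON. Whether LinkedIn does anything like this is not stated in the thread; `acceptsJson` and `chooseRendering` are hypothetical names, and the parsing is deliberately naive.

```javascript
// Naive check: does the Accept header list application/json?
// Ignores q-values and wildcards; a real content negotiator handles those.
function acceptsJson(acceptHeader) {
  if (!acceptHeader) return false;
  return acceptHeader
    .split(',')
    .map((part) => part.split(';')[0].trim().toLowerCase())
    .includes('application/json');
}

// Node lowercases incoming header names, so the key is `accept`.
function chooseRendering(headers) {
  return acceptsJson(headers.accept) ? 'client' : 'server';
}
```

A crawler sending `Accept: text/html` would get server-rendered markup under this scheme, while an XHR asking for `application/json` would get data for the client to render.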


Can we assume that degradability isn't a concern for you? Accessibility?


Degradability is not a concern. Accessibility is, and client-side rendering does not mean no accessibility.


I'm a bit bemused that for me, I can't think of an area where LinkedIn is anything more than a straight HTML site. That is, why would they need something like this?


Why would they need a template engine? Erm.. because they have more than one user....


I mean to say that there doesn't seem to be any apparent use of node on LI. Is there a chat facility or something?


Very interesting to see how just a single sync file call can destroy performance.

It makes perfect sense in hindsight, even though at the time it could be easy to think, "how much could this really hurt?"


"Our initial logging implementation accidentally included a synchronous call to write to disc."

You really want to avoid synchronous calls all of the time in node.js unless you are at a point in the app's life where it makes sense to block if necessary, such as on startup or shutdown.

Since this call was used in logging my guess is that the synchronous write call was called a lot and was easy to spot because of it. There are other places where calls like that would be harder to track down. As with any programming environment it is important to understand how things are working at least one abstraction level below the code you are writing.


You could think "how much could this really hurt?" But on the other hand, how much would it hurt to take the safer route and use the async function? I don't see why you would choose to use the sync function where it didn't belong. The documentation even warns against using the sync functions and states "synchronous versions will block the entire process until they complete--halting all connections."
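For illustration, a minimal sketch of a logger that avoids the synchronous write, assuming an append-only log file. The names `formatLogLine` and `createAsyncLogger` are hypothetical, and this is not LinkedIn's implementation, which the article does not show.

```javascript
const fs = require('fs');

// Format one log line; `now` is injectable so the output is predictable.
function formatLogLine(level, message, now = new Date()) {
  return `${now.toISOString()} [${level}] ${message}\n`;
}

// Create a logger backed by a write stream. stream.write() buffers the data
// and returns immediately, unlike fs.writeFileSync/appendFileSync, which
// block the event loop until the disk write completes.
function createAsyncLogger(filePath) {
  const stream = fs.createWriteStream(filePath, { flags: 'a' });
  return {
    log(level, message) {
      return stream.write(formatLogLine(level, message));
    },
    close(callback) {
      stream.end(callback);
    },
  };
}
```

A caller would create one logger at startup, for example `createAsyncLogger('/tmp/app.log')` (a made-up path), and call `log` on every request without stalling other connections.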


"Don't use Node.js for static assets"

Obviously if you have node doing less it's going to be "faster". Or is it? I'm not really sure that offloading tasks qualifies as being faster. I'd like to see some data on this though. Basic caching as supplied with connect should serve static files pretty damn fast I would think. Perhaps nginx does that for a living and kicks ass, but node shouldn't be all that bad.


I'm interested to see at what point the speedup happens. In other words, is offloading static files faster because:

-Node has more cycles to compute things?

-The server node's on has less contention to deal with (disk, network)?

-Putting statics on another server allows the browser more simultaneous connections when loading the page, so it loads faster by default?

"Don't use Node.js for static assets" is an interesting observation but I'd like to know exactly why.


All of those things seem like they would be true. But even on a very basic implementation you shouldn't have to hit the disk all that often unless you have an extremely large number of static files to serve. This is quite possibly the case for many sites, but I think for most sites, all of your static files could be cached with a minimal memory hit.


Because C is faster than JavaScript, and nginx is written in C.


The reason it is advised not to serve static files via Node.js is that Node.js is a single-threaded process, and static files will hold up this thread. Nginx, on the other hand, is multithreaded and can continue serving other assets and requests while the statics are being served.


Express supports sendfile(), so it should be about as efficient as any other Web server. I wonder if anybody has benchmarked it.


Nginx has a lot of tricks to handle this for you, including caching and working over multiple cores. There is also less framework overhead in Nginx because it's not trying to do dynamic content. I'd be surprised if Node beats Nginx for serving static content, but alas I do not have any proof. At the very least it will use time in Node that is better off serving dynamic content.


Hmm, web servers differ dramatically in serving single files.

The approach I favour is to put varnish in front of a classic slow script webserver (tornado, but use node if you prefer).

Tornado has this neat way of baking-in versioning to static content so clients (and proxies) can do aggressive caching http://www.tornadoweb.org/documentation/overview.html#static...

Best of both worlds.
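The versioning idea can be sketched in Node terms as follows: derive a short fingerprint from the file contents and put it in the URL, so the URL changes only when the file does. This is an assumption-laden illustration, not Tornado's actual implementation; `versionedUrl`, the `v` query parameter, and the 8-character truncation are choices made for this example.

```javascript
const crypto = require('crypto');

// Append a content-derived fingerprint to a static asset URL.
// Same content gives the same URL; changed content gives a new URL,
// which lets clients and proxies cache the old one indefinitely.
function versionedUrl(urlPath, content) {
  const hash = crypto.createHash('sha256').update(content).digest('hex').slice(0, 8);
  return `${urlPath}?v=${hash}`;
}
```

Templates would emit `versionedUrl('/static/app.js', fileContents)` instead of the bare path, and the server could then send far-future cache headers for anything under `/static/`.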


I'm curious, is that an asynchronous sendfile?


Yes.


as in spinning off a thread to serve it?


I would love more insight or experiences on going stateless/ session free. Are you explicitly signing every request that requires authentication for security purposes?


Another tip for people using express is to not use middleware that is not needed in every request. You can specify special middleware for every path, so you can even limit session usage to the pages that need it. Similarly, cookie parsing, etc. can be avoided where not needed.
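To show the idea without pulling in Express, here is a dependency-free sketch of path-scoped middleware. `createApp` and `parseCookies` are hypothetical stand-ins written for this example, not Express APIs; in Express itself the equivalent is mounting middleware under a prefix with `app.use(path, middleware)`.

```javascript
// Toy cookie parser: fills req.cookies from the Cookie header.
function parseCookies(req, next) {
  req.cookies = {};
  for (const pair of (req.headers.cookie || '').split(';')) {
    const [name, ...rest] = pair.trim().split('=');
    if (name) req.cookies[name] = rest.join('=');
  }
  next();
}

// Toy app: middleware runs only for URLs under the prefix it was mounted on.
function createApp() {
  const routes = [];
  return {
    use(prefix, ...middleware) {
      routes.push({ prefix, middleware });
    },
    handle(req) {
      const stack = routes
        .filter((route) => req.url.startsWith(route.prefix))
        .flatMap((route) => route.middleware);
      let i = 0;
      const next = () => {
        const fn = stack[i++];
        if (fn) fn(req, next);
      };
      next();
      return req;
    },
  };
}
```

Mounting `parseCookies` only under `/account` means a request for `/public/logo.png` skips cookie parsing entirely, which is the saving the comment describes.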


I was expecting they went the extra mile and implemented aio in node. Nothing is out of the ordinary here.


I found their use of Steps interesting. The syntax seems a bit unconventional but helps avoid nested callbacks.


Be sure to check out IcedCoffeeScript (if you use CoffeeScript) or TameJS as well. You might find them useful.


I was tempted to make a joke about LinkedIn and their passwords but this article is nice.


Careful with one of the tips... Not everything should be speed focused...


Why would a company that uses Java even consider Node.js?

Java is hands down faster.

I have never seen a benchmark that showed Node coming even close to Java performance.

http://www.youtube.com/watch?v=bzkRVzciAZg


It's funny, but I find it ironic you are using that video opposite to what it is intended. Saying "Java is hands down faster" without proof isn't any different than saying "Node.js will run circles around Java" without any proof.



It is because Java is like PHP: it is synchronous. If your app is doing a lot of non-CPU stuff, it makes complete sense to use an async programming model.


Nice little zing about Eratosthenes. http://www.youtube.com/watch?v=bzkRVzciAZg#t=238s



