This is our first follow-up to last week's web framework benchmarks. Since last week, we have received dozens of comments, thoughts, questions, criticisms, and most importantly pull requests. This post shows data collected from a second run on EC2 and i7 hardware that started on Tuesday of this week. A third round with even more community contribution is already underway.
Thanks especially to those who have contributed! We hope this is useful information.
You have no idea how valuable this is to everyone! I know it takes a lot of effort to consolidate all the comments, requests, fix suggestions, etc. Personally, I've even seen you respond on the Play! framework Google groups.
Thank you for being such a down to earth person and helping out the community. You guys rock :)
Thanks so much for the kind words. We're obviously having a great time working on this, and we too think there's a lot of value to this for the community at large.
It's great that you're doing this, and listing stuff like standard deviation in the tables -- but I'd say your focus/interpretation of the data isn't quite right. At least provide the option to sort by standard deviation -- as that might well be more interesting than requests/second?
Maybe I'm just being mean because I was reminded of this essay by Zed Shaw earlier today (I was looking for the rant on CC licenses he alluded to, which I didn't find):
> In this week's tests, we have added a latency tab (available using the rightmost tab at the top of this panel). On i7, we see that several frameworks are able to provide a response in under 10 milliseconds. Only Cake PHP requires more than 100 milliseconds.
Only Cake PHP requires more than 100 milliseconds on average. But look at Django: an average around 60 ms, a standard deviation around 90 ms (!). Not to mention a "worst" score of 1.4 seconds.
Thanks for the suggestion, we're still in the early stages of getting the latency information incorporated, and having the ability to sort by the various metrics makes a lot of sense.
As one of the people complaining about statistics last week (and also, by coincidence, citing Zed's rant), I'm glad to see you are working on it and open to more ideas and improvements!
Also, I like the "sportsmanlike benchmarking game between different communities" vibe I'm getting from all this.
Would be nice if the community helps turn this into the de facto example of how to benchmark correctly.
Now I'll just have to wait and see how Go 1.1 compares ;).
Go 1.1 is looking very strong! Pat (pfalls) just showed me some very preliminary numbers and we are extremely happy with them.
Thanks for the input and constructive criticism, vanderZwan. It has been very helpful to get feedback from yourself and everyone else. I too am particularly happy with the sportsmanlike competition vibe. You have no idea how fulfilling that is to us.
First time I'm actually happy for building something with Servlets (I have a feeling I'm the only one here who uses them as the go-to framework for hacking something together quickly; I'm an old man). It's indeed dead annoying for REST services, but doable, and blazing fast. The question is: when does my productivity start to be more important than performance? (I think 99% of the lifetime of anything I'll ever build is about the former and not the latter.)
I'm very surprised to see Play-Scala is not on the same playing field as other JVM-based frameworks. I had a lot of hope for it, and I hope TypeSafe will take that into consideration...
Node.js is the biggest surprise for me, in a good way. I think it changes my plans a little as for what to learn next...
The contributions from the community have been great. To those who submitted pull requests that didn't make it into this blog post: We've been overwhelmed (in a good way) with the response and we're working to include as much as we can. Thank you!
I'd be curious how it would perform with C# (Windows/.NET vs. Linux/Mono)... though the environment setup would probably be a bit more involved... and IIS is a very different beast.
I'd think that it would probably land somewhere close to Java Servlets, but a bit slower. The framework stack for web requests in .Net is probably a bit heavier than it is in the servlet server in question. I would also think that Mono would be a bit slower than IIS, only because IIS does very well at pooling resources/threads for multiple requests.
There's also the question of async .Net handling vs. blocking. Most .Net code I've seen is blocking, but there are async options, and as of the 4.x releases they are much easier to use.
Thanks for the link, errnoh. You're precisely right, we want to have C# and .Net tests included but didn't find the time to do so ourselves in the past week and have not yet received a pull request. I was not familiar with ServiceStack prior to feedback we received to last week's test, but from the looks of it, I'd personally like to see how it does.
Did you guys turn on byte code caching for all the PHP frameworks? If not, then I recommend everyone ignore these benchmarks until that is completely done.
So, they've put a ton of effort into this, have been very receptive to community feedback and criticism, and have followed through with a major update to the whole thing. And then you come in with a drive-by recommendation for everyone to ignore the whole thing because of your unfounded and incorrect assumption. Nice work.
Having max/min spare servers set to the same amount is a bad idea. This incurs a substantial amount of process swapping, as on every single request php-fpm is going to try to ensure there are precisely 256 idle servers.
Ok, help us out: Given that we have Wrk set to max out at 256 concurrent requests, what would the ideal tuning for php-fpm be? A pull request would be ideal, but you can also just tell us. :)
I'd set minimum idle to something like 16 or 32. php-fpm will not create more than 32 workers/sec.
What happens now is that 256 workers are running and 256 simultaneous requests occur. So php-fpm sees 256 workers busy, 0 idle. The minimum idle is 256, so it attempts to start 256 additional processes.
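To make that concrete, here is a rough sketch of the pool settings I have in mind; the exact values are illustrative, not tuned for this hardware:

    pm = dynamic
    pm.max_children = 256       ; cap matches Wrk's 256 concurrent requests
    pm.start_servers = 32
    pm.min_spare_servers = 16   ; low floor, so fpm stops chasing 256 idle workers
    pm.max_spare_servers = 64   ; cull idle workers above this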
I could be missing something, but it looks like you are using the default settings. Have you tried tweaking them at all? Specifically setting apc.stat=0, which will stop it from checking the mtime. You'll need to clear the cache with apc_clear_cache() when you make code changes, though. You may also want to look at apc.php to check for fragmentation and adjust apc.shm_size if necessary.
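For reference, the php.ini lines in question would look something like this (the shm_size value is just an example; check apc.php before sizing it):

    apc.enabled = 1
    apc.stat = 0       ; stop checking file mtimes; clear the cache on deploys
    apc.shm_size = 64M ; grow this if apc.php shows heavy fragmentation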
This is still a json API comparison, not a full framework comparison.
In the first test it sees how fast frameworks can serialize json and in the second test it sees how fast frameworks can access the database AND serialize json.
An inefficient json library dooms a framework across all tests. There are a lot of things frameworks do beyond serving json. These benchmarks are presented as representative of broad use when really they are only representative of the frameworks when used as a json API.
A raw HTML hello world, and removing the json encoding step in the db test, would go a long way towards getting results that mean what the authors seem to want them to mean.
There's nothing wrong with a json API benchmark; in fact it's quite valuable in a lot of cases. But if that's what this is intended to be, the authors should say so.
Agreed; for example, in our (in-house) use case, switching a django app from using either stdlib's json or simplejson to using ujson[0] was a significant performance increase when serializing/deserializing large-ish (~100MB) JSON datasets.
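A minimal sketch of the swap, assuming the ujson package (the payload here is made up; both libraries expose the same dumps/loads interface, which is what makes it a drop-in change):

    import json
    import ujson  # pip install ujson

    payload = {"id": 1, "values": list(range(1000))}

    # Same interface, so switching serializers is a one-line change:
    a = json.dumps(payload)
    b = ujson.dumps(payload)
    assert json.loads(a) == ujson.loads(b)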
As an amateur hacker and after 275 days of reading Hacker News I feel I can now navigate in the sea of client side Javascript frameworks. With the introduction of this data on web framework performance, I now have another set of choices that I am completely unqualified to comprehend.
My first impression is that this data shows about 3 levels of web frameworks. At the bottom (slowest) we have Django and Rails and many other introductory app server frameworks. Let's say that after I was able to build an initial product successfully, would I then consider re-building the product in a higher-performance framework such as Go, Node, etc.?
The third level, netty and gemini and servlet etc, I am not familiar with. Googling "netty" I get --"Netty is an asynchronous event-driven network application framework"-- I thought that is what Node is (in js) and Go does with gophers.
What are the use cases for these faster frameworks and do they follow an evolution of performance options that an app might go through?
Most of these are bound to a language: choose a language first (and my answer to that question is "whichever one you're happiest working in and can get the job done"), then figure out which backend to use. You can fret over the framework more after you've already built something that works.
IMO if you build a product and it is working, there really is not much reason to redo your efforts in a higher performance framework unless your needs require you to do so. If you are at your current hardware's limits and you can't throw more hardware at it (for whatever reason), it may make sense in that case. If you want to reduce the amount of servers you currently have, it also makes sense there. http://blog.iron.io/2013/03/how-we-went-from-30-servers-to-2... This is a great post talking about going from 30 servers using Ruby to 2 servers using Go.
Use case example - Twitter was using Rails on the front-end for a while but it got to the point where using it made little sense. Part of Rails' performance is dependent on caching - which is really hard, if not impossible, to do well when you are dealing with live data.
FYI, Gemini is our framework that we've built and maintain internally. You probably won't find reference to it (except for our blog).
I like these comparisons. It's unlikely that anyone is going to pick a framework for their project based on just this alone.
I find its good "food for thought" as it provides me exposure to frameworks that perhaps I can use one day but also to learn more about.
The most interesting aspect is that all the code each framework uses to do the simple page output, the queries, etc. can be viewed. I find this an incredible learning tool as it shows the personality of each framework/language.
"I like these comparisons. It's unlikely that anyone is going to pick a framework for their project based on just this alone."
This is exactly right: performance is one of many items to consider when building your own website, and while we're aiming to have a very comprehensive and useful set of performance data, we would agree that other factors need to be taken into account.
Netty, Gemini and servlet are all Java server solutions without a thick framework layer getting in between. What you can tell from these benchmarks, in my opinion, is that Java (or rather the JVM) has the best "raw" performance. Frameworks introduce a slow-down factor on top of the raw platform, and different frameworks have different slow-down factors.
You have to trade off developer productivity for performance when it comes to choosing frameworks. The slower the framework the more it does for you.
EDIT: this appears to be a different "Gemini" than the one I'm talking about here - in a comment below, it's mentioned that the tested Gemini is built and maintained internally. So the next paragraph is probably irrelevant.
Note that Gemini is an OSGi framework built on top of Spring DM. For those who don't know, OSGi is Java modularity "done right" - basically a very loosely-coupled module system (which Java lacks) on steroids.
Gemini is a full-featured web framework. We've built it and maintain it internally. It offers caching (though not enabled for these tests), a lightweight ORM, a bunch of default handling methods, and plenty of other features.
In general, I echo ConstantineXVI's comment. However, I'm also curious to know where the definite line is between the bottom half and the top 5 on the table. At what point does a team go "We're getting hammered and our backend is the problem. We have to switch to X or Y because our API on Z has reached its limit."
You will know when it becomes a problem. The helpdesk gets flooded. Servers go down. You get white pages and 500 internal server error pages. The entire site folds because a mouse farts.
You add new servers and they get crushed. You scrutinize every line of code and fix algorithms and cache whatever is possible and you still can't keep up. You look at your full stack configuration and tune settings. You are afraid of growth because it will bring the site down. THEN you know it's time to consider switching frameworks. And even then I would try to find the core of the problem and rewrite that one piece.
Rewriting an app in a new framework can kill a company. Be leery of starting over. I personally know of one company that started over some 4+ years ago because the old app was too hard to maintain, and they only started rolling out the new app last year to extremely poor reception (even with about 1/2 the features of the old app). The reception was so bad that they had to stop rolling it out until it was fixed. They could have easily spent a year improving the old app and would be miles ahead.
That's not to say that all problems are due to the framework either. I've had web servers go unresponsive for 20+ minutes because apache went into the swap of death because KeepAlive was set to 15s and MaxClients was set to a value that would exceed available RAM. The quickest solution was to cycle the box. This was 10 years ago though and I think I had a total of 1GB of ram to work with.
This is exactly the sort of question that should be asked in the context of benchmarks like this. There have been a lot of good comments in response, but I just wanted to thank you for asking one of the best questions in this thread.
I think it's a misconception to think of Django and Rails as "introductory frameworks". These are production ready frameworks that can scale to meet a variety of (though not all) business needs.
My intention with the word "introductory" was not to suggest training wheels. Joeri in this thread says:
>>You have to trade off developer productivity for performance when it comes to choosing frameworks. The slower the framework the more it does for you.
And in that respect, the more the framework does for a developer, the better suited it probably is as a starting point for new developers. It is a badge of success that Django and Rails are able to serve a whole spectrum of needs.
When it comes to "learning" frameworks, I don't think the highest level of abstraction is necessarily the best. Some are better able to cope with black-box behavior than others. For those who feel the need to understand their tools as they use them, I'm of the view that the Sinatras of the world are a better starting point.
We've had a lot of requests for Mono, and are actually very interested in trying this out. We've been hesitant to move forward only because we're unsure how stable the C# community believes Mono to be. We hope to have that test at some point, though, and the fastest way to get a test included is to issue a pull request. So if anyone is interested in trying this out, let us know and we can help guide the process.
I agree, I would like to see C# Mono included as well. People might also want to see C# .NET, but I don't think that is applicable here because we are talking about 64-bit Linux (the dominant internet platform).
OK cool, I was just a little vague; you should've mentioned 'Windows', as C# (even F# http://www.servicestack.net/mythz_blog/?p=785) is cross-platform and runs on OSX/Linux with Mono.
A few problems with the Node.js test code:
- Writing strings instead of buffers (they're copied, they aren't sent as-is).
- Using async. It does a lot of really nasty things, most of which break V8 optimization best practices. This is a perf benchmark, not a comparison of how concise your code can be.
- Having the main request handler in a gigantic function that will never be properly optimized by V8.
- Not getting helper functions that are clearly monomorphic warm before accepting requests.
While the code itself is what I would consider fine, when benchmarking against strongly typed compiled languages, performance concerns become important, even if you have to write ugly code.
On the other hand, shouldn't real code be tested? How representative would ugly, optimized code be when you SHOULD write reusable code in your real-life projects?
It's already not an apples-to-apples comparison. For instance, what's in the Node examples is mostly frameworks built to enable rapid prototyping, not built to handle the needs of performance-oriented services. So we're comparing some optimized systems designed to be used in production versus frameworks designed to be terse instead of performant.
That's fair. Though I think anyone who just picks technologies based on these bar graphs alone is also missing the point, so some critical thinking is required to interpret these benchmarks, as usual.
>This is a perf benchmark, not a comparison of how concise your code can be (...) performance concerns become important, even if you have to write ugly code
That goes against the notion of benchmarking.
It's about measuring as well as you can how idiomatic code will perform.
It might be interesting to track average and peak server side memory usage during these tests. For example, if framework A is only 80% the "speed" of framework B, but uses 1/4 as much memory, for some users this might be a win for A.
This is definitely something we want to include in the tests, but it is difficult to manage a fair way of monitoring. Outside of the actual benchmarks, where we kept everything as fair as possible, we ran individual tests to make sure they worked prior to the benchmarking and would routinely have htop running at the same time to get an idea of what that sort of data looks like.
It IS very interesting, and we have definitely discussed methods for measuring it for the sake of these benchmarks. Other areas that we discussed were CPU utilization, disk utilization, and network saturation.
One idea that jumps to mind is putting the running frameworks in an LXC container (or similar) and monitoring the memory usage of that. Not sure how accurate that is, but it's one avenue. It also still might not be fair because some runtimes can let their heap get pretty big despite not actually needing the memory. You'd want a way to differentiate those from the ones with big heaps that would choke with some memory pressure.
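An alternative that skips containers entirely: sample the process's resident set size from /proc during the run. A sketch, assuming a Linux host and that you know the server's pid (with the same caveat as above: peak RSS won't distinguish a lazily collected heap from memory that's genuinely needed):

    import time

    def peak_rss_kb(pid, duration_s=60, interval_s=0.5):
        """Poll VmRSS from /proc/<pid>/status and return the peak, in kB."""
        peak = 0
        deadline = time.time() + duration_s
        while time.time() < deadline:
            with open('/proc/%d/status' % pid) as f:
                for line in f:
                    if line.startswith('VmRSS:'):
                        peak = max(peak, int(line.split()[1]))
            time.sleep(interval_s)
        return peak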
A note about Compojure: they're using Korma for database access. Korma is an abstraction layer that's more Clojure-friendly than using SQL directly, but lighter than an ORM. Using clojure.java.jdbc might produce higher performance.
Where possible on the tests, we tried to use something other than raw SQL because we wouldn't expect someone to use raw SQL everywhere when writing a real application. We chose Korma because it seemed decently popular for Clojure, but we're certainly open to a "compojure-raw" test that uses clojure/java.jdbc. Feel free to submit a pull request. :)
I'm not saying Korma isn't a good or realistic choice. I use it myself. There did seem to be some disagreement in freenode #clojure about whether that was the best way to go, or JDBC and prepared statements. Evidently, a lot of people actually do like SQL.
It's just important for people to be aware that "Compojure" actually represents "one possible production-ready stack that uses Compojure as the routing component", as do several other items on the list. I don't mean that as a criticism - just clarification.
The performance of Compojure was really impressive: almost three times as fast as play-scala. I was expecting the opposite, considering that Scala is generally a bit faster than Clojure.
From what little I know about Play, it does a lot more than Compojure does. Compojure is really just a routing library. This is a benchmark of Ring (HTTP communication), Compojure (routing), Korma (database access/abstraction) and Cheshire (JSON serialization). I'm pretty sure a basic request in Play goes through more middleware than that.
But yes, the performance of a reasonable web app stack on Clojure is very good.
>what matters most to me is speed of development and whether or not I enjoy the process.
Matters most? There are 100x differences here. Take two companies writing the same application, one in Spring, the other in Rails. Both have easy access to knowledgeable people, both are industry standard. However, the Spring application would be several orders of magnitude more scalable[1]. I'd consider that to be a more important consideration than enjoying the language.
[1] All benchmarks are suspect until proven otherwise. You can write slow software in Java/C/C++, you can write fast software in Python/Ruby etc etc etc
I think most people these days take scalability to mean the ability to handle more load by adding hardware without changing the architecture. That's quite different from how much load the same hardware can handle (performance).
I dunno, I hear the term "scalable" applied to business decisions all the time. I think it encompasses more than "Is this problem embarrassingly parallelizable?"
In business terms, what people usually seem to mean is "able to bring in more revenue without having to hire people with skill sets that are rare or difficult to qualify". A software consulting business doesn't scale because it requires hiring programmers; a telephone support business might because it's much easier to hire people who can speak English and follow a script.
From my experience, the Spring one would take 10x as long to make, which means more developer cost and more feedback reaction time. Certainly not a deal breaker, but a cost/tradeoff one must consider.
As someone who recently came back to Java from a 5 year stint with Rails, I can safely say this is not true by any stretch of the imagination. If you stick to those tools in the Java ecosystem which meet your requirements, you can be incredibly productive.
The problem is that the Java world is full of over-engineered solutions which are really targeted at extremely large enterprise applications surrounded by a community of folks who like to menturbate about them. As a startup, you don't need to focus on those solutions. You'll not need 99% of them.
However, you can choose bits and pieces. I'm using Vaadin, JPA and Guice and finding it incredibly productive, and I'm able to deliver functionality in far less time than it took me in Rails. I'm not using 90% of the Java EE stack, but it's available to me if/when I need it.
Rails was a great thing for the software development world. It drove convention over configuration and caused a lot of the competing technologies to pause and question why things were the way they were. But IMHO, managing the whole Rails stack (rails + coffeescript + css in our case) across a group of people became a chore. YMMV, but the statically-typed nature of Java along with the very capable technology stacks you can assemble have eased our daily jobs tremendously.
I use Spring every day as part of my day job, and I agree that a lot of the development time slowness is due to culture/enterprise. But that is a serious problem that can't be readily dismissed.
So far, so good. It's very different from your typical template-driven approach. We have a decent amount of Swing experience, and developing with Vaadin feels similar to (but much easier than) Swing.
If you are developing business-facing applications and not just consumer web sites, as we are, I think Vaadin makes a lot of sense. It might even make sense for consumer sites, but that's not our current target.
I should also say: putting together a new screen (form or otherwise) in Vaadin is extremely fast. It's the fastest I've ever been able to develop web-based UIs, simply because all I'm forced to think about is Java.
I see. I found it to be a little difficult in terms of dealing with data, especially messing around with the Table object. Best of luck to your project.
Do you have actual experience with developers of similar skill levels building a similar app in Spring and Rails?
While undoubtedly less expressive I think most of the bad rep Spring gets is because it's used in "Enterprise", which means programmers that are, if not of lower level, constantly interrupted and yanked around and told what not to do.
I wouldn't choose Spring anymore because there are better alternatives, but I'm pretty sure the 10x is in the programmer and not the framework. The framework might be 2x, if that, but not much more I think.
P.S. If you include Spring Roo then I'm pretty sure the 10x no longer holds, but I never got up to speed on Roo myself so I can't say anything authoritative on that.
Oh sure, there are reasons to use an easier/faster-to-develop language, I believe every developer should know a 'hard-and-fast' language and a 'slow-and-easy' language... and know when to use them.
98% of the websites out there could run on python and I doubt we'd see a big performance difference.
However, I really don't think there exists a 10x difference developing in Spring. Hell, I don't think there is a 10x difference in using Struts vs Rails.
Spring development is considered slow because it conjures images of enterprise shops taking several years to go through the bureaucratic red tape necessary to incorporate a new feature.
I've worked with plenty of small companies using Spring and their development/release cycle is maybe 1/2 or 1/4 the speed of comparable 'easier-to-build' technologies, but it definitely isn't 1/10th the speed to develop.
I worked at a start-up a few years ago. I worked on a PHP-powered website that spoke to a Java-powered (Spring/Hibernate) service layer. The PHP team consisted of one: me. The Java team consisted of over 10 engineers. I outpaced them easily and consistently.
In other projects with similar divisions, I've had similar experiences, though not always so dramatic.
I used to be a hired gun that would be called in occasionally to, like you, code circles around bigger teams when deadlines were tight.
Last few years however I decided to stick around with one of my clients for a while and I realized that quite a few of the employees I considered sub-par before were actually pretty decent programmers doing their best to do quality work in a dysfunctional environment.
I've since then learned to really appreciate this quote, which my gut tells me applies to your situation as well:
"Never attribute to incompetence that which is adequately explained by bad management."
You're exactly right. I hope my comment didn't come across as critical to the engineers I worked with. Some of them are the best engineers I have ever worked with, and I have great respect for them.
Hm, I'm not sure a benchmark like these allows one to draw such conclusions.
The benchmarks, although good, are a best-case scenario covering limited aspects of those frameworks. Real applications don't behave like this. What if in the average case such huge differences in performance become eclipsed by other factors?
The reason why I don't like to see this same argument somewhere at the top of every single performance related thread is that it is not a law of physics that faster languages/frameworks are necessarily less productive than slower ones. It's not even an empirical truth.
If someone from my team suggested to use Cake PHP I would say no without even looking at it once. Not because I would never trade performance for productivity, but because I simply do not believe even for a second that productivity gains are the reason for that kind of horrific performance.
I 100% agree with this. It has been a huge point of discussion for us here internally: "developer efficiency with language x." Ultimately, language performance does matter, but there are ways to make up for less performant languages. If it takes you 2x as long to get something out the door, though, then that is a huge problem.
Just use the language and framework you are most efficient in and enjoy using. Worry about framework performance when you notice server load increasing over time.
Sure... as long as you have the migration and rewrite plan done and ready to go, and have engineered the product for that rewrite. Otherwise you're gonna be wishing you'd failed rather than succeeded.
It seems like you're still using MongoDB as the datastore for some languages and MySQL as the datastore for other languages. This seems like it would bias the results.
There will most certainly be a Round 3, and 4 and on and on as long as we continue to have feedback from the community. I can guarantee that we'll have a Lua test in there, but if we receive a pull request, we'll include it.
This is an awesome project, and thank you for the follow-up.
In particular, this is awesome because it's introduced me to some new frameworks that I hadn't even heard of (e.g. vert.x) that seem extremely interesting.
Thanks for the kind words. Glad to hear that you've been introduced to some new frameworks; we feel the same way, and we're excited to see what else the community has up its sleeve.
We have one pull request for Onion, which is C-based. We're working with the author to work through some issues, but we're also excited to see how it will stack up.
It is worth pointing out that Go 1.0.3, not Go 1.1 beta, is being used. I tested the code locally, and on my machine 1.1 beta was about 20% "faster" (req/sec).
In addition to the improvements from the improved scheduler, use of accept4 on Linux, better GC and code generation, Brad Fitzpatrick has been giving a lot of love to the "net/http" package. Here is a small example: https://plus.google.com/u/0/115863474911002159675/posts/L3o9...
This is definitely motivation to learn another framework besides Rails and Django.
Compojure's syntax and design seem really attractive... is anyone using that in production? Would it be a reasonable choice for a "serious" web application?
I've used it for client work that was in production and for a side-project that's in production, but private with few users. It has been reliable for me.
Note that Compojure is more of a routing library than a framework. It's similar to Flask or Sinatra. Clojure's philosophy includes aggressive separation of concerns such that full-stack frameworks aren't really "the Clojure way".
That's the first thing I noticed. What was used instead of Warp?
I also see that Snap got pulled in a couple days ago, so it will probably get tested in the next round. Should be interesting to see how it fares against Yesod.
The way the benchmark is constructed, it's really more about JSON and DB performance. Snap makes no decisions for you about those things, so the numbers really won't reflect the performance of Snap all that much. I don't have great expectations for the "Snap" code that is in there now because it is not using very fast JSON and DB libraries.
Actually, Yesod's performance is not so bad, and neither is Play's.
The top-performing frameworks (netty, vertx, go) are not web frameworks but async programming frameworks. They don't carry a lot of things around; they just push the serialized data onto the raw socket output stream. Yesod/Play/Django/Rails/... on the other hand must manage routes, content types, browser headers, etc.
What I (again) find most interesting is the large discrepancy in the 20-query test. Common wisdom seems to be that language performance is not important because every language hangs in the same IO/DB.
My best guess is that the java one is slow due to ebean (I've always seen huge performance improvements from switching to raw jdbc). As for the scala one, I have no clue.
More likely because that version uses only as many threads as the CPU has cores, whereas the servlet version, for example, has 128 thanks to the default Resin configuration.
Gemini conveniently leads the pack all the time. So either OP and his/her colleagues have revolutionized web programming or the benchmark is nothing but a PR piece.
Just ignore Gemini. In fact, you can click the "hide" link on the charts to remove it.
From our first benchmarks post: http://www.techempower.com/blog/2013/03/28/framework-benchma...
"Why include this Gemini framework I've never heard of?" We have included our in-house Java web framework, Gemini, in our tests. We've done so because it's of interest to us. You can consider it a stand-in for any relatively lightweight minimal-locking Java framework. While we're proud of how it performs among the well-established field, this exercise is not about Gemini. We routinely use other frameworks on client projects and we want this data to inform our recommendations for new projects.
I don't know about Sinatra or anything else, but I can give you the reason why Flask performs like it does out of the box. Unlike PHP, Flask is thread-safe and uses locks and thread or greenlet local storage. This level of indirection adds a basic performance penalty when executed on a single-threaded server like gunicorn. There are ways to change that if you think it becomes a performance problem, by replacing the stack object with one that does not do dispatching and by removing the locks.
This in theory is easy to do; it's just generally not a real issue, since most Flask apps will actually wait on the database more than they waste CPU time.
Additionally, Flask has a huge request handling overhead which will be primarily noticeable in dummy benchmarks that don't do much in view functions. If you have an application that barely executes anything in the views, you can subclass the Flask object to remove some of the functionality you normally always pay for (request processors, session initialization, message flashing etc.).
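A minimal sketch of that subclassing trick, using Flask's internal method names as of the 0.9/0.10 era (treat it as illustrative, not a supported API):

    from flask import Flask

    class LeanFlask(Flask):
        # Bypasses before/after-request processors, signals, and session
        # finalization; goes straight from routing to response.
        def full_dispatch_request(self):
            rv = self.dispatch_request()
            return self.make_response(rv)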
Lastly Flask has some sanity checks when Python is not run with `-O` that simplify debugging. Depending on which code paths you hit you might be paying for some of that.
In the database test the primary slowness is most likely not Flask but the way the database is configured.
Generally if you ever encounter Flask to be an actual performance problem I am more than happy to point you to ways to get more speed out of it. Generally speaking though there is not much point micro-optimizing until you notice it is a problem.
> But the other database tests were run against the same DB server and configuration. How can Flask's performance be the DB's fault?
They are written against the core database layer of the framework (in the case of Django, the Django ORM) and in the case of Flask: Flask-SQLAlchemy. The default config of both of these things is to not pool connections. That is not true for all other frameworks.
Also I'm not blaming anything here, just pointing out that there are differences in the setup.
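To illustrate what the pooled setup looks like at the SQLAlchemy level (the connection string and pool numbers here are just examples, not recommendations):

    from sqlalchemy import create_engine

    # Without a pool, every request pays for a fresh MySQL connection.
    # A pool keeps connections open and hands them out per request.
    engine = create_engine(
        'mysql://user:password@localhost/hello_world',
        pool_size=20,       # persistent connections kept open
        max_overflow=10,    # extra connections allowed under bursts
        pool_recycle=3600,  # recycle before MySQL's idle timeout
    )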
Why shouldn't that be the case? PHP has been designed from the get-go to be super fast, and 5.4 introduced a bunch of optimizations that have sped it up even more. Ruby was designed for programmer happiness; PHP was designed for performance and getting sh*t done.
> PHP has been designed from the get go to be super fast
PHP was never designed to be fast. As far as execution speed goes, PHP will be in the same ballpark as Perl/Python/Ruby. PHP comes up as faster in these tests because of all the extra things the frameworks are doing. See the_mitsuhiko's comment above regarding Flask for an example.
Not sure about ruby, but django doesn't ship with connection pooling, so that is probably why its database performance is terrible. It could be configured to use pooling.
With such a trivial request handler like the one in benchmark, only a small percentage of the execution is actually taken up in the view, so all of the time you're seeing in the benchmark is pure overhead of the framework. A longer running view across all of these frameworks would cause things to even out a bit. All that being said, however, it's not an unfair comparison. After all, a benchmark of this nature should be a measure of overhead because that's really all there is to measure.
Correct, because we chose MySQL as the database backend, both Django and Flask are punished for not having a connection pool. We've received requests to do the test using Postgres which does have a connection pool, and we hope to eventually get to that.
1. Blame me. I put together the charts and tables. I'll try to get some sorting added for the next round! :)
2. Pat (pfalls) should know the answer to that more definitively, but we are definitely trying to use the CPU cores as fully as possible. I think you'll see some comments by Pat elsewhere in this thread suggesting some things coming in round 3.
3. Good question! Mostly because we haven't had the time to add it. I would really like to see it added, though. If you know Go and can write the code, might I entice you to submit a pull request?
The short answer is that at some point, we kept adding more and more tests on our own, which kept the blog from going out, and we finally decided to have a cutoff date and let the community continue adding tests that they were interested in. I think adding a database test to Netty (as well as Go for that matter) would be a great addition.
Since we moved to a different benchmarking tool for round 2, we had to re-run all the tests, which is a time consuming process, especially on the EC2 hardware. Now that we've gotten that out of the way, we have an easier time running and processing results for individual tests. So stay tuned!
I wonder where the bottleneck for the ruby benchmarks is. Is it in passenger, the framework, or somewhere else? I am sure I have gotten higher requests per second on my workstation when I tuned Ramaze (which should not be faster than using raw rack). My guess is that it is either due to a different method of measuring or the fact that I was using thin rather than passenger.
I think I said something similar on the first round, but for the Python tests, I'd be curious to see alternate runtimes and/or concurrency strategies: pypy, gevent, and probably the confluence of the two (though it requires quite a bit more hacking, since gevent doesn't officially support pypy).
I think that the play-java one would likely benefit greatly from using raw jdbc rather than the ebean ORM. In fact my best guess is this is why the scala/java versions are so far apart. In past projects, scrapping ebean always led to large performance improvements in throughput, etc.
Cool stuff. I've only looked over it quickly, but it seems the Java stuff is dominating.
Kind of curious why jruby doesn't provide more of a boost (especially given compojure doing relatively well, too) and would like to see Django+Jython in version 3 (and +1 to the .NET/Mono stuff)
Interesting that the Play code optimizations had virtually no effect. Almost identical absolute scores between tests. Clearly something heavy's going on within the Play request handling framework to slow things down, not the code we see.
I was also surprised at the difference between the Java and Scala play test, since I thought they were supposed to be similar. But it looks like the approaches are quite different. The Java DB test uses the ebean ORM while Scala does not. The JSON code also looks a bit different, with the Java version surfacing more Jackson internals. I don't know if these are necessarily "wrong", but perhaps it's not as apples-to-apples a comparison as it could be.
Agreed, and I would certainly like to see the Play tests (both Scala and Java) improve versus what we have measured so far. I would not rule out a configuration glitch in our deployment either. But on that front, I'm really hoping the Play experts can lend a hand. I think we've received a couple tips about the database connection pool size.
If the contributions we've seen so far are any indicator, we're going to need to get more clever with how to show and hide rows in our results tables! :) But I'd like to see a few more rows added to cover the various permutations of the ORM and JSON options, as you point out.
I'm certain they didn't test the versions that got merged in the last two days (for example, the scala version has a working db test now). So we will probably see some improvements in the next round.
The scala version uses val, which is like final in Java. final enables the JVM to cache this object and run optimizations. So I suspect the object creation only happens once in the scala case, whereas in the java case it happens on every request. I think this is what's happening here, which results in much better performance for the scala version.
But maybe I am wrong and the difference stems only from implementation differences in the controller etc. I'm a php guy and don't know the jvm or scala very well. ;o)
As one would expect, as more data access is added, the scale of differences between web frameworks goes down. It would be interesting if you added 5 and 10 db query test runs as that starts to get closer to what a functional website would be doing.
We do in fact have 5 and 10-query runs available in the data. Switch the views to "Data table" or "All samples (line chart)" and you should see the data for 1, 5, 10, 15, and 20 queries per request.
Go 1.1 is very high on our list, we'd love to get that into the tests as soon as we can. Of course, anyone in the Go community is welcome to issue a pull request that updated the version to 1.1.
Where's the source for the implementations? Of interest to me are the NodeJS instances. Was this single-process, or multi-process via cluster (which is recommended for higher performance)? The only source I saw was for the testing framework.
Seems to be using cluster. I'm not surprised that Go and servlets were faster... just surprised to see NodeJS in the middle of the pack position. Though performance isn't the only reason I really like NodeJS
We accepted a pull request for an Erlang framework yesterday, so they will be in the next round of benchmarks. If you want to ensure your favorite is listed, you can issue a pull request yourself on our github page.
One of the variations being tested here that I'm not sure is legitimate: for PHP specifically, Apache is likely spinning up many PHP threads, and I'm not sure if the same behavior is happening on the ruby side. Thus, if you are comparing the performance of 10 PHP threads to 1 ruby or python thread, your benchmark is not going to be terribly accurate.
We're looking into moving to unicorn as the community has suggested. If anyone is interested in setting up Puma or Goliath, we'd be interested in testing those out as well.
For something like the query tests where you're IO-bound, Puma is going to annihilate the competition. For CPU-bound tests like JSON generation, unicorn with multiple workers is going to perform better.
It's worth noting that Rails' default JSON solution is the compatible-everywhere pure-Ruby JSON generator. Using a drop-in replacement like the oj gem will drastically improve throughput there.
I didn't get to the pull request for this round, but tweaking the GC parameters for the Ruby install should dramatically improve the general performance of the non-JRuby Ruby tests, as well. I'll see if I can get a PR in. :)
I guess that's one of the big advantages JVM languages have. Using stuff like Spring or Hibernate (like Grails does), you benefit from thousands of person-hours spent on tuning those libraries and you still get native Java performance.
Obviously there's a price you have to pay for all the things Grails built on top of Spring/Hibernate (in the benchmark it looks like Grails' performance = Spring / 2), but in general, it's still faster than just writing everything in Groovy (Ruby, Python, ...).
Yup. I manage a Grails app where I wanted to speed up performance on a popular part of the app, so I replaced the Grails controller with a servlet and saw a bump in performance.
Is it against the rules to use different workers for Django or Flask? In other words, there's a big difference when I run things with Gunicorn+Gevent+psycopg (Postgres async driver) than when I run them with Gunicorn alone.
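For reference, the difference I'm describing is just the worker class passed to Gunicorn; worker counts are whatever fits the box:

    # sync workers, one request per process at a time:
    gunicorn -w 8 app:app

    # gevent workers, many greenlet-handled connections per process:
    gunicorn -k gevent -w 8 --worker-connections 1000 app:app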
I wish you would have tested minimalist web frameworks in Python: CherryPy and Bottle. Flask is not really a micro-framework in the same way that CherryPy and Bottle are micro-frameworks because of its numerous dependencies.
I know this will sound like a refrain, but would you be interested in preparing a test for Node that uses a better ORM? Or if you can demonstrate that another ORM would be clearly superior, perhaps it would make more sense to simply modify the existing Node test code, replacing Sequelize?
Maybe, maybe not; that depends on your priorities. If performance is a concern (or a major concern), then yes, you should reconsider (maybe look at Go?). If it is not a major concern, maybe not. (I personally am not a Ruby fan due to performance and security issues, but it works well for many people.)
I like to see that Tapestry is rising high in this latest round. I have a lot of production sites running Tapestry and I'm always impressed with how fast it runs.
We got a pull request for an Erlang framework (my brain is not letting me pull up the name) yesterday that we accepted, but sadly it was too late for this round of benchmarks. We will include it in the next round.
Disappointing answer incoming: we haven't found the time. If you know Go sufficiently to write those tests (they should be fairly easy to write for someone who works with Go regularly), we'd be very happy to accept a pull request.
Also, just to reiterate something I've said elsewhere. We've done some spot testing with Go 1.1 and its performance is super. We really look forward to showing Round 3 numbers next week!
Don't thank us, we didn't write the code for it - one of you (the community) wrote it and submitted a pull request. We just ensured that the test would work in our benchmarks, then ran it! We are happy to do it, and super appreciative of the support from you guys!
While it is true that our source code is closed at the moment, you are still free to check out the benchmark code from github and run it yourself to verify our results.
It wouldn't really be a fair test if we put all these frameworks up for testing, suggested Gemini was better than sliced bread with auto-applying jam, and didn't back it up in a testable way ;-)