Broadly speaking, people on HN have no clue how to set up a performant httpd/app server and are impressed by abysmal performance/cost metrics like this or the MangaDex post. Everything these days is obscured through multiple layers of SaaS offerings and unnecessary bloat like Kubernetes.
~10k rps (it was concurrent connections but close enough) was state of the art in 1999. Now 22 years later ~50 rps is somehow impressive.
In my opinion it isn’t so much a lost art or a lamentable accretion of useless abstractions as an increase in the scope of what web apps do these days. Most of us aren’t working on static sites or simple CMS publishing; those are a solved problem. Instead we’re building mobile banks, diagnostic systems, software tooling, 3D games, and shopping malls. The complexity is inherent in the maturity of the web and its many uses, as well as its global scale. Hugs of death are rare these days thanks to better architecture and infrastructure, even though the scale of users has grown 100X.
Yes, there are wildly unnecessary abstractions that get used for small sites/apps, but I would contend they are artifacts of someone who is trying to learn something new, and/or get promoted. I have no problem with the former.
> ~10k rps (it was concurrent connections but close enough) was state of the art in 1999. Now 22 years later ~50 rps is somehow impressive.
I honestly don't understand how that can be true. I'm not suggesting you're lying, of course, but when you put it this way it's almost like people are actively trying to slow their programs down. I have a few ideas on why that might be the case (the switch to slow interpreted languages, the switch to bigger web frameworks, bigger payloads) but even that wouldn't explain all of it. Do you have any idea why things are this way?
This is just basic use-the-right-tool-for-the-job 101. You've got what is basically a static website. You want to serve static files. To do that, you use a fast language and/or a server written in one.
It's something anyone who has done this for any length of time knows, and that HN is impressed by this is confusing to some of us. If you were trying to get as little as possible out of your server, you'd serve cacheable content through this framework in this language.
I thought you meant from 10k to 50 rps doing the same work, not that most of the work could be avoided in the first place.
> Is this stuff not being learned?
I don't know if it is. I recently finished my studies, and most people had no curiosity at all. As in, they learned a framework early, used it everywhere, and got a job using it. I do remember reading a few times in tutorials that you should put Nginx as a reverse proxy in front of your Django/Flask/Express server to serve static files, so I think most people know/do that, but I'm not sure.
On the other hand, having the wisdom of knowing what can be static in the first place? I don't think that's something that gets taught. In fact, this kind of wisdom can be hard to find outside of reading lots of sources frequently in the hope of finding little nuggets like that. I don't think I was ever explicitly taught "You should first try to find out whether the work you're trying to do is necessary in the first place". In a way it's encoded in YAGNI, but YAGNI isn't universal, and is usually understood at the code level rather than the tooling level.
> On the other hand, having the wisdom of knowing what can be static in the first place? I don't think that's something that gets taught.
All traffic is static by definition. You are not modifying bytes while they are in transit to the user. And you don't have to serve different bytes each microsecond just because users want to be "up-to-date". The network latency is usually around 40ms or so. If your website serves thousands of requests per second, you should be able to cache each response for 10ms, and no one will ever notice (today this is called "micro-caching").
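To make the micro-caching idea concrete, here is a rough sketch in Python. In practice this usually lives in nginx itself (proxy_cache with a short proxy_cache_valid); the decorator, the 10ms TTL and the homepage() stub below are purely illustrative:

    import functools
    import time

    def micro_cache(ttl_seconds=0.01):
        # Cache a zero-argument response builder for a tiny TTL. At thousands of
        # requests per second, most callers get the cached copy and nobody
        # notices 10ms of staleness. (Thread-safety ignored for brevity.)
        def decorator(build_response):
            state = {"value": None, "expires": 0.0}

            @functools.wraps(build_response)
            def wrapper():
                now = time.monotonic()
                if state["value"] is None or now >= state["expires"]:
                    state["value"] = build_response()   # the expensive part
                    state["expires"] = now + ttl_seconds
                return state["value"]
            return wrapper
        return decorator

    @micro_cache(ttl_seconds=0.01)
    def homepage():
        # stand-in for hitting the database and rendering a template
        return "<html>...</html>"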
Of course, most webpages can't be cached as a whole: they have multiple "dynamic" parts and have to be put together before being served to the user. But you can cache each of those parts! This is even simpler if you do client-side rendering (which is why MangaDex's abysmal performance is pathetic).
Then there are ETags: arbitrary strings that can be used as keys for HTTP caching. By encoding information about each "part" into a substring of the ETag you can do server-side rendering and still cache 100% of your site within a static web server such as Nginx. The backend can be written in an absolute hogwash of a language such as JS or Python, but the site will run fast because most requests will hit Nginx instead of the slow backend. ETags are very powerful; there is practically no webpage that can't be handled by a well-made ETag.
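A minimal sketch of the backend half of that in plain Python (framework-free so it stays self-contained; the version strings are made up, and the Nginx caching/revalidation side is its own topic). In Django the same idea is usually spelled with the condition(etag_func=...) decorator:

    import hashlib

    def handle_request(if_none_match, load_part_versions, render):
        # The ETag encodes the version of every dynamic "part", so computing it
        # is cheap even when rendering is not.
        versions = "-".join(load_part_versions())
        etag = '"' + hashlib.sha1(versions.encode()).hexdigest() + '"'
        if if_none_match == etag:
            return 304, {"ETag": etag}, b""       # nothing changed: skip rendering entirely
        return 200, {"ETag": etag}, render()      # cacheable downstream, keyed on the ETag

    # Part versions could be counters bumped whenever the underlying data changes.
    status, headers, body = handle_request(
        if_none_match=None,
        load_part_versions=lambda: ["user:42:v7", "articles:v123", "theme:v2"],
        render=lambda: b"<html>...</html>",
    )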
Even pages that need to be tailored to the user's IP can be cached. It is tricky, but possible with Nginx alone.
Instead of "static" you are better off thinking in terms of "content-addressable".
> On the other hand, having the wisdom of knowing what can be static in the first place? I don't think that's something that gets taught.
I think the trick is realising that reaching for a "programming language" is just one of the tools we have to solve a certain problem, and probably the last one we should reach for! For a stable system, you want fewer moving parts. A good programmer fights for that.
Can you solve a problem just by storing a JSON file somewhere? Can you solve a problem without a backend? Can you solve a frontend problem with just CSS or just HTML? Can you solve a problem without JavaScript? Can you solve a data storage problem with just a database instead of database+Redis? Do you really need a full-fledged web framework where a micro-framework would suffice? Do you need microservices, Kubernetes, containers and whatnot for your site before it gets its first visitor?
I find that a lot of people go for the "more powerful" tool just to cover their asses. They don't want surprises in the future, so they just go for something that will cover all bases. But what you actually want is the thing with the least power [1].
Another issue is that intelligent people have an anti-superpower called "rationalisation". They can justify every single decision they make, as misguided as it is. So it doesn't matter if a website could be done with a single HTML file: it is always possible to find a reasonable explanation for why it needed k8s, micro-services and four languages.
Software has become slower and less efficient because we have faster hardware to run it on and nobody's asking for the software to get more efficient. Back in the day you had inefficient, expensive hardware with limited software. There was demand to make software as performant as possible so it didn't cost you ten grand to serve a popular website. But now you can load up on hardware for pennies on what it used to cost. 16 cores? Sure! 32 gigs of ram? No sweat. Software can now be bloated and slow and it won't break the bank. On top of that, we now have free CDNs, free TLS certs, even free cloud hosting, and domains are a couple bucks. You can serve traffic for free today that would have cost $50K 15 years ago.
Sometimes there's also a misunderstanding of metrics that leads devs to not think about performance tuning. Like, "4.2M requests a day" clearly doesn't follow from this 50 rps benchmark. Traffic is not linear, it's bursty. You will never serve 50 rps of human traffic steadily for 24 hours. If you're serving 4.2M requests per day, 90% of it will be in a 12-hour window, peaking whenever people have lunch or get off work, with a short steep climb leading to a longer tail. So to not crash your site at peak visitorship, you realistically need to handle 300+ rps in order to achieve 4.2M requests per day. (But also, that's requests per second... if under load it takes 5 seconds to load your site, you can still serve a larger amount of traffic, it's just slower... so a different benchmark is also "how many requests before the server literally falls over".)
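Back-of-the-envelope version of that (the 90%-in-12-hours split and the 3.5x peak-to-average factor are just assumptions, to show why the daily number is misleading):

    requests_per_day = 4_200_000

    flat_average = requests_per_day / 86_400                 # ~49 rps if traffic were perfectly flat
    busy_window_avg = 0.9 * requests_per_day / (12 * 3600)   # ~88 rps averaged over the busy 12h window
    peak = 3.5 * busy_window_avg                              # ~306 rps at the lunch / after-work spike

    print(f"flat: {flat_average:.0f} rps, busy window: {busy_window_avg:.0f} rps, peak: {peak:.0f} rps")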
The best/fastest web servers today manage >600k rps on a single server [0].
Django is on that list, near the bottom, at 15k. Clearly his request/response is more complicated than a simple fortune.
There are a few factors I've experienced here. Interpreted languages have encouraged development patterns that are slow. Ease of allocating memory has tended to promote its overuse. Coding emphasis has been heavily weighted toward developer productivity and correctness of code over lean and fast code.
I find that poor web server configurations are pretty common. Smaller shops tend to use off the shelf frameworks rather than roll their own systems. Various framework "production" setups often don't include any caching at all. Static files are compressed and then sent on every request instead of preserving a pool of pre-compressed common pages/assets/responses. It's like the framework creators just assume there's going to be a CDN in front of the system, and so they don't even try to make the system fast.
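The fix is cheap, too. A sketch of the pre-compression idea in Python (the static/ directory is hypothetical; with the .gz files sitting next to the originals, a front end like Nginx with gzip_static on can send them as-is instead of recompressing the same bytes on every request):

    import gzip
    import pathlib

    ASSET_DIR = pathlib.Path("static")   # hypothetical build-output directory

    for path in ASSET_DIR.rglob("*"):
        if not path.is_file() or path.suffix not in {".html", ".css", ".js", ".svg", ".json"}:
            continue                     # skip directories and already-compressed formats
        gz_path = path.with_name(path.name + ".gz")
        if gz_path.exists() and gz_path.stat().st_mtime >= path.stat().st_mtime:
            continue                     # already up to date
        # mtime=0 keeps the .gz byte-identical across builds, which helps caching
        gz_path.write_bytes(gzip.compress(path.read_bytes(), compresslevel=9, mtime=0))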
The latest crop of web devs has very little experience setting up production systems correctly. Companies seem more interested in AWS skills than in profiling. Then you have architectural pits like microservices. Since there is so little emphasis on individual system performance, it seems that it has become, or is becoming, a lost skill.
Then there is so much money being thrown at successful SaaS that it just doesn't matter that their infrastructure costs are potentially 50x what they actually need to be. It seems that the only people squeezing performance out of software are the poor blokes scraping by on shoestring budgets with no VC money in sight.
It's death by a thousand paper cuts. Lots of things that aren't really that slow in isolation, but in aggregate (or under pressure) they slow down the system and become impossible to measure.
Let's take web development as the example. Since you mentioned payloads: today they're bigger, often come with redundant fields, and sometimes they're not even paginated! This slows down database I/O, requires more cache space, slows down serialisation, slows down compression, and requires more memory and bandwidth...
And then you also have the number of requests per page. Ten years ago you'd make one request that would serve you all the data in one go, but today each page calls a bunch of endpoints. Each endpoint has to potentially authenticate/authorise, go to the cache, go to the database, and each payload is probably wasteful too, as in the previous paragraph.
About authentication and authorisation: one specific product I worked on had to perform about 20 database queries for each request just to check the user's permissions. We changed the authentication to use a JWT-like token and moved the authorisation into each query (adding "where creator_id = ?" to objects). We no longer needed 20 database queries before the real work.
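The shape of that change, sketched with a hypothetical documents table and the standard library's sqlite3 (the real system was a bigger database and a signed token, but the idea is the same: the permission check collapses into a WHERE clause instead of twenty extra round trips):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, creator_id INTEGER, body TEXT);
        INSERT INTO documents VALUES (1, 42, 'mine'), (2, 7, 'someone else''s');
    """)

    def list_documents(user_id):
        # user_id comes from the already-verified token; authorisation and data
        # access happen in one query instead of ~20 permission lookups first.
        return conn.execute(
            "SELECT id, body FROM documents WHERE creator_id = ?", (user_id,)
        ).fetchall()

    print(list_documents(user_id=42))   # -> [(1, 'mine')]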
About 15 years ago I would have done it "the optimised way" simply because it was much easier. I would have used SQL views for complex queries. With ORMs it gets a bit harder, and it takes time to convince the team that SQL views are not just a stupid relic of the past.
Libraries are often an issue that goes unnoticed too. I mentioned serialisation above: this was a bottleneck in a Rails app I worked on. Some responses were taking 600ms or more to serialise. We changed to fast_jsonapi and got sub-20ms times for the same payload that had been taking 600ms. This app already had responses tailored to each request, but imagine if we had been dumping entire records into the payload...
Another common one is also related to SQL: when I was a beginner dev, our on-premises product was very slow at one customer: some things in the interface were taking upwards of 30 seconds. That wasn't happening in tests or at smaller customers. A veteran sat down by my side and explained query plans, and we brought that number down to milliseconds after improving indexing and removing useless joins.
A few weeks ago an intern tried to put a JavaScript .sort() inside a .filter() and I caught it. Accidentally quadratic (actually it was more like O(n^4)). He tried to defend himself with a "benchmark" to show it wasn't a problem. A co-worker then ran anonymised production data through it and it choked immediately. Now imagine this happening in hundreds of libraries maintained by volunteers on GitHub: https://accidentallyquadratic.tumblr.com
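The same shape of mistake, sketched in Python rather than JavaScript (synthetic data, deliberately dumb example): the sort inside the filter predicate looks harmless on a ten-item "benchmark" but is roughly O(n^2 log n) on production-sized input.

    import random
    import time

    values = [random.random() for _ in range(5_000)]

    def top_decile_slow(xs):
        # Re-sorts the whole list once per element: fine for 10 items, chokes at 5,000.
        return [x for x in xs if x >= sorted(xs)[int(0.9 * len(xs))]]

    def top_decile_fast(xs):
        # Sort once, reuse the threshold: O(n log n).
        threshold = sorted(xs)[int(0.9 * len(xs))]
        return [x for x in xs if x >= threshold]

    for fn in (top_decile_fast, top_decile_slow):
        start = time.perf_counter()
        fn(values)
        print(fn.__name__, f"{time.perf_counter() - start:.3f}s")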
All those things are very simple, and you certainly know all of them. They're the bread and butter of our profession, but honestly somewhere along the way it became difficult to measure and change those things. Why that happened is left as an exercise.
> All those things are very simple, and you certainly know all of them.
I wonder about that. Most of the people I graduated with didn't know about algorithmic complexity. Some had never touched a relational database, and most probably didn't know about views. I doubt most of them knew what serialization meant.
I wonder if it’s a matter of background. I never really had tutorials when starting out. I never had good documentation, even.
(Sorry in advance for the rant)
I also remember piecing together my first programs from other people’s code. Whenever I needed an internet forum I’d build one. Actually, all my internet friends, even the ones who didn’t go into programming, were building web forums and blogs from scratch!
Today people consider that a heresy. “How dare you not use Wordpress”.
My generation just didn’t care; we built everything from scratch because it was a badge of honor to have something made by us. We didn’t care about money, but we ended up with abilities that pay a lot of cash. People who started programming after the 2000s just didn’t do it...
I think it’s probably obvious that I sorta resent the folks (both the younger ones, and the older ones who arrived late on the scene) constantly telling me I shouldn’t bother “re-inventing the wheel”. Well, guess what: programming is my passion; fuck the people telling me to use Unity to make my game, or Wordpress for my blog.
I would guess this is close to a default Apache + mod_wsgi setup, and that is one of the easiest ways of hosting Python web apps, so basically achievable by anyone on HN.
I assume (based on 180 req/s for a static page) that he is using mpm_prefork, where each Apache child handles a single connection. If he switched to mpm_event, which uses an event loop like nginx, ~10k rps should be easily achievable, but I don't think WSGI would work with that.
Yeah it's all default apache + mod_wsgi. This is also my first Django setup and I made it over-complicated as a learning exercise.
mpm_event is something I have not heard of before, thanks for bringing that to my attention.
I generally agree with your comment. But the point here is not that this is cutting-edge performance or anything, but rather that people on HN know how to set up this type of website.
If you have any tips on how to get ~10k rps (or even a more reasonable improvement) on a £4 a month server, I at least would be very interested in hearing about them.
An awful lot of professional programmers work in such heavyweight contexts that they don't have a good idea of how fast modern hardware can be.
I was talking with an architect at a bank whose team was having trouble getting under a 2-second maximum for page views. They blamed it on having to make TCP requests to other services, and said something like "at a couple hundred milliseconds per request, it adds up quickly!" My head nearly exploded at that. I spun up some quick tests in AWS to show exactly how many requests one could make in 2000 ms. I don't have the numbers handy, but the number is very large.
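That kind of quick test is easy to reproduce. A rough sketch (the URL is an assumption, whatever HTTP service you have nearby; absolute numbers vary a lot, but even strictly sequential requests to a close-by service come back in single-digit milliseconds, not hundreds):

    import time
    import urllib.request

    URL = "http://127.0.0.1:8080/"   # assumption: any HTTP service running nearby

    count = 0
    deadline = time.perf_counter() + 2.0
    while time.perf_counter() < deadline:
        with urllib.request.urlopen(URL) as resp:
            resp.read()
        count += 1

    print(f"{count} sequential requests in 2000 ms")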
This junky slice of a server handling full page requests in 20 ms is a fine example to counter the kind of thinking that's endemic in enterprise spaces.
I had a discussion with a coworker about slow memcpys that went roughly the same way, except... you know, DDR4 RAM has a bandwidth of roughly 40 GB/s.
Also, that "awful" 1MB memcpy is likely all in L3 cache these days. But even if it weren't in cache, we're talking about an operation that takes 50 microseconds (1MB read + 1MB written == 25 microseconds + 25 microseconds).
Given that modern CPUs have 16+ MB of L3 cache (and more), and some mainstream desktop CPUs have 1MB of L2 cache... it's very possible that this memcpy is far faster in practice than you can imagine.
1MB is big; it's a million bytes. But CPUs operate on the scale of billions, so 1MB is actually rather small by modern standards. It's surprisingly difficult to get intuition right these days...
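You can sanity-check the arithmetic from Python, which hands the copy to a single C memcpy under the hood (numbers vary by machine, and a hot-in-cache 1MB buffer will beat the raw DDR4 figure):

    import time

    ONE_MB = 1_000_000
    src = bytearray(ONE_MB)

    runs = 10_000
    start = time.perf_counter()
    for _ in range(runs):
        dst = bytes(src)                 # each iteration is one ~1MB copy
    elapsed = time.perf_counter() - start

    per_copy_us = elapsed / runs * 1e6
    bandwidth = 2 * ONE_MB * runs / elapsed / 1e9   # count both the read and the write
    print(f"{per_copy_us:.1f} microseconds per 1MB copy (~{bandwidth:.1f} GB/s moved)")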
My point isn't that it's impressive in some ultra-tuned performance sense. It's that doing pretty mundane things on pretty basic servers is still very fast compared with a) the past, or b) what a lot of developers are used to professionally. That's why it is interesting to the crowd here.