A national radio campaign is likely to bring hundreds or thousands of visitors spread over hours, not hundreds of thousands of visitors spread over seconds, so I probably would not take any particular action to harden a site in anticipation of it. It is a poor use of engineering resources and adds technical risk with no corresponding benefit. (n.b. Pasting code you got from a blog post, particularly code marked as kinda broken, is not a risk-free endeavor! I love nginx, don't get me wrong, but paste in snippets from two different blog posts and watch the sparks fly if you don't understand how nginx handles, e.g., location priority.)
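To make that concrete, here's the kind of collision I mean. Both snippets are paraphrased from the sort of advice blog posts give, and the backend address is made up:

    # Post A: serve images straight from disk with a long expiry.
    location /images/ {
        root /var/www/static;
        expires 30d;
    }

    # Post B: proxy all image requests through the app for resizing.
    location ~* \.(png|jpe?g|gif)$ {
        proxy_pass http://127.0.0.1:8080;
    }

Each snippet works fine on its own. Paste in both and a request for /images/logo.png matches both, and nginx picks the regex block, because regex locations beat non-exact prefix matches. Your "static" images quietly start hitting the backend. You'd need location ^~ /images/ to make the prefix match final, which is exactly the sort of thing you don't know if you're pasting blind.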
> I love nginx, don't get me wrong, but paste in snippets from two different blog posts and watch the sparks fly
This week at work I have witnessed much the same with Apache2.
The problem: They have a new Wordpress install, want to redirect an existing site path to Wordpress, and want to ensure that the performance rivals that of the existing site.
Before I could get them to just change the Netscaler rules and add a simple Wordpress plugin that generates static files from dynamic pages, they had cargo-culted their way to a glorious solution: leave all of the routing in place but reconfigure Apache using mod_proxy, mod_rewrite, and ProxyHTMLURLMap, so that requests for static items get munged through the static web server, sent off to the Wordpress server, rewritten on the way back, and then served up in place of the existing file.
They literally have an Apache conf that is the sum of 5 different blog posts. And they don't understand how it works, or the other ways in which it is broken (they can't log in to wp-admin anymore, but hey... it's OK, they can copy the Wordpress database from staging).
This has also taken them 2 weeks to achieve.
When asked why they didn't just change the routing rules, add Varnish and configure Wordpress to know what domain it's on (so they don't have to rewrite anything), they simply shrugged and explained that it's working now.
PS: If anyone wants a technical co-founder in London, or an Architect/CTO level person who can hire good people and build good teams. Please speak to me. I'm not sure I can take the madness of working for other people much longer.
I see WordPress blogs down all the time. You don't need a radio campaign to kill one; sometimes a tweet is enough.
The uncached WordPress instance in the article could handle no more than 100k visitors spread over 3 hours (and that's assuming no spikes).
Of course adding such config last minute is a bad idea, but nobody should be setting up uncached WordPress for more than a few friends in the first place.
Not to mention that he's compromising on accuracy. Yeah, 1 second is tiny, but why give it up if you don't have to? Especially for those of you reading who want to try this on a potentially more dynamic site.
Edit: downvoted with no explanation, cute. And readers here think they're above Reddit standards.
There's no reason to compromise the integrity of a system to any extent before you have to. As others have demonstrated, that need was clearly not met here.
> downvoted with no explanation, cute. And readers here think they're above Reddit standards.
Downvoting with no explanation is arguably acceptable here. There's disagreement about it, but people do it all the time, including me when I don't have time to type in a comment or feel like doing so would add more noise than signal (like, probably, right now). Complaining about being downvoted, though, definitely is not.
Once upon a time, downvotes tended to stop at 0 or -1 for things that were perhaps just mistaken or misguided rather than outright mean/trolls/antisocial.
People should care a lot less about downvotes, and votes in general. I bump up comments that were clearly downvoted to express disagreement. That's the fix to this problem, not trying to convince 1000 casual users to change the way they use the downvote button.
That's how I generally express my satisfaction with businesses, but sometimes just "never eating here again" is too generous to the restaurant that served you food with hair...
How can explaining why you downvoted, when nobody else has, add more noise than signal, unless your opinion is not worth sharing in the first place? And if your opinion is not worth sharing, why is it worth a downvote?
Let's be honest - we downvote here because we disagree, just like on every other site.
That's a good enough question that you've provoked me into figuring out what I think about this.
Adding words takes up space. A downvote doesn't. Therefore a downvote is more lightweight, and there are times when it's more appropriate than a comment. It's an in-between gesture, partway between silence and verbalization. I like that subtlety.

It's an especially good way to express disagreement with something one feels is not only wrong but also somehow debasing. A comment doesn't have to be rude to be debasing; it can just be mediocre or somehow crass, nor is one always able to say exactly how. Those are cases where fostering more discussion is unlikely to do good and it's better to hold one's tongue. But downvoting still gives a signal to the original commenter, as well as to other readers, that someone's not ok with what was said. I think that's meaningful, and also that it's fairly rare to see people abusing that signal.

It does happen, but more often, when someone thinks they don't deserve it, it's kind of obvious to others that really they do. So when it happens, the best way to respond is with a touch of self-honesty. Maybe you'll come out roses, but just maybe you'll notice something worth correcting.
One other thing about downvoting. I think what makes it controversial, and also interesting, is that it's an emotional expression. (Upvotes are too, but downvotes are stronger.) That explains why people get upset about it, have strong opinions and so on. But it also explains why it's a good thing to have around. There are very few emotional channels available to us that don't require proximity. It also explains why it's wrong to say that downvotes are inferior and should be replaced by comments. Imagine if people were forced to put every emotional gesture into words. It would be impossibly ponderous. And we'd end up talking about nothing else.
The basic problem is that the OP motivated his decision to do what he did, while you just said he was wrong without explaining why that second matters so much to him.
OP found a solution that worked for him, and then shared that solution to give other people with a similar problem the option to use the same solution. That is a good thing even if it doesn't apply everywhere. People are not sheep that copy things without reflecting on how they apply to them before implementing.
(Well - perhaps some are, but there is plenty of even more stupid stuff out there for them to copy, so this solution to OP's problem will hardly make that problem worse)
I didn't downvote you, because I don't have the downvote arrow, but I probably would have if I could.
> Not to mention that he's compromising on accuracy. Yeah, 1 second is tiny, but why give it up if you don't have to?
While NGINX may use very little CPU and have minimal hard drive access, PHP (or Java/Python/Ruby/etc.) will hit those resources hard.
Look, I'm not one of those environmentalist wackos who piss their pants when somebody forgets to turn off a light ... but ... if you're keeping your BLOG server a hundred times busier than it has to be just so that your users don't experience a staleness of 1 second ... you're an asshole. A military or medical application, fine. But a blog? Sorry, no.
It's not about whether he has a point, it's about calling others assholes. In face-to-face conversation there are lots of ways to do that without being strident; in writing it's too heavy-handed. Scrupulously avoiding such aggressiveness is one of the few easy things we can do to prop up the discourse.
> This is fine, up until the point where you get on HN and Reddit at the same time
Incidentally, you don't actually need much to handle that. Our web server is a wimpy 256 MB VPS and we've had (Wordpress) blog entries hit the front page of HN and Reddit simultaneously and weather the storm without missing a beat. An appropriately set-up Apache + WP Super Cache does the trick just fine. (Hint: the default Apache configuration isn't "appropriate".) You're not going to hit anywhere even close to 2k requests per second on the front page of those two.
It depends on the content. If it's something that will be visited just by HN/Reddit users: Sure, you're not going to get 2k/sec directly from them. But if it's something that's going to get shared/tweeted/retweeted/blogged/reblogged/etc., then you will be looking at huge traffic numbers. It just depends on how viral your content is and how far it spreads. It's the long tail that matters.
You've really got to hit the jackpot to get up to 2,000 requests per second (assuming you don't already have a high-traffic site). At about 20 requests per pageview, that's 100 pageviews per second, or 360,000 pageviews per hour - comfortably more than a quarter million.
I've had things go moderately viral (thousands of tweets / retweets) and you don't get anywhere near that. Pre-tuning for hitting the jackpot is in most cases going to be premature optimization.
Spikes of 2000/second are not unrealistic, and part of the problem is that once the spike hits, inadequately designed applications start to wedge on lock contentions and just plain inadequate CPU, the VM layer starts swapping, people trying to get to your site start hitting reload (humans do not have an exponential back-off function), and the whole thing comes crashing down until traffic drops off to more sustainable levels.
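This reload-storm failure mode is exactly what a cache in front should absorb. In nginx terms it's roughly the following sketch (the backend address and the "microcache" zone are stand-ins; the zone would be defined elsewhere with proxy_cache_path):

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_cache microcache;
        proxy_cache_valid 200 1s;
        # Collapse concurrent cache misses into a single backend request,
        # and keep serving the stale copy while it's being refreshed, so
        # a reload storm never stacks up on the backend.
        proxy_cache_lock on;
        proxy_cache_use_stale updating error timeout;
    }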
I'm on a 512 MB Linode, and my Wordpress install would freeze when I got on the front page of HN. I ended up turning Apache's KeepAlive off and installing W3 Total Cache; I haven't had any problems since then.
Nice technique. Every distinct dynamic page you expect to get hammered still has to be regenerated within that one-second window, though. With a longer window, some wp-admin or logged-in-user detection, and a third-party comment service, I could see this becoming a standard nginx Wordpress configuration.
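The detection part might look something like this inside the server block (a rough sketch: the cookie name is WordPress's standard logged-in cookie prefix; the backend address, zone name, and 10-second window are made up):

    # Skip the cache for wp-admin and logged-in users.
    set $skip_cache 0;
    if ($request_uri ~* "^/wp-admin") {
        set $skip_cache 1;
    }
    if ($http_cookie ~* "wordpress_logged_in") {
        set $skip_cache 1;
    }

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_cache microcache;
        proxy_cache_valid 200 10s;        # the "longer window"
        proxy_cache_bypass $skip_cache;   # don't answer these from cache
        proxy_no_cache $skip_cache;       # ...and don't store them either
    }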
What comment systems are you thinking about? With Disqus the rest of the page shouldn't really matter at all, whether it's dynamic, cached, static etc…
Depending on one's error tolerance, maybe dynamic configuration would help (check if you're hit heavily and/or check if you're on HN/slashdot/reddit, then modify the config accordingly).
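A crude sketch of the referrer-triggered variant, with made-up regexes and names (the map block goes in the http context). The obvious caveat is that only the first click from an aggregator carries its referrer, but that first wave is also exactly the traffic that matters:

    map $http_referer $skip_cache {
        default                   1;   # normal traffic: straight to the backend
        ~news\.ycombinator\.com   0;   # aggregator referrals: use the microcache
        ~reddit\.com              0;
        ~slashdot\.org            0;
    }

    server {
        location / {
            proxy_pass http://127.0.0.1:8080;
            proxy_cache microcache;
            proxy_cache_valid 200 1s;
            proxy_cache_bypass $skip_cache;
            proxy_no_cache $skip_cache;
        }
    }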
I like this solution and am definitely tempted to give it a go.
Anyone got any thoughts on the best way to do this on a page with personalisation? (and this is really simple personalisation - one section of the page changes depending on whether you're logged in or not).
My solution would probably be to have the personalised section load as an async request but then you'd need to make sure that the async request can handle the same load as the microcached content.
You can put any nginx variable into the proxy_cache_key setting, including cookies and query strings. I've used this to cache localized versions of each page on a site based off of a "language" cookie. Async is a good approach for what you need, but it's good to know what nginx's caching mechanism can do.
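For the language case it's something like this, assuming the cookie is literally named "language" (the backend address and zone name are placeholders):

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_cache microcache;
        # One cached copy of each page per locale.
        proxy_cache_key "$scheme$host$request_uri lang=$cookie_language";
        proxy_cache_valid 200 1s;
    }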
That's how I've done it in the past. Depending on how many personalized requests you are doing, compared to the resource utilization savings you'll get from caching the rest of your content, you'll probably come out ahead.
Microcaching at the page level is of course a great idea for a dynamic app, but only works if the content being served is identical for all users - in which case why not just use static pages. Oh right, because we only know how to use Wordpress. Ditching wordpress for a static generator should be the preferred route, if possible. (use disqus or similar for comments).
The vast universe of truly dynamic apps that we write in Rails or Python or whatever usually have page elements that are specific to the user's session - "Welcome John Smith" and all that (edit: oh i see he mentioned that at the bottom). So page-level caching isn't feasible there, unless like in the case of disqus you're using javascript to inject personalized content from another server. But for a really interactive web application where coarse grained solutions like this aren't feasible, I'm still a proponent of page-component level caching, something you normally do in your app layer, not the web server layer.
I really like the idea of setting a cookie that bypasses caching for a few seconds - I've heard the same technique used by Facebook, who set a cookie that ties you to the MySQL master server rather than the slave after you perform a write action so that you'll see your update without waiting for replication lag.
It relies on using the Max-Age cookie attribute, though, and I was under the impression that IE doesn't implement that correctly. Anyone know what the status of IE and Max-Age cookies is?
The problem with the Expires attribute is that you have to set it to a specific date, but I don't think you can be sure what timezone / clock setting the user's browser has - which means that it's pretty much impossible to reliably set a cookie that expires in a couple of seconds' time.
Hopefully I'm wrong - I'd love to know if there's a workaround for that issue.
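For what it's worth, the nginx side of the trick doesn't depend on how the cookie expires. A sketch, with a made-up cookie name (the backend would send something like Set-Cookie: just_posted=1; Max-Age=5 after a write):

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_cache microcache;
        proxy_cache_valid 200 1s;
        # While the post-write cookie is present, go straight to the backend.
        proxy_cache_bypass $cookie_just_posted;
        proxy_no_cache $cookie_just_posted;
    }

As far as I know, the usual fallback for old IE is to send both Max-Age and an Expires date: browsers that understand Max-Age give it precedence, and Expires dates are expressed in GMT and compared against the client's clock, so the residual risk is clock skew rather than timezone settings.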
The big issue here is the application type. This only works well for a very particular type of application. That is, a highly dynamic site that is NOT dependent on user logins.
1) If the site is only moderately dynamic, you can just use plain Nginx and set fastcgi_cache_valid to a few minutes or hours (rough sketch after this list). Much less load on the server. I like to keep things simple; I wouldn't even bother with Apache. Porting rewrites to Nginx is super simple.
2) If the site is customised on a per-user basis, 'microcaching' will break the site and have disastrous consequences. Every user will see the system customised for whichever user primed the cache.
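For case 1, the whole thing is roughly this (paths, socket, zone name, and durations are all illustrative):

    fastcgi_cache_path /var/cache/nginx levels=1:2 keys_zone=site:10m
                       max_size=256m inactive=1h;

    server {
        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass unix:/var/run/php-fpm.sock;

            fastcgi_cache site;
            fastcgi_cache_key "$scheme$host$request_uri";
            fastcgi_cache_valid 200 10m;   # "a few minutes"
        }
    }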
My primary website is user based. That means this 'microcaching' concept won't work at all. It would be catastrophic.
That's where Varnish comes in with ESI. I really wish I didn't have to use Varnish. It's slower than Nginx, it adds another layer of complexity, and in testing, it seems slightly flaky. But what Varnish+ESI allows is caching of parts of my page that aren't user specific. I.e. header, footer, etc.
I thought about incorporating Varnish into my side-project as well. However, Varnish seems to be a [potential] hassle when you account for VCL, ESI, etc. It also adds an extra layer of indirection and complexity, as you've mentioned.
With regards to caching user specific parts of your page, have you considered template caching [1, 2]? This way you can apply different caching policies to different parts of your pages w/o using varnish and ESI.
With that said, why would one choose Varnish + ESI (in other words, a caching reverse proxy) over template caching + memcached/redis/riak? Can someone explain the pros and cons of both approaches?
Yes, we previously used template caching because it was built into Symfony1 (an MVC PHP framework). Symfony2 now leaves caching to the HTTP Standard, plus integrates Akamai's ESI. Symfony specifically recommends Varnish for live projects.
Memcache certainly has its place, but for me, Varnish is a plug-n-play solution that doesn't require code changes. Even with ESI, if you've already set HTTP Cache, you just need to set s-maxage per component. That's it. Then of course there's all the Varnish load balancing features that I haven't used.
If my site didn't use logins, I'd definitely just go for Nginx + fastcgi_cache. People argue that Apache can do the same with the right tuning. But there's something to be said for a 5 minute VPS build by a complete novice that can serve 20,000 requests per second using about 10 lines of config code.
I think this all becomes much more complicated for seriously large websites. That's where memcache/redis et al. join the party, potentially with Varnish.
In my own use and benchmarks I have seen that Varnish is faster than Nginx for full-page caching. Varnish with ESI will be slower, of course. I think ESI is better for Facebook levels of personalization; for light personalization ('Welcome, Bob!'), using cookies/javascript is better since it allows you to use full-page caching again.
I agree, something seems wrong with that test. Varnish performance should not top out with the exact same performance no matter the concurrency. Here's an older article that shows with some very basic changes you can get 27k/s on an old single core machine: http://kristianlyng.wordpress.com/2009/10/19/high-end-varnis...
It's a totally different architecture. It's adding another layer. Why isn't it conceivable that adding this overhead would reduce speed?
Remember, I'm not comparing a Varnish cache to an un-cached site. I'm comparing cache vs cache. Nginx serves static files very quickly. And cached files, in this case, are static.
Also, these are single-server tests. Everything changes when you start talking about very large websites.
Given that I'm currently personally setting up a caching cluster to host www.melbournecup.com (aka "The Race That Stops a Nation"), being able to gracefully handle a huge spike in traffic is something that is very much on my mind! :)
The site itself is developed in Django and so far I'm just planning on putting a bunch of Varnish caches (behind a load balancer) in front of the Django server. I'm using the very nice Django Varnish app (https://github.com/justquick/django-varnish) in the Django instance to automatically purge pages from the cache as they're updated.
I'm deliberately trying to keep the setup as simple as possible, but the goal is to have a fast site and fresh content.
Tips from others who have handled similar traffic loads would be very welcome!
I've been doing essentially the same thing with my wordpress install. Went from a few hundred reqs/second on a cheap linode to over 4k reqs/second. (I think the limit was the benchmarking tool, not nginx.) I've got the nginx config if anyone is interested.
My experience doing something very similar is that the original config would not handle 500 concurrent requests. It's a fair comparison insofar as the uncached app probably couldn't serve that many concurrent requests in a reasonable amount of time.
The only way to make wordpress fast and responsive is to bypass wordpress entirely.
You don't need to do anything complex - just install wp-super-cache, set a long timeout, and most importantly add the .htaccess rules it generates to bypass wordpress entirely and serve the cached static files directly.
The few times I've been on the front page of Slashdot it has eclipsed the traffic that I've had from being on the front page of Reddit. Hacker News barely causes a blip.
"If you have personalized pages (ie: majority logged-in users) this approach isn't going to work. "
I've considered this problem, and am working on a solution for nirvana[1]. The biggest challenge in this project has been to take a sequential language (coffeescript) and run it in a distributed environment, without the programmer having to know distributed programming. One of the techniques I'm applying is making a response (in this case, a web page) the result of a collection of components, which are rendered separately in the same context. (E.g. the context is the headers of the request, plus the user record if the user is logged in, etc.)
So, the request comes in, the components are loaded from the cache, they are executed (in parallel) all with a copy of the state, their results are aggregated and that result can run thru templating to produce a webpage that is returned.
The idea then becomes: instead of executing the code for every component in every request, if a component has no context-specific requirements (e.g. it is the same for every user, it's a static element, or it's dynamic but doesn't need to be generated every time), then it can be flagged as cacheable. The caching would also have a staleness factor (e.g. 1m, 5m, 10m).[2]
My hope is that you can have pages that are custom per user, but that also contain heavy-impact results (say, a graph produced by an expensive operation), where those results come from cache, the static components come from cache, but the user-specific parts are dynamically generated on each request.
This component approach not only lets the components be rendered in parallel (and often not rendered at all, but pulled from cache instead); it should also allow for more convenient re-use of common elements and features across a site.
I hadn't considered caching for just 1s, though. Will have to think about that.
[1] Nirvana is CoffeeScript web development backed by erlang and Riak. Instantly distributed coffeescript. It will be open source, hopefully soon. Follow @nirvanacore on Twitter if you're interested in being notified.
[2] Planned. There are some implications of this that will require tradeoffs, so initially it may just be a flag of Yes/No for "Cache for up to 1 minute." or some value like that.
Varnish supports ESI. What you've described is ESI.
Just put Varnish in front of your servers and use the Expires headers of the responses to cache stuff accordingly. If you include user information in the cache hash of the personalised chunks of the page (i.e. in the URL or cookie) then you can cache those for a short time too.
I do apply short term caching to a lot of things, mostly to protect against double requests for the same thing when there might be an expensive query behind it.
Everything has had SSI for ages. In fact, the way the BBC used to build their web site in Perl on Apache used SSI for their "Most Read" section.
ESI is different... Edge Side rather than Server Side.
The benefit is that you don't need one server capable of doing everything. You can have a separate web service capable of returning the "Most Read" section on different technology, different machines, elsewhere on the network... and ESI will interleave the output of these web services at the edge of the network.
Varnish supports ESI, and as one of the main reasons to use Varnish is caching you can reasonably expect (and be correct) that Varnish will cache ESI sections.
So Varnish will receive a document, cache the whole of it, and use an entirely different cache policy on a section of it.
It's all swings and roundabouts. I just have a strong preference towards not making any part of the solution complex by trying to make it do anything more than it need do, especially when other layers can be added that do that specific thing (caching in this instance) better.
Thanks for the pointer. There are two advantages to what I'm doing over using Varnish in this way:
1. The component writer knows the needs of the component and can set the caching policy for it there. Other developers can then just include the component without knowing what its needs are. In a way, this means they don't have to design the page for caching... they kinda get it for free.
2. There are several optimizations I can make using this method, and since it is intrinsic to the need to run things in parallel (components have a clearly defined line between themselves and other components), the caching ability is just a few extra lines of code and so relatively low-cost for me to add.
Thanks, reading that now. Riak has a pipeline service called riak_pipe. One of the things I've been thinking about is having applications define a pipe, and then sending requests thru the pipe. The pipe is inherently sequential, so it fits well with a sequential language like javascript (you'd just write fittings in the coffeescript/javascript and then at the app level define the pipeline as a series of fittings.) On the other hand, this seems potentially less parallelizable and so I won't know which is more performant until I can run some tests (And I'm not yet to the point where I can do that.)
Anyway, reading the page you linked to now, will update this comment with what I learn from it.
EDIT TO ADD:
Have read a good chunk of that page now. THANK YOU! It's very interesting. I'd considered doing something like this already, but had put it off for later. At this point I'm working on just the server side, though I've saved the article and will check it out again when I can add some higher-level functionality to nirvana. Thanks!
Something I have done in the past is put all the "loaded by everyone but not static" stuff in memcached. The application pushes to memcached on startup, and any POST that would change it causes the application backend to update the memcached entry(ies). This is a slightly different approach from the microcaching, but it has the advantage of being always consistent.
I only do this when the stuff starts to obviously become a bottleneck, and I haven't done much with cloud hosting, where RAM seems to be more expensive than cpu cycles, so I don't know how well it would work there.
Effectively, a Riak in-memory database is memcached, but distributed over all the nodes, with zero administration to grow or shrink it. So, if I've got 10 nodes, and each has 30MB set aside for this caching (or whatever) effectively its a 300MB cache. Adding another node makes it a 330MB cache.
I'm avoiding the logic of deciding whether a POST would change it or not, by letting the components handle it. They can set a default TTL, and they can also, if they process any relevant data, force an update to the cache.
I'm doing client side includes for personalised content. So, the main page is served with no personalised content and with all the heavy caching that this allows, and the html has place holders for the personalised content. Then, upon page load, an ajax request is fired that loads the user-specific content and injects it to the page. It works quite well and since this is specifically personalised content, I'm not too afraid of having to support non-js clients (Such as crawlers etc.).
> Nirvana is CoffeeScript web development backed by erlang and Riak.
OH COME ON, I would almost bet money @nirvanacore follows the great traditions set by @rolfscale and @hipsterhacker.
That said!, good luck :D.
> The idea then becomes: instead of executing the code for every component in every request, if a component has no context-specific requirements (e.g. it is the same for every user, it's a static element, or it's dynamic but doesn't need to be generated every time), then it can be flagged as cacheable.
How are you going to win over everyone else's caching solution? Why are you going to be easier than what's already out there?
I'd love to find a place on the internet where I can talk about my project seriously with people. I'm the only engineer in our small startup, and I'm building this platform to decrease the cost of development for myself going forward. I'm open-sourcing it because there isn't a solution like this out there (or I'd just use that instead.) My focus is actually on zero-operations-dudes-on-payroll scalability up to medium-sized services, not "it's gonna be huge man" scalability. I'm not using MongoDB because it's "wicked fast!!!", I'm using Riak because "no ops guys needed, just add nodes."
Seriously, anyone know such a place? I guess I'll have one when I make the first release, since I'll make a mailing list and issue tracker, etc. But right now, before the first release, when I'm thinking about serious architectural issues, that's the time I could use someone off of whom I could bounce ideas and get feedback, positive or negative. I mean, constructive feedback.
That's why I made the comment on this post, because it's reasonably relevant to what I'm doing.
In answer to your question, I have no interest in winning over anyone else's caching solution. I'm not building a web cache, I'm building a web development platform. This makes web development a whole lot easier for the kind of development I do. I expect it will be useful for others, but even if it isn't, the time invested in it is nearly recouped just in developing our MVP.
Oh stop whining. My jest at the top of my post obscured my seriousness at the bottom.
My point largely was, "I can't tell if you're being satirical or not". You're meshing three extremely new and very fancy technology stacks together and inventing a whole new framework to go with it.
So… maybe you do want to figure out a good stock way to counter my skepticism. Waving your hands and saying "it's a web development platform!" feels unsatisfying :).
APC on its own is an opcode cache, not a page/data cache. Did you write your own code to save pages into it and retrieve them? Or is there a cache plugin for WordPress you're using which uses APC as its data store?