Why I Don’t Host My Own Blog Anymore (kalzumeus.com)
240 points by llambda on Feb 9, 2012 | 111 comments



I don't know, Patrick, it really seems like a lot of pain and suffering you went through for nothing.

Perhaps if you added your actual volume numbers to your entry it would help the conversation? I have several sites on various platforms, and I'm currently running over 160K visits per month across all of them. It's not unusual to have 10K visitors from HN drop by on any given day. I haven't had any problem with WP. Of course, I don't really check it. It just works.

I hate to sound non-caring, but is it really a problem if ten people come by your blog at 3am and can't load the site up? They'll be back at 4am, and the site will be fine then. Do you really need 99.999% uptime on a blog? It's all a numbers game. If you lose 2% of visitors due to technical problems it shouldn't be a huge deal. Just take a look at your funnel and conversion numbers and do the math. There's no way that damage is going to come out to $2500 per year.
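A rough sketch of that math in Python. It borrows the 160K visits/month figure from the sites mentioned above (not Patrick's own numbers, which aren't given) together with the 2% loss and $2,500/year figures; the function name is only for illustration.

```python
def break_even_value_per_visit(visits_per_month, loss_rate, hosting_cost_per_year):
    """Dollar value each lost visit must carry for downtime to cost as much as hosting."""
    lost_visits_per_year = visits_per_month * 12 * loss_rate
    return hosting_cost_per_year / lost_visits_per_year


# Assumed inputs: 160K visits/month, 2% of visits lost, $2,500/year hosting bill.
value = break_even_value_per_visit(160_000, 0.02, 2_500)
print(f"Each lost visit would need to be worth about ${value:.3f}")
```

Under those assumptions the break-even point is roughly six and a half cents per lost visit, so the answer depends entirely on what a visit is worth to the site owner.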

On a personal note, what I find is that there are a huge number of opportunities for my startups to accrue costs that don't justify the benefit. It's tough figuring out what to spend money on. Things like hosting or tools have to pull their own weight.

ADD: There are a lot of companies that made huge amounts of money leveraging free blogging services. Not my cup of tea, but it does provide a bit of context.


3 AM Japan time is, um, what time is it for you now? It's then. Right in the middle of the US work day. Which is typically when Jimmy Wales decides to get me mentioned in the NYT. That was mostly not business-oriented, but in general, that sort of thing is very worthwhile.

I don't really have a conversion funnel for the blog at present. (That will probably change in the future, which is one of the reasons I knew I had to switch.) It's just my main source of consulting leads and a) my rates justify $2.5k pretty easily and b) consulting clients start getting all kinds of skeptical if their SEO guy isn't on the Googles because his server went down.

(I'm not the only person with this problem. You should have seen Fog Creek when their blog decided to die. By freak accident I was in the office when it happened and was able to successfully diagnose the problem from "Wordpress just crashed": it was Apache KeepAlive because it's always Apache KeepAlive, grumblegrumblegrumble.)


So this is really more of an image thing -- site responsiveness, accessibility, and so forth. Similar to how a shopkeeper might have paid 2500 each year to have a sign custom-painted out front.

In that case, makes sense. I'm not an extreme cheapo -- I'm probably paying 1800 a year for my various hosting solutions, but that's spread across HostGator, a MT hosting company, and AWS. I'm moving to AWS -- exactly in the opposite direction that you are heading -- because I like buying cycles. I can set up databases, write code, and have more power to write hybrid sites. I'm beginning to look at producing content on the web, blogs, apps, static sites, or mixed-mode stuff, as more of an art form and less of a pre-canned WP experience. So I need a box, not a service.

But I look at this as simple content. I'm serving bits to folks who come and consume them. I don't generate consulting leads through my sites, I provide data people want for free, so as long as most of the people get most of the data, everything is copasetic on my end. It sounds like we have completely different purposes.

Just as an off-the-wall suggestion to other readers, you know you could write a script to consume your blog entries and move them to a CDN as static pages. You'd get a hell of a boost in performance, and I don't think you'd really have to change much. (Probably much tougher than I make it out to be. Haven't really thought it through. I use URL rewriting on some of my newer sites, so it all looks like static content anyway.)
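A minimal sketch of such a script in Python, assuming you already have a list of post URLs. The `fetcher` parameter and directory layout are hypothetical, and the upload to a CDN is left out because it depends on the provider.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch(url):
    """Download one rendered page as bytes."""
    with urlopen(url) as response:
        return response.read()


def static_path(url, out_dir):
    """Map a post URL to an index.html file under out_dir."""
    path = urlparse(url).path.strip("/")
    return Path(out_dir) / path / "index.html"


def snapshot(urls, out_dir, fetcher=fetch):
    """Save each rendered page as a static file; returns the files written."""
    written = []
    for url in urls:
        target = static_path(url, out_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetcher(url))
        written.append(target)
    return written
```

Writing each page as `<path>/index.html` keeps the original URLs working on most static hosts, which is why little else would need to change. Comments, search, and anything else dynamic would still need a separate answer.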


How is that an 'image thing'? He's a professional programmer and SEO consultant. It's not an 'image thing'. It's a basic 'this is my job, why isn't it working' thing. That it's still such a pita in this day and age that you need professional keeper-uppers because it's so opaque and complicated that web programmers have no real clue what we're supposed to do is all a bit odd.

Taking your analogy, it's more akin to the tills in the store crashing in the middle of the lunchtime rush.

You can walk around and tell your customers it's the till software suppliers, but in the end it just reflects badly on you. Or you can follow your suggestion and get out a pad, a pen and a calculator and write out their bill. Neither is a good solution.


You should only be creating custom solutions if this (blogging for instance) is your primary business. If you want to kick ass as a one-person team you have to know your limits and only spend time on things which are critical.

Nice suggestion re CDN.


> You should only be creating custom solutions if this (blogging for instance) is your primary business

Even more specifically, only if existing tools do not meet or cannot be adapted to your needs!


I disagree. If something is core to your business you cannot outsource it.

Google couldn't use someone else's search engine

Microsoft couldn't use an outside kernel or filesystem for windows

Gridspy has to make its own dashboards. However, we can use handy libraries for html templating and graph rendering.


But a blogger (the given example) produces content. A blogging platform is core to their business, but it isn't what brings value. In Google's case, the search engine is what brings value.

Don't let yourself get caught up in NIH syndrome.


But look at it another way: maybe your hands-on knowledge of KeepAlive is the kind of thing that signals a client that you're not just a regular SEO guy...


I think this could be a great route to take. Patrick might be at the point that he has enough marketing work to do though, that he wants to focus on that.


Many of us have $1000/day+ contracting rates. Better to spend the extra week contracting per year rather than upgrading plugins and securing servers.

I solved the problem by using Movable Type, which doesn't serve dynamic pages. Still heaps of maintenance though.

Course: I serve no traffic.

Http://blog.gridspy.co.nz/


Our blog is hosted as part of our general website, and it lives in 4x servers at AWS. Did you ever consider load balancing across 2x512 instances rather than 1x1024mb server? We considered options like outsourcing the hosting of elements of the sales site; however, it's just part of our standard build and in reality I don't see any major issues with load balancing a WordPress blog.

Your main bottleneck once your WordPress blog is under load will become your DB, but if that's your problem you're probably doing something else very right (so congrats?)


The first thing I thought of when I saw this was: http://news.ycombinator.com/item?id=3452477 where about a month ago you mentioned you were on-site doing work for WPEngine.

In the interest of full disclosure, this really needs an up-front disclaimer; the only mention I could find was in the "P.S." in the last paragraph [edit: also mentioned halfway through]. Actually, the article reads like classic long-form sales copy so perhaps people will be automatically conditioned to scroll to the bottom to look for the "P.S." line. :)


He did disclaim his involvement, and indeed, his in-the-middle mention of doing client work for them came exactly as he started talking about why WPEngine was so great. You literally can't get to the part where he's praising WPEngine without either finding out he's worked for them or skipping part of the article (in which case, you don't get to complain about him missing something).


He only mentions doing marketing work for them after his copy discusses several of people's pain points and waxes poetic for a number of paragraphs about their CEO's technical prowess when it comes to WordPress hosting.

Imagine going to a technical conference on petroleum and a politician is talking about the problems of safely distributing oil to industrial consumers and how the CEO of a company that installs long-distance pipelines is wicked smart and has figured out all the challenges... before disclosing that he does marketing on the side for that very company and says that everyone should use them for their oil distribution. Maybe that company is truly awesome and deserved the praise but it'd seem fishy, no?


I hate hosting Wordpress.

I don't host it for myself. Over the years I've collected a small blog network. I'm proud of the blogs I host, they're some of the best and most influential in the Australian blogging world.

But by god do I hate hosting Wordpress. If I made more I'd push the whole thing onto WPEngine and be done with it.

Nowadays I have it running smoothly. The performance keys are:

1. Exorcise Apache. You don't need it. Nginx + PHP-FPM is the correct option.

2. Caching, caching, caching. Enable the MySQL query cache. Install WP-Supercache or W3 Total Cache. Have them spit out gzipped HTML to disk and tell Nginx to serve it directly.

3. MySQL should be on a different server. Annoying but it causes a dramatic improvement.
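The second point, in rough Python terms: the front-end server checks for a pre-rendered gzip file on disk and only falls through to PHP on a miss. This is a sketch of the decision only, not how Nginx or the caching plugins implement it; the `<cache_root>/<url path>/index.html.gz` layout is an assumption made for the example.

```python
from pathlib import Path


def cached_file_for(request_path, cache_root):
    """Return the pre-rendered gzip file for a URL path, or None on a cache miss."""
    parts = [p for p in request_path.split("/") if p]
    if ".." in parts:
        return None  # never look outside the cache directory
    # Hypothetical layout: <cache_root>/<url path>/index.html.gz
    candidate = Path(cache_root).joinpath(*parts, "index.html.gz")
    return candidate if candidate.is_file() else None


def route(request_path, cache_root, accepts_gzip=True):
    """Decide whether the front-end server can answer without touching PHP."""
    hit = cached_file_for(request_path, cache_root) if accepts_gzip else None
    if hit is not None:
        return ("static", hit)
    return ("backend", None)  # fall through to PHP-FPM and MySQL
```

Every request answered on the "static" branch costs a file read instead of a PHP process and a round of database queries, which is where most of the speedup comes from.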

There are other tweaks I haven't applied such as serving static content from a cookieless subdomain, pushing stuff onto a CDN. If I get to the CDN stage it's time just to turn it over to WPEngine.

It's now sufficiently fast. During the Brisbane floods last year, one of the sites got > 100,000 visits in a few hours and I didn't even notice a slowdown (in fact I didn't notice the visits until a month later when I logged into Google Analytics to check something else). I am now preposterously over provisioned but at least I don't have to think about it any more.


Yes to all of this.

One addition: if you don't care about comments, or use something like Disqus, you can put Varnish in front of WP and it will just work. A simpler solution: don't use WordPress, use some kind of static blog generator. It's really easy to host static files!


My users wouldn't stand for Disqus and I think the pitchforks would come out if I started babbling about command-line interface static site compilers.


Of course, it's not the right fit for everyone, but there sure is a lot of upside to it which needs to be seriously considered. Hosting WP is a giant pain in the ass.


Agreed on this setup, I have a blog that does 500k+ visits a month on 2x512 Linode VPS and it barely goes above 0.1 load.


Wow, even on WP-Engine the blog takes 3.4 seconds to load. That's terrible.

I had similar nightmares to you for a long time with Apache/PHP/WP, then finally put Varnish cache in front of the whole thing. Every single page loads in sub 1 second, and even a massive traffic spike results in CPU load under 0.2.


It will load in comfortably under a second if you already have the 250kb header image cached. I losslessly compressed it but apparently WooThemes went for "pretty" over "lightweight" there. It wasn't worth enough for me to call in a designer to redo.


Just a small suggestion in that case (because it makes it so painful on mobile devices).

Little dragon logo (maybe 15k?) and a CSS3 gradient? http://www.colorzilla.com/gradient-editor/

If your site was otherwise graphics-heavy I wouldn't say anything, but it is a really sparse/clean layout already.


You make it sound like this is easy. WP (and associated plugins) make extensive use of cookies, and by design, Varnish will not cache pages with cookies.

For me, dealing with this problem was pretty painful and took me much longer than I thought it would to get it right. VCL also has a bit of a learning curve, although I did find it rewarding and was glad I did it.
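The rule that usually ends up in the VCL is roughly this, sketched in Python rather than VCL: ignore cookies that don't identify a visitor (analytics and the like) and only bypass the cache when a login or commenter cookie is present. The prefix list is an assumption for illustration; check which cookies your own plugins set.

```python
# Assumption: cookie-name prefixes that mark a logged-in user or commenter.
SESSION_PREFIXES = ("wordpress_logged_in", "wp-postpass", "comment_author")


def parse_cookies(header):
    """Split a Cookie header into a dict of name -> value."""
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


def is_cacheable(cookie_header):
    """A request is cacheable when no cookie identifies a particular visitor."""
    names = parse_cookies(cookie_header)
    return not any(name.startswith(SESSION_PREFIXES) for name in names)
```

For example, `is_cacheable("__utma=1; __utmz=2")` is true while `is_cacheable("wordpress_logged_in_abc=user")` is false. The painful part in practice is finding every plugin that sets a cookie you did not expect.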


I'd argue any content software that doesn't automatically provide a complete page cache is incomplete.


Shouldn't the title be "Why I don't host my own Wordpress blog anymore"? Seems like most of the complaints were about the finickiness of WP. Why not host something like a simple static blog, with Disqus for comments?


Switching from a car to a bicycle will indeed decrease your gas consumption but that does not turn bicycles into acceptable substitutes for cars, even if you jury-rig a horn on them.

WordPress is, far and away, the best blog software (and particularly software ecosystem) out there. I wish there was a Railsy alternative a fifth as good, but there isn't.


What parts of Wordpress do you find superior to alternatives?

I'm building yet another blogging platform (not quite) right now. It isn't going to be Wordpress, but it will be managed and won't cost you $200/mo. A carcycle, if you will.

To carry the analogy to its end: you seem to be only using the car to drive ten blocks to work. In other words, you only serve static content + comments. Which is why I'm curious what, specifically, compels you to stick with Wordpress even though there is an obvious cost (be it money or effort.)


The biggest one is the ecosystem. Can I get a WooThemes-caliber theme done for your CMS at 3 AM in the morning for $70 without talking to anyone? And will it work virtually instantly as soon as I drop it over? And will that play well with e.g. using the blog as a lightweight CMS, with custom menus, static pages, and whatnot in addition to the blog proper? And will that work with the plugin ecosystem when I want to do just-a-wee-bit-trickier things like the What Would Seth Godin Do prompt?


> Can I get a WooThemes-caliber theme done for your CMS at 3 AM in the morning for $70 without talking to anyone?

I know you run a business, so for you the answer may be 'yes', but for most people running a blog: is this really a huge concern?

To borrow an upthread analogy, if I'm switching from a car to a bike, whether I'll be able to get a complete paintjob at 3am for $70 without talking to someone is not really something that's going to make the pros/cons list.


I think compatibility with WooThemes & friends is, in fact, a huge concern for the majority of bloggers.


Perhaps not those with traffic problems. Hopefully by then you've found a way to direct that traffic to useful revenue or have enough cash from your business to invest.

In OP's case, he's spending money on hosting a CPU-intensive blogging engine. In theory he could spend the same money on theming an efficient one.


But what does it do that makes it apparently uncacheable? Your pages change when you write a new blog entry, or when someone posts a comment - and even then there's no hard constraint that requires their comment appear immediately - serve a stale-by-60-second page and I doubt most people would notice, never mind care. If your blog engine can't serve one page every sixty seconds to the nginx/varnish/other reverse caching proxy in front of it, just what is it doing?

(This is partly a genuine question and partly a rant caused by seeing my VPS go into swap death spiral when the googlebot came around, just because WP apparently couldn't serve more than about ten simultaneous requests. Intellectual curiosity says I'd like to know, but life really is too short to go back to having to deal with it for real)
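The stale-by-60-seconds idea fits in a few lines. This is a sketch in Python of what a reverse caching proxy does conceptually, not how nginx or Varnish implement it; the class and parameter names are made up for the example.

```python
import time


class StaleOkCache:
    """Serve a stored page for up to ttl seconds before re-rendering it."""

    def __init__(self, render, ttl=60.0, clock=time.monotonic):
        self.render = render      # slow function: path -> HTML
        self.ttl = ttl
        self.clock = clock        # injectable so the behaviour can be tested
        self._store = {}          # path -> (expires_at, html)

    def get(self, path):
        now = self.clock()
        entry = self._store.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]       # still fresh enough: no call to the blog engine
        html = self.render(path)
        self._store[path] = (now + self.ttl, html)
        return html
```

With this in front, the blog engine renders each page at most once per TTL window no matter how many visitors (or crawlers) ask for it.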


Of course it depends on what you need, but wordpress is not even close to the best blog software if you need standard blog features and a site that actually runs without the kind of maintenance problems described.

If you want a huge plugin ecosystem and easy theming then yes, it is the best. You just better get caching right because it's going to hit your database 100+ times per page load once you install those themes and plugins.


" wordpress is not even close to the best blog software "

So - what is closer to the best blog software? Better yet, what is the best blog software?


The forum and cms options in rails are fairly abysmal right now. I've worked with refinerycms a bit, and it seems to have the most potential so far. No rails forum packages can compete with phpbb, though.


>> I wish there was a Railsy alternative a fifth as good

Radiant CMS http://radiantcms.org/ . I don't even know Ruby well and I find it easy to hack on, add extensions, and take features away that would get in the way of the client using it.


Railsy? Why does blogging software need to know anything about HTTP requests?


I meant that if there were a WP-caliber Rails blogging system (and community), I'd happily switch to that, because I'm very well-versed with how to run Rails sites and because that would make it much more hackable for me. (I'm theoretically capable of writing PHP code, but practically have "Don't do it, Patrick, this is why you are best friends with a PHP developer" written in indelible ink on my wrist.)


Unless you expect to be in the blog software business this is the first step down a dark path into the abyss.

It always starts as a "Hey, I can just have this static page". Then, "...well I just need this simple feature, and I've been wanting to learn node.js anyways..."


But what simple features are these? Would you please list a few?

The historical problem with "blogging" is that it was invented in an era where we didn't yet have Disqus, Flickr, Instagram, Facebook, Twitter, Feedburner, YouTube, Slideshare, Google Analytics, or even Digg, Reddit, or Hacker News. If you wanted any portion of the functionality of any of these things you had to build it into your "blog". And so hacker blogs were enormous code-heavy things that tended to get heavier with time.

But nowadays I'd argue that, to first order, any work you do to reinvent any of the functionality of any of these things is an expensive luxury. Others have concluded the same, which is why so many blogs have shed so many features and gone back to being: A bunch of static web pages full of writing. With maybe a sitemap and maybe an RSS feed, and maybe comments with Disqus, but maybe not.

But I'm always curious to know what I might be missing out on if I switch to Octopress, so please: A list? Sell me on Wordpress!


You make a valid point. I gave up on rstblog because I wanted a simple feature - like not treating every file in a folder as a potential blog post (e.g., trying to include an image) - and after glancing at the source code, quietly closed my laptop and reconsidered whether the world needed to hear my thoughts in a blogable format.


You don't really have to balloon it if you want just a blog. Over the years my blog-generation script has slightly expanded, but it still clocks in at one Perl file of 639 lines, a lot of which is boilerplate. It generates HTML, an index, and an Atom feed, with a few options for the index, which is about all I need.
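For a sense of scale, here is a sketch of the same three outputs (post pages, an index, an Atom feed) in Python. It is not the Perl script described above; the post dictionary fields, file names, and function names are all assumptions for the example, and URLs are not escaped.

```python
from html import escape
from pathlib import Path
from xml.sax.saxutils import escape as xml_escape


def render_post(post):
    """One HTML page per post; body_html is assumed to be trusted markup."""
    title = escape(post["title"])
    return (f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1>{post['body_html']}</body></html>")


def render_index(posts):
    """A bare list of links to every post."""
    items = "".join(
        f'<li><a href="{escape(p["slug"])}.html">{escape(p["title"])}</a></li>'
        for p in posts)
    return f"<html><body><ul>{items}</ul></body></html>"


def render_atom(posts, site_url, site_title, author):
    """A small Atom feed; 'updated' values are RFC 3339 UTC timestamps."""
    def entry(p):
        url = f"{site_url}/{p['slug']}.html"
        return (f"<entry><title>{xml_escape(p['title'])}</title>"
                f'<link href="{url}"/><id>{url}</id>'
                f"<updated>{p['updated']}</updated></entry>")
    # Same-format UTC timestamps sort correctly as plain strings.
    newest = max(p["updated"] for p in posts)
    return ('<?xml version="1.0" encoding="utf-8"?>'
            '<feed xmlns="http://www.w3.org/2005/Atom">'
            f"<title>{xml_escape(site_title)}</title>"
            f"<author><name>{xml_escape(author)}</name></author>"
            f"<id>{site_url}/</id><updated>{newest}</updated>"
            + "".join(entry(p) for p in posts) + "</feed>")


def build(posts, out_dir, site_url, site_title, author):
    """Write every post page, index.html and atom.xml into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for p in posts:
        (out / f"{p['slug']}.html").write_text(render_post(p), encoding="utf-8")
    (out / "index.html").write_text(render_index(posts), encoding="utf-8")
    (out / "atom.xml").write_text(
        render_atom(posts, site_url, site_title, author), encoding="utf-8")
    return out
```

Everything beyond this (templates, tags, pagination) is where the line count grows, but none of it needs a running server.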


I thought Apache had died a long time ago? I haven't used it in a production environment in years. Nginx + php-fpm for WordPress. Hell, you can use any web server you want with some kind of decent caching plugin like wp-super-cache and you'd be able to host this on a $6/mo GoDaddy account.

WordPress is SO easy to make work, I am totally shocked you're spending $2,500 a year to host your site. What a tremendous waste of money.


Then you're living in a bubble. Apache still has the vast majority of web servers and sites and it doesn't look like it's going to be taken over any time soon:

http://news.netcraft.com/archives/2012/02/07/february-2012-w...

Not only that, but in Apache 2.4, the Event MPM is no longer experimental, which addresses Nginx's only advantage (performance).

I recently moved my site from Apache to Nginx and like both. But you're living in a dream world if you think Apache is anything other than a brilliant web server, and the most widely used and successful web server that has ever existed, by far.


Are you being sarcastic? Apache still powers a huge chunk of the Web http://en.wikipedia.org/wiki/Web_server#Market_share


"It’s transparently obvious that PHP can scale — one look at who uses it (Facebook is really all you need to know) proves that."

If by scale you mean code your own extensions at the C level including your own VM, then sure, it scales!


PHP can scale, however I think what we've seen historically is that the programmers who have written most of the PHP apps out there don't know how to write a scalable app.


Facebook began using HipHop (their PHP compiler) at a time when they already had 200 million users and a website that by definition is very hard to cache. So in other words, PHP without Facebook's custom compiler was able to handle a site having 200 million frequently returning users. I'd say that's pretty good.

Sources: http://www.h-online.com/open/news/item/HipHop-Facebook-s-PHP... and http://en.wikipedia.org/wiki/Facebook


But it wasn't all running on a single VPS.

If OP had thrown more hardware (and hardware costs) at the equation his problems would have reduced.


This is not a very useful response. If you enjoy the advantages of interpreted languages, accept the consequences. If you don't, well, good luck with websites implemented in compiled languages.


Great post Patrick. I know you'd had a series of bad experiences with both Apache and self-hosting WP. This article makes a pretty compelling case to give WPEngine a closer look.

  I ... was not asked to write this by WPEngine (who are,
  again, clients of mine), and would not have written it
  except that their service really rocks.

This statement feels a bit disingenuous to me. It seems likely that they didn't ask you to write it simply because you offered: "How about I write up a mini case study/sales pitch on my popular blog. You win because I'll send you business, and I win because then future clients will know I did work for you and indirectly advertises my services".

I'm totally cool with that, btw, it just seems a bit like lying by telling the literal truth... which is something that rubs me the wrong way.


He comes out and says they hired him for marketing...


You can't question someone's integrity without bringing a good deal more than this.


It seems crazy to me to put so much effort into a blog, when you can get very very very close to 100% uptime by just using jekyll + amazon s3 + disqus.

That's what I do with abtinforouzandeh.com and it has been able to handle 50k+ requests over the span of a few minutes while maintaining sub-second load times.


Yeah, until your time is so valuable that you'd rather pay your assistant and/or a copywriter to help you with it.

That's where I am with my company right now. Even with my copywriting background, my time is valuable enough that I need to hire someone to help us with website content. WordPress is easy. I can hire thousands of people to update my design or add content. Not so much with another solution that doesn't have a WYSIWYG, web-based interface for adding content.


My setup is as simple as creating a new text file with the content, using basic markdown syntax to create the post, adding it to git (totally optional), and running a brain dead script to deploy. I could probably teach my mom to post content to my blog in one 30 minute lesson.

This setup could easily scale with the org so that you have tons of people writing content for you. Let them write a post, push it to github, and have some gatekeeper in charge of posting to the site.


Have you tried using Cloud9 IDE as a means to modify the content? In the case of static site generators like Jekyll, one could just host the content files on github and use Cloud9 or a similar web-based editor to edit the content.

You can also set up post-hooks in Github such that a push will automatically re-generate the static site for you and update the site. That is how I have set up my personal blog.
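The receiving end of such a hook can be very small. A sketch in Python, assuming the hook delivers a JSON body whose `ref` field names the pushed branch (as GitHub push notifications do); the `build` callable and the `master` default are placeholders for whatever regenerates and publishes your site.

```python
import json


def should_rebuild(payload, branch="master"):
    """True when a push payload targets the branch the site is built from."""
    return payload.get("ref") == f"refs/heads/{branch}"


def handle_push(body, build, branch="master"):
    """Parse a JSON push notification and run build() if it is relevant."""
    payload = json.loads(body)
    if should_rebuild(payload, branch):
        build()  # e.g. shell out to the static site generator, then sync files
        return True
    return False
```

A real endpoint would also want to verify that the request really came from the hosting service before triggering a build.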


how much faster would the page load times be for s3 versus nginx on a linode vps?


I treat my personal projects as practice.

My blog has at one point had a peak of around 100k visits a day. It's a static site served by nginx on a linode512 that has various other personal projects running on it, and it's never blinked.

I recently put a forum live (coded from scratch, but not all by myself) that is hovering around 5k visits a day. This has had growing pains (currently on a linode 1024), but when debugging the performance issues I learned a hell of a lot, and it's been stable since (touch wood).

If I wasn't able to keep a blog / forum / various other side projects live without having continuous maintenance problems then I would be pretty worried, seeing as that is my job.


> The VPS that I used to run my blog on cost a bit over $100 a month (1 GB Rackspace slice + 160 GB of bandwidth)

Are hosting services much more expensive in the USA than in the EU? I pay 25 euro for a colocated 1U slot, 100MB/s guaranteed but always seen at top speed, 500GB total traffic/mo, and I can put in there what I want, as long as it doesn't require more than 60W of power.

So I bought myself a nice 'green' 1U server for 700 euro or something, and happily run nginx/cherrypy and apache/mysql/php on it.

The same provider I rent this from also offers VPSes, but I found them underpowered, cramped, and expensive. I guess it's worthwhile to stash your own hardware into a data center ;-)


> Are hosting services much more expensive in the USA than in the EU?

For servers/colocation, in general, yes, the services are more expensive in the US. That's been my experience.


While they may be more expensive, you can easily find a dedicated server for $100 that is going to be much better than a $100 VPS.


WPEngine seems great and I was thinking of putting the blog for my new startup on there. However, since I want it to live in a sub-directory rather than a sub-domain, the traffic still needs to go through my own server, which effectively eliminates a lot of the benefits.

If anyone knows a work-around please let me know since it would be great to have it on WPEngine.


I get worried by blanket statements such as "leaving KeepAlive on will probably not be in your best interests." Using Keep-Alive is generally a good thing in terms of web performance. Not having to establish a new HTTP connection for each new asset request is gold.

The problem is not with Keep-Alive per se, it's more about Apache's process model (assuming you're using mod_php) - the fact that you have a large (10-100 MB) process sitting there doing nothing. My advice would be to either use Apache with mod_fcgid (and php-fpm) or jump to nginx + php-fpm.

We host a number of WordPress sites using Nginx and the stability has improved massively.
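Some back-of-the-envelope numbers for the process-model point, as a Python sketch. The 10-100 MB per-process range comes from above and the 1 GB box from elsewhere in the thread; the 256 MB reserved for the OS and database is purely an assumption.

```python
def max_workers(ram_mb, per_process_mb, reserved_mb=256):
    """How many web server processes fit in RAM before the box starts swapping."""
    return max(0, (ram_mb - reserved_mb) // per_process_mb)


# Assumed: 1 GB server, 256 MB held back for the OS and MySQL.
for size in (10, 50, 100):
    print(size, "MB/process ->", max_workers(1024, size), "workers")
```

At 100 MB per process only 7 workers fit in this model, and with Keep-Alive each idle connection pins one of them, so a handful of browsers holding connections open can occupy the whole pool. An event-driven front end holds idle connections without dedicating a large process to each.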


"I love being out of the hosting business."

I can totally understand that feeling. I was in a similar situation a year and a half ago and not having to fear incoming SMS (hello your server is down...) is bliss.


I gotta say, the setup I'm using for WP doesn't seem like that much of a pain in the arse.

* Bog standard LAMP setup

* Wordpress

* Cloudflare (CDN, security, free)

* WebsiteDefender (monitors WP plugins and general security, free)

About the only cost out of pocket for me is the hosting that the system is running on, which is being used for a great many things besides my blog.

There probably is something to be said for $90 a month to have some of the headaches removed (which I'm sure increase with scale), but I'd be willing to bet finding a good mix of free services would get you most of the way there.


A blog is 99+% anonymous users with a comment every now and then. Your database shouldn't receive more than 20-30 requests every minute and varnish / nginx should take care of all of the traffic.

I never understood why people had problems with traffic on blogs :-/


Because they aren't caching.


And because they're using Apache where, due to a combination of the still-common forking process model and KeepAlive, they get exciting OOM events if 20 people click on a link simultaneously.


The moral of the story here is that the author is not a great server admin. WordPress.com was working fine and fast for him because of their engineers.

I love how people blame WordPress for their troubles when it's not WordPress's fault.


I wish we saw some specifics of those tweaks-to-the-limit. This post is vague and advertisey IMO. It reads to me as "I might be throwing money out of the window, but I'm fed up with dealing with a critical part of my online presence."

Shouldn't switching to Nginx be the very first step? Enabling gzip the second, copying assets to a CDN the third? 1024 MB is plenty of RAM to keep every post in memory, too.

If you have the resources to outsource non-critical parts of your business, good for you.


I thought I was fairly explicit: I previously had nginx proxying to Apache2, using WPSuperCache and a few rewriting rules to serve all static content and the vast majority of pageviews without hitting Apache. Cache headers were set and quite generous, static content was split over multiple subdomains (and served directly by Nginx). I had manually edited the WordPress theme to fix inefficiencies with how they bundled CSS and Javascript. Gzip on, images crunched when I remembered to do so, yadda yadda yadda. (A lot of this work had to be repeated every time I switched themes, so I'm not entirely sure it was all up on the most recent one, but I've done it three times at least.)

Trust me: I put probably high four figures of engineering time into solving this problem "for free."


I believe you put the time in and I would never doubt your (or anybody else's) capability to optimize a site.

So it was really just me reading something between the lines that is not there, or the post sounded like "the site was down when I published something... it kept on for several months... then I installed this... it was still slow... then I installed that...".

There's absolutely nothing wrong with outsourcing your blog. It just didn't seem justified.


> There's absolutely nothing wrong with outsourcing your blog. It just didn't seem justified.

In what way? I thought Patrick clearly explained that maintaining it himself had become a time sink. Hence, he hired competent people to do it for him.


Why use apache at all? Why not just serve it all with nginx?


Long story. My blog is the most trafficked site I run (well, exclusive of BCC) but I have about 15 sites on that server. One of them has a hard dependency on Apache. Moving Apache to 8080 and putting Nginx on 80 got me going with only about an hour of configuration tweaking, but doing the blog totally with Nginx would have required spinning up a new server entirely.


I checked the pricing page and the rates seemed steep. What level plan do you have, Patrick? It looks like the pro account runs about $1,200 a year, but caps at 100,000 visitors. Is that more than you see in a good month of blogging?


I am grandfathered on their old pricing, which is $200 for 250k IIRC. My blog occasionally sees spikes of 100k hits in a day. More typically it is 20-50k on a new post day and about 2k to 3k on any given day.


Oh, so that's 250,000 visitors per day, not month?


Nah, it is a soft-capped 250k per month. That would accommodate I think every month but maybe two in my history, and I assume they'd just ignore the overage for those two, based on conversations I had with them. (If I start routinely exceeding 250k I'll just step up to the next plan level. Money is not a limiting reagent for me at the moment, well, not below the level of "average price of a wedding.")
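A quick sanity check of that claim, as a Python sketch using the figures quoted in these comments. It treats "hits" and "visitors" as the same unit, which is an assumption, and the function name is made up.

```python
def post_days_within_cap(cap, baseline_per_day, post_day_visits, days=30):
    """Number of new-post days a month can absorb before exceeding the cap."""
    extra_per_post_day = post_day_visits - baseline_per_day
    headroom = cap - baseline_per_day * days
    return max(0, headroom // extra_per_post_day)


# Figures from the comments: 250k cap, 2k-3k typical days, 20-50k post days.
print(post_days_within_cap(250_000, 2_500, 35_000))   # mid-range case
print(post_days_within_cap(250_000, 3_000, 50_000))   # pessimistic case
```

With mid-range numbers the plan absorbs about five new-post days a month, and about three in the pessimistic case, which is consistent with only a couple of months ever going over.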


A couple things strike me about the rebuttals here:

The illusion of simplicity

No one is saying this specifically, but the spirit of what is being said follows the line of thinking that because it's easy to get a server stack set up and running, it's a "simple task". Getting a stack running is easy. Making it scale beyond a certain threshold is not simple. That threshold tends to be a brick wall in my experience.

Underestimation of time spent

It's easy to sink a lot of time into these administrative tasks. You have to remember what business you're in. If you're a sysadmin, great. This is time well spent, but most of us are not sysadmins. We're in some other business. It is good to understand how a web hosting stack works from a high level. It helps you make good decisions. One of those "good decisions" is recognizing where your time is best spent. If your business is anything other than system administration, I doubt that relentlessly benchmarking your stack to see how many reqs/s you can host is a good use of your time.

The assumption that "I" know a solution

The only people with enough knowledge to challenge Patrick's decisions are those who have met this challenge. My WP hosting works fine, but that's because 2,500 visitors a day is a big day for me. I wouldn't even begin to suggest that he didn't make a sound decision until I had met this challenge myself.

I have some small idea of how difficult it must be, because our company's software does "real time" online procurement events (reverse auctions). Sounds like boring stuff, right? Consider one of the basic tenets of scaling: serve from cache and avoid re-building on each request. Now, consider that in a reverse auction, you might have 10 bidders placing bids on 500 line items spread out over 8 lots. Bid rates regularly top 30 bids per minute. Each bidder has a unique object cache because of the nature of reverse auctions (they each have their own data sets to track). Sometimes we implement weighting, where the bids are multiplied by a factor, have some fixed value added to them, or are calculated as a markup/markdown percentage against a fixed value. Then there's the observer screen, which has views for overview, line item summary, and line item detail, each of which has its own view state. Every time someone places a bid, there are a number of objects that are completely invalidated and must be rebuilt. Everyone involved (observers and bidders) in the live event is polling the server at the fastest rate we can afford.
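To make the weighting rules above concrete, here is a minimal sketch. It assumes the lowest weighted bid ranks first, as is usual in a reverse auction; the names `Weighting`, `markup_bid`, and `rank_line_item` are hypothetical and are not taken from the commenter's software.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Weighting:
    """Hypothetical per-bidder weighting: multiply by a factor, then add a fixed value."""
    factor: float = 1.0
    offset: float = 0.0

    def apply(self, bid: float) -> float:
        return bid * self.factor + self.offset


def markup_bid(base_price: float, percent: float) -> float:
    """Bid expressed as a markup (+) or markdown (-) percentage against a fixed value."""
    return base_price * (1 + percent / 100)


def rank_line_item(bids: dict[str, float], weights: dict[str, Weighting]) -> list[str]:
    """Return bidders for one line item, best (lowest weighted bid) first.

    Assumption: ties are broken alphabetically by bidder name.
    """
    weighted = {b: weights.get(b, Weighting()).apply(v) for b, v in bids.items()}
    return sorted(weighted, key=lambda b: (weighted[b], b))


bids = {"acme": 100.0, "bolt": 95.0}
weights = {"bolt": Weighting(factor=1.1)}  # bolt's 95.0 is weighted up to roughly 104.5
print(rank_line_item(bids, weights))       # acme now ranks ahead of bolt
```

Even in this toy form, a single new bid changes the sort order for everyone on that line item, which is why the cached views described above cannot simply be reused.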

As much as I appreciate input, I'm not asking for solutions to our scaling challenges. We have plenty of good ideas, and we've stayed ahead of the curve (mostly) on what could very generously be called a shoestring budget. Even having done that, I can't question Patrick's decision here. If I suspect anything, it's that he made a wise choice. Scaling is hard.

My point is that you'll never know if you can meet a challenge until you do it yourself. If you have met the challenge, great, but at what cost? Consider your own bias and take a serious look at the time investment. If you still think you've got all the answers and you can do it in trivial time, please send me your business card, because I'll hire you on the spot.


Your answer is redis. But I don't have business cards.


I laughed, but this is exactly how many of the answers here sound to me.

Answers like this (even though I'm assuming yours was sarcastic) exemplify everything that I outlined above.

The "you" stated below is in the general form, not you specifically. It is the "you" intended to represent people who would seriously respond to a question like this.

You don't even know if a database is our bottleneck (it's not), you don't know what database/datastore we're using (it might be redis), you represent no experience in our problem domain, you have no detailed information about our current performance, but you have no hesitation about positing an answer.

Sorry to be so blunt, but uh... "Get off my lawn, you damn kids!" :)


Just to get on your lawn for a moment.

For Gridspy, we store data into our server from sensors and allow customers to watch a 1-second live data stream.

http://your.gridspy.co.nz/demo/

The key for us was to switch from a "polling" model to a "push" model. We don't poll the database, our backend pushes the new data through the stack. In our case I plan to actively rebuild assets and store them to disk rather than rebuild and cache on demand. But you can see the live updates on the site now.

More details (a bit dated now): http://blog.gridspy.co.nz/2009/10/realtime-data-from-sensors...

You might consider an approach such as http://socket.io/ and lots of logic in the page as a great way to avoid the "everyone polling the backend as fast as they can" model.

By the way, your lawn is very nice. Well maintained. Perhaps you need a zen garden.


I appreciate the thoughtful response :) Polling actually works really well for us though. It's like rudimentary time-division multiplexing for sockets. Push means you have to keep an open socket to clients, which has its own set of scaling problems.

Our challenges are related to cache design complexity, fast cache rebuild, serialization, and queuing & distribution of tasks without sacrificing timeliness of response. There's a lot of pressure for us to deliver a specific subset of the data in the same request/response cycle, which can be challenging. These are all pretty well known problems in computer science, and we feel pretty confident that we've got solutions for them.

I'm definitely going to keep an eye on your blog though. It's nice to keep in touch with companies that face similar challenges.


Cool!

I agree on the "good to know people with similar problems"

I can see why aggregation of the data can be such a challenge for you. My suggestion revolves around a smart client that can do aggregation in the browser through (e.g.) JavaScript. Our system is actually polling under the hood (that is how "Comet" works) but our backend never sees the polls.

You have a good point about open sockets being another problem. Fortunately it is thoroughly solved by messaging servers, for instance http://www.rabbitmq.com/ - it handles tons of connections so your backend doesn't have to.

Your requirements are a great fit for a single-threaded server such as node.js, Twisted and so on, which uses that whole polling model rather than many threads. You could have (say) a Twisted server that receives messages from the backend on RabbitMQ, interprets and aggregates them, and then forwards them to your clients. The send to clients happens via another publish of a message to RabbitMQ on a different channel and is unaffected by the number of users watching.

But like you said, solved problems are not worth a whole bunch of engineering effort when things are still running just fine. Just throw more hardware at it and move on to selling what you have.

:)


There are services you can use to push data (I've used Pusher a bit and like them, no affiliation), and it sounds like it will solve most of your problems, since it will get your request rate way, way down. The main benefit, though, is that your problem is trying to emulate pushing data via polling, hence all the cache problems, etc.

Imagine the reformulated problem: You have bidders sending you their bids, and you send N notifications 0.5 times a second. That sounds like a much easier problem than getting N requests a second and having to give them correct data.
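The reformulated problem can be sketched as a small publish/subscribe fan-out. This is an in-process stand-in written for illustration only: `BidChannel` is a hypothetical name, it is not Pusher's API, and a real deployment would push over the network rather than call local callbacks.

```python
from typing import Callable


class BidChannel:
    """In-process stand-in for a push channel (hypothetical; not a real service's API)."""

    def __init__(self):
        self._subscribers: list[Callable[[dict], None]] = []

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        # Each watching client registers once instead of polling repeatedly.
        self._subscribers.append(callback)

    def publish(self, event: dict) -> int:
        # One incoming bid fans out to every subscriber; returns how many were notified.
        for callback in self._subscribers:
            callback(event)
        return len(self._subscribers)


channel = BidChannel()
bidder_inbox, observer_inbox = [], []
channel.subscribe(bidder_inbox.append)
channel.subscribe(observer_inbox.append)

notified = channel.publish({"line_item": 7, "bidder": "acme", "amount": 95.0})
print(notified, bidder_inbox == observer_inbox)
```

The server does work only when a bid actually arrives, so the load tracks the bid rate rather than the number of clients multiplied by their poll rate.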

You can integrate Pusher in five minutes or so. Give it a shot, and if you like it/it solves your problem, you owe me a beer. However, I only know three paragraphs' worth of your problem, so it's just a suggestion.


Actually I wasn't sarcastic. Having built similar sites in the past I'd be really curious what your bottleneck is, if not the database?


So, I should just throw Redis at it? Then it will be ROFLScale and I can sleep at night? I'm trying not to be a complete prick about this, but how could you even begin to make the assumption that our database is our bottleneck without knowing a single thing about our application. How do you even know we're not using Redis already?

Our bottleneck is the computational complexity of our datasets. The nature of reverse auctions is that the dataset changes frequently. The fact that every bidder is ranked on a per line-item basis means there are cache interdependencies, which cause tidal waves of cache expiration when bids are placed. Adding difficulty, there is a lot of pressure to deliver a subset of the data in the same request/response cycle, so that bidders aren't waiting in limbo for their rank. This helps maintain bidder momentum.
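A toy model of the cache interdependency described above, under loud assumptions: ranks are cached per (bidder, line item), the lowest bid ranks first, and one bid invalidates every bidder's cached rank on that line item. `RankCache` is a hypothetical name and not the commenter's actual design.

```python
from collections import defaultdict


class RankCache:
    """Toy per-bidder rank cache keyed by (bidder, line_item)."""

    def __init__(self):
        self.bids = defaultdict(dict)  # line_item -> {bidder: amount}
        self.cache = {}                # (bidder, line_item) -> rank
        self.rebuilds = 0              # how many ranks had to be recomputed

    def place_bid(self, bidder, line_item, amount):
        """Record a bid and return how many cached ranks it invalidated."""
        self.bids[line_item][bidder] = amount
        # Ranks are relative, so one bid can change everyone's rank on this
        # line item: drop every bidder's entry for it, not just the bidder's own.
        stale = [key for key in self.cache if key[1] == line_item]
        for key in stale:
            del self.cache[key]
        return len(stale)

    def rank(self, bidder, line_item):
        """Return the bidder's 1-based rank, or None if they have not bid."""
        if bidder not in self.bids[line_item]:
            return None
        key = (bidder, line_item)
        if key not in self.cache:
            self.rebuilds += 1
            ordered = sorted(self.bids[line_item].items(), key=lambda kv: (kv[1], kv[0]))
            self.cache[key] = [name for name, _ in ordered].index(bidder) + 1
        return self.cache[key]
```

With 10 bidders on a line item, one bid in this model throws away up to 10 cached ranks, and each of them has to be rebuilt before the next response can go out. That is the "tidal wave" in miniature.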

We don't currently have any scaling issues that we haven't solved. That's not the point. The point is that suggesting blanket solutions to problems you don't understand is major Dunning–Kruger territory.


So, I should just throw Redis at it? Then it will be ROFLScale and I can sleep at night?

Yes.

but how could you even begin to make the assumption that our database is our bottleneck

Because everything else in your app is parallelizable.

How do you even know we're not using Redis already?

Because you just asked about it a line above.

tidal waves, subset of the data in the same request/response, dataset changes frequently, cache interdependencies

And because of that. If you were using redis for state then none of that would be worth mentioning.


I used to host my own blog on wordpress. I had some web space over at NearlyFreeSpeech (great host by the way). I never got a whole lot of traffic but what I found most frustrating was the multiple updates I had to perform.

I'm not saying anything bad about Wordpress, it's a great piece of software, but as a student I simply don't have time to check if I need to update my core installation or a plugin. Plus I would do all my updates from source (I had tweaked the source code a bit and the automatic method would have overridden my changes).

After reading this article, I decided to sign up for Posterous and give it a whirl. I like it so far.


I had a Wordpress blog for about two years on one of those $4/mo hosts referenced in the post, and I was getting the same slow, lousy performance.

I hosted it myself for a while and also decided I hated the headache. I knew I was going to do some hosted solution for my next blog, so I looked around and ultimately went with Tumblr - a more social ecosystem I expect will eventually surpass what Wordpress has going for it.

Here's the post I wrote describing it: http://pdobson.com/post/16576813337/choosing-tumblr-over-wor...


It's a bit more work up front but I'm really enjoying running a totally static blog directly off S3. Never goes down and scaling is a non-issue and it costs me a few bucks a month.


I heard that Dave Winer was doing this, but I cannot find any explicit instructions. Could you please point me to any resources?


http://aws.typepad.com/aws/2011/02/host-your-static-website-...

Use a DNS provider that does URL forwarding. I use Joker for this but I'm sure other registrars do too.


I say if the majority of your revenue, or traffic, comes through the content you create on your website, then you should focus on creating the content rather than running the technical aspects of a blogging platform or server.

Especially if the blog generates the required revenue, then for sure, stop spending time keeping the server(s) alive. Pay somebody else to do it, and focus on creating great content.


Curious - did you consider Jekyll or Octopress (static generated pages) on heroku or something similar?


I thought I'd be smart and use Blogger to host my blog.

Oh what a fool I was.

I'll be completing my hack to make it work properly sometime in the next few weeks with a full report on what I went through just to fix it.

In hindsight, I should have made my own blog engine or something.


I'm trying to open http://wpengine.com/pricing/ and the tab keeps loading (it seems I can't connect to their cdn.wpengine.com)...

NB Patrick's site is ok

EDIT: now the pricing page loads


I started blogging about 375,000 words ago (about three full-length novels… crikey).

I think it's time. You definitely should get a book out - there's plenty of material. I would preorder.


"required all access of the admin to happen through a proxy that I control"

How is this accomplished using nginx?

Does this mean creating a Socks Proxy and a special URL that's only accessible through the tunnel?


Use a location rule in Nginx for wp-admin denying access to all and then allowing access only to a specified IP address (the machine hosting the proxy).

Alternatively, you can deny access to all on the public site (www.example.com/wp-admin -> 405) and create a separate site which allows access to the admin (admin.example.com/wp-admin -> 200), with that separate site only listening on a private interface (and/or behind a firewall, etc). That's modestly more secure because IPs can be spoofed, but I was more worried about the "Script kiddie hits every /wp-admin in the world at once with a zero-day" scenario than about someone trying to compromise my blog in particular.


Does anybody know how WPEngine compares to PHPFog?


What are your issues with nginx and Wordpress? Seems to work for me and I haven't had to do anything special.


I just moved from a self-hosted WP blog to a free one at WordPress.com. Much happier now :)


Try octopress.


Facebook doesn't run on plain old PHP, it's compiled with HPHP into "highly optimized" C++


It does not run on "highly optimized" C++ code, it uses C++ as an intermediary target before being compiled to ASM. Which makes PHP just another compiled language that happens to work at FB's scale.


So... Facebook doesn't run on PHP.


So he's a web developer that thinks Facebook's use of PHP has anything to do with how wordpress scales?

That's ... weird.

PHP has nothing to do with it (except for maybe the fact that the people who wrote wordpress wrote it in idiomatic PHP)



