Ask PG: Cost and effort of running Hacker News?
94 points by knightinblue on April 14, 2009 | 53 comments
In terms of servers and hardware, how much does it cost to keep hacker news up and running each month?

In terms of man hours, how much is needed to maintain and moderate hacker news?




I think the server costs around $350/mo. I don't like to think how much time HN actually takes up, but when I was traveling recently I found that checking in for about 30 min a day was enough to keep things under control.

On a weekday we get about 350k pageviews from about 30k unique ip addrs.


Short, sweet and to the point. Exactly what I was looking for.

Thanks pg! :)


Why do you need to check in every 30 minutes?


He said 30 minutes per day, not every 30 minutes.


Ah, well that explains it I guess :o)


Oh I don't know, perhaps it doesn't include the 4-projector monitoring screen in the basement, relaying real-time hits and anomalous voting patterns.

Ahem.


Application rejected! ; )


How much traffic do you see in a month?


I meant to ask, how much network traffic (bytes) do you see? I guess your hosting provider counts it, if you don't.

And, while we're at it, what hosting provider do you use?


From http://news.ycombinator.com/item?id=516108

Only one server:

    Old: 2.4 GHz Pentium 4, 4 GB RAM, 32-bit FreeBSD 5.3.
    New: 3.0 GHz Core whatever, 12 GB RAM, 64-bit FreeBSD 7.1.
PG: "The new server seems to be about 2x as fast. The frontpage renders for me in about 50 msec. But the site should seem more than 2x faster (for logged-in users) because many requests will terminate before being interrupted. There's now enough memory that we can fit all the links and comments in memory at once again. We should be good for another year or so." (traffic: http://ycombinator.com/newsnews.html#15jan09)

Not sure how or where the server is hosted or what else is used (e.g. router/firewall/bandwidth/UPS/utilities/etc.), but if pg/rtm billed for their time all other costs would be insignificant.

Maintenance:

A lot less than it would take most people, considering it's rtm and pg.


So that brings the hardware costs to about $2000-$3000 per month?

As for man hours, can 2 guys in a basement with nothing else on their schedules maintain and moderate a site like HN?


I don't understand how that makes the hardware cost $2000-$3000 per month.


Whoops, my bad. I read that as 24 GB RAM instead of 12 GB RAM (which made me wonder why they would need so much memory).

The server should come out to a little over $1000 per month then?


I don't think so. You can build out that box for just a few thousand total. Check out System76 if you don't want to build it.


I think he's referring to how much it would cost if you paid for a server like that through a hosting provider.


Well, he shouldn't call it hardware costs then :)

A hosting provider gives you lots of services too.


You could get similar services from a colocation company as well. You could purchase a fully decked 1U server for $2000, then pay $50 a month for colocation (that's what I'm doing).


I suspect your hardware costs are a bit off.

Just as an example, you can rent a Xeon 5570, a rather nice 2.93 GHz machine with 12 GB RAM and 2 TB of monthly bandwidth, for $724 from SoftLayer and others.
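To compare this with the colocation numbers mentioned above ($2000 for the server plus $50 a month), here is a rough Python sketch. It assumes the $724 quote is a monthly rate and that the two machines are comparable, neither of which the thread confirms; the function names are made up for illustration.

```python
# Rough cost comparison: buy + colocate vs. rent a dedicated server.
# Assumptions (not confirmed in the thread): $724 is a monthly rate,
# and both machines are roughly equivalent.

def colo_cost(months, server_price=2000, colo_fee=50):
    """Cumulative cost of buying a server and paying for colocation."""
    return server_price + colo_fee * months

def rental_cost(months, monthly_rate=724):
    """Cumulative cost of renting a dedicated server."""
    return monthly_rate * months

def break_even_month():
    """First month in which renting has cost at least as much as colocating."""
    month = 1
    while rental_cost(month) < colo_cost(month):
        month += 1
    return month

for m in (1, 3, 12):
    print(m, colo_cost(m), rental_cost(m))
print("renting overtakes colo in month", break_even_month())
```

Under those assumptions the purchased server pays for itself within the first quarter, though the rental price also covers bandwidth and support that colocation may bill separately.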


You're right. My math was off. I read it as 24 GB RAM, instead of 12 GB RAM.

I was actually checking out Mediatemple's nitro server. Low on the memory and speed end (2.33 GHz and 8 GB RAM) but it seems to be getting glowing reviews for its customer service.

I'm also going to check out SoftLayer.

Any other suggestions?


I still think it's amazing that we live in a world where we can store 25,769,803,776 bytes on a computer that we entrepreneurs can afford!
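For reference, that byte count works out to 24 GiB, which is the misread figure from earlier in the thread; the 12 GB the server reportedly has would be half of it. A quick check in Python, assuming binary gigabytes (2^30 bytes each):

```python
# Convert binary gigabytes (GiB) to bytes.
GIB = 2 ** 30  # assumption: "GB" in the thread means 2**30 bytes

def gib_to_bytes(gib):
    return gib * GIB

print(f"{gib_to_bytes(24):,}")  # 25,769,803,776
print(f"{gib_to_bytes(12):,}")  # 12,884,901,888
```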

It causes a huge change in the way people look at databases. In the past, you needed huge database servers so that you could keep everything stored on disk and intelligently cached. Today, I'd imagine that 99% of companies' data can be kept entirely in memory.

Stunning, really. It entirely changes how you think about data storage.


> It entirely changes how you think about data storage.

It should entirely change how you think about data storage. What's stunning to me is the number of people who are stuck on "Fully ACID, fully RDBMS or it's crap and you will fail and deservedly so."


It all depends on what your goal is.

In health care applications? In financial applications? I would never trust anything less than fully ACID with all the bells and whistles.

In a website where people aren't going to be that bothered if a few transactions are inconsistent if some major snafu happens? (status updates get lost in the ether, for example.) The cost isn't worth the benefit of high power.

RDBMS is a similar tradeoff.


Wait for widely available SSD storage to truly change databases.


I was actually talking about developer rather than machine overhead.


Don't forget what it took to actually build this community. It's not exactly a fair comparison.


Not to mention (or maybe you are implying) providing compelling reasons to frequent HN.


If Hacker News is anything like my site (both basically need to serve up little bits of text to lots of people), then hosting is cheap.

I bet you could serve 20,000 people a day on an HN-like site for well under $1,000/month on AWS, assuming you are smart about caching and you don't thump the database with every request.

So, even if you were totally clueless about ads and just ran random AdSense, the thing would more than pay for itself.

It's incredibly cheap to run a site that doesn't need much bandwidth. Basically, if you are serving no pictures or media, you should be able to build a website at super-scale in your garage, or using some Cloud Service.


Under $1000 a month? I'd hope so! That would pay for a couple of extra-large instances, which is way overkill for a simple site like this. It could be hosted on a simple VPS for $50 if you wanted to (well, with a different architecture; hosting it all in memory makes a VPS not very efficient).


$50 for 20k users a day? Good luck. I don't think a 512 MB VPS with any architecture could do that.


HN has very little user-specific content in its hot pages. Also, since it's mostly time-based, there's a lot of locality. It should cache like mad.

My guessing math puts it at 300 MB for the hot data set (1000 pages at 100 KB each, 20k users at 10 KB each). With 10 page views per unique, a 90% hit rate on the page/user cache and 20% of page views being writes, I'd guess the average is less than 1 I/O operation a second. Even at high peak-to-average ratios it's likely within what a single SATA disk can do.

I'd say it might be a little tight on a 512 MB VPS, but possible if you optimized carefully. On a reasonable dedicated machine it should run very well without any particular effort paid to optimization beyond basic caching.
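The guess above can be reproduced with a few lines of Python. This is only a sketch of the stated numbers; it additionally assumes 20,000 unique visitors a day (the figure used elsewhere in the thread), decimal units, and that every write and every cache miss on a read costs exactly one disk operation.

```python
# Back-of-envelope working set and disk load for an HN-like site.
# Assumptions: decimal KB/MB, 20,000 uniques/day, and one disk
# operation per write and per read that misses the cache.
KB = 1000
MB = 1000 * KB
SECONDS_PER_DAY = 24 * 60 * 60

def hot_set_bytes(pages=1000, page_kb=100, users=20_000, user_kb=10):
    """Size of the hot data set: cached pages plus per-user records."""
    return pages * page_kb * KB + users * user_kb * KB

def avg_disk_ops_per_sec(uniques=20_000, views_per_unique=10,
                         hit_rate=0.90, write_fraction=0.20):
    """Average disk operations per second over a whole day."""
    views = uniques * views_per_unique
    writes = views * write_fraction
    read_misses = views * (1 - write_fraction) * (1 - hit_rate)
    return (writes + read_misses) / SECONDS_PER_DAY

print(hot_set_bytes() / MB)              # 300.0
print(round(avg_disk_ops_per_sec(), 2))  # 0.65
```

With these inputs the average comes out around 0.65 operations per second, consistent with the "less than 1" estimate.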


Given the "more" links on the front page, I'm guessing HN is using continuations as state storage mechanisms, and that they're stored on a per-user basis. So, I'd expect the site's RAM requirements to be pretty hefty just on that basis, hence the 12GB of RAM.


350k hits per day is only 4 req/s. Assuming there are typical peak hours, I'm guessing it could reach 300 req/s during peak hours. Nginx on a crappy 128MB server can deliver around 8000 req/s on a static page. Even if there was zero caching, I would hope you could manage 300 req/s.

(And yes, I actually tested this on a crappy 128MB VPS. I also tested Rails-, Merb-, and Compojure-generated pages and was able to reach 500-1000 req/s easily.)
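The average-rate arithmetic is easy to check in Python. The sketch below uses the 350k hits per day from pg's comment; the 300 req/s peak is the guess made above, not a measured number, and the helper names are made up for illustration.

```python
# Average request rate implied by a daily hit count, and how far
# above average a guessed peak rate would be.
SECONDS_PER_DAY = 24 * 60 * 60

def average_rps(hits_per_day):
    return hits_per_day / SECONDS_PER_DAY

def peak_to_average(peak_rps, hits_per_day):
    return peak_rps / average_rps(hits_per_day)

print(f"{average_rps(350_000):.2f}")           # 4.05
print(f"{peak_to_average(300, 350_000):.1f}")  # 74.1
```

So a 300 req/s peak would mean traffic spiking to roughly 74 times the daily average, which makes it a very conservative upper bound.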


Why not? That's only a couple page views a second. Around $50/m can get you a gig of RAM. Use some for memcached, spend the rest running Nginx or Lighttpd.


Memcached is probably a little overkill for 5 req/s.


$50 a month will rent you a 2048 MB VPS.


Is that realistic? Running random AdSense + 20,000 people a day = "more than" $1,000 a month?


On a side note, I wouldn't mind if HN placed a tiny link ad somewhere. That would be a good experiment.


Absolutely.

20,000 people a day is definitely worth 35 dollars per day. That would mean, if your ads were impression-based, a CPM of $1.75.

You pay $45 CPM for eyeballs on a site like backpacker.com, which actually has a tight audience, but you're gonna get $1.75 unless your site is just spam.

CPM means cost per thousand eyeballs. I'm talking about impressions here just to simplify things.
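A small Python sketch of that calculation, under the same simplification (one impression per visitor per day) plus an assumed 30-day month; the helper names are hypothetical:

```python
# CPM arithmetic: revenue per thousand impressions.
# Assumptions: one impression per visitor per day, 30-day month.

def cpm(revenue_per_day, impressions_per_day):
    """Dollars earned per 1000 impressions."""
    return revenue_per_day * 1000 / impressions_per_day

def monthly_revenue(cpm_dollars, impressions_per_day, days=30):
    """Revenue over a month at a given CPM and daily impression count."""
    return cpm_dollars * impressions_per_day / 1000 * days

print(cpm(35, 20_000))                # 1.75
print(monthly_revenue(1.75, 20_000))  # 1050.0
```

At $35 a day that is about $1,050 a month, which is where the "more than $1,000" figure comes from.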


The reason I ask is that my site gets about 7,000 unique visitors a day. If 20,000 = $35 a day, 7,000 = $12.25 a day. I make about half that.

I don't recall seeing any articles on HN about advertising options (AdSense vs. AdBrite vs...) or optimization. (Though it's possible I've simply missed them.) Any good resources I should investigate?


What's your website?



It does seem to me like "you're doing it wrong."

I don't know why you are showing AdSense ads for dictionaries on your site. I can't imagine your audience is the sort that buys dictionaries. Given you are a slang dictionary, I imagine your users are young. And most young people get their dictionaries online.

It seems like such an oxymoron that you would have an ad to buy something printed on your dictionary website. Your AdSense literally shows ads for a printed competitor to your site.

It seems to me you have your AdSense set up wrong, or naively. If I go to a page and look up a slang word for something like girlfriend, you should show me ads for dating! When I look something up in your dictionary, then you start to know who I am and you can show me relevant ads.

As it was, all I get is banners for online colleges and slang dictionaries and a Google ad for Ask.com.

You just need a bit better ads... I guess I was wrong about being able to put up random ads, but I think you could make more if you thought about it more.


> You just need a bit better ads.

I'd be quite happy with better ads, but my understanding is that one's control over what AdSense ads display on one's site is limited. Using the "competitive ad filter" you can explicitly block specific ads, but there's no opposite analogue: you can't say what kind of ads you want to run on certain pages.

The ads that run are up to Google's discretion. It does its best to determine what the page is about and shows ads accordingly. Perhaps I need to tweak the template text, meta description, and meta keywords - but again, Google has final control over what ads are run.


I think he just needs to place the ads properly. Those link ads, in my experience, are wonderful.


Do you think the ads are improperly placed?


In my own opinion, yes. If I were the owner of the site, I would use a Link Ad just below "The Online Slang Dictionary" with the same background color and white text color.

And I would add another Link Ad below "Welcome to The Online Slang Dictionary", this time with a white background.

Then I would remove the big vertical ad and place it in a box similar to the "Subscribe to updates" and "Bookmark or share" boxes and label it with maybe "Other Resources" -- after all Google text ads are relevant :P

Just my own 2 cents... :)


Thanks for the feedback!

I actually used to have the big vertical ad ("wide skyscraper") in the right sidebar (where the "Subscribe to updates" etc. boxes are) but it performed terribly. Putting it where it is now increased clickthroughs by something like 10x.

BTW, Google's AdSense terms "[prohibit] placing ads under misleading headings such as 'resources' or 'helpful links.'" https://www.google.com/adsense/support/bin/answer.py?answer=...


This site could easily make $10k/mo+ with ads.


http://ycombinator.com/newsnews.html - over 25,000 ips per day

http://ycombinator.com/images/2yeartraffic.png - Maybe I'm reading this wrong, but does that image show 300,000 pageviews per day?


Sure looks like it to me - the peaks on the graph are too close together to be going by months.


If it were a very cool clustered, load-balanced setup with analog monitoring gauges set up in a nuclear bunker, then I'm guessing we would've heard about it by now :) So it's probably just your good old boring rack server.

Anyone else care to guess?


What time zone is the hosting centre located in? I always wondered about those posting times (x hours ago, y minutes ago, etc).


Time zone doesn't matter. The clock starts from when you submitted an article or posted a comment and it's the same for everyone.

So if you submitted an article 10 min ago, it shows as being posted 10 min ago for you and everyone else.
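To illustrate why the viewer's time zone drops out, here is a minimal Python sketch of a relative-age label. It is not HN's actual code; `age_label` and its rounding rules are assumptions made up for the example.

```python
# Relative ages depend only on elapsed time, not on the viewer's zone.
# age_label is a hypothetical helper, not HN's implementation.
from datetime import datetime, timedelta, timezone

def age_label(posted_at, now):
    """Return a label such as '10 minutes ago' for two aware datetimes."""
    seconds = int((now - posted_at).total_seconds())
    if seconds < 3600:
        count, unit = seconds // 60, "minute"
    elif seconds < 86400:
        count, unit = seconds // 3600, "hour"
    else:
        count, unit = seconds // 86400, "day"
    suffix = "" if count == 1 else "s"
    return f"{count} {unit}{suffix} ago"

posted = datetime(2009, 4, 14, 12, 0, tzinfo=timezone.utc)
now_utc = posted + timedelta(minutes=10)
# Same instant, expressed in a UTC+9 zone.
now_tokyo = now_utc.astimezone(timezone(timedelta(hours=9)))

print(age_label(posted, now_utc))    # 10 minutes ago
print(age_label(posted, now_tokyo))  # 10 minutes ago
```

Both calls give the same label because subtracting two time-zone-aware timestamps yields the same elapsed interval wherever the reader happens to be.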



