Costs of running a Python webapp for 55k monthly users (keepthescore.co)
289 points by caspii on Sept 4, 2020 | 257 comments



It's tough to generalize from things like this because "a webapp for 55k monthly users" can mean drastically different things. A user could be loading a few static pages, or that same user could be opening websocket connections or making requests that require expensive database queries. Your average user could stick around a minute, or an hour. Different apps can easily have per-monthly-user infrastructure costs that are two orders of magnitude different.

For comparison at the other end of the scale:

Facebook said that in the first quarter it spent $3.7 billion on data centers, servers, office buildings and network infrastructure.

https://www.datacenterdynamics.com/en/news/facebook-signific...

That's something like $8 per user per year. If you had that level of expenditure for your 55k-user app, it would cost $400,000 per year, 200x the cost given by this blog post. It isn't that Facebook is wasting the vast majority of their infrastructure, though, it's that the application does a lot more for each user.
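A quick back-of-the-envelope check (the $3.7B quarterly figure is from the linked article; the ~1.8B monthly user count is my rough assumption, chosen to reproduce the ~$8 figure):

```python
# Back-of-the-envelope: Facebook's per-user infrastructure cost.
quarterly_spend = 3.7e9       # Q1 infra spend, from the linked article
monthly_users = 1.8e9         # rough assumption, reproduces the ~$8 figure

per_user_year = quarterly_spend * 4 / monthly_users
app_users = 55_000

print(f"~${per_user_year:.2f}/user/year")
print(f"55k-user app at that rate: ~${app_users * per_user_year:,.0f}/year")
```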


The Facebook comparison is nowhere close, though. Firstly, even purely from a services perspective, the compute needs per user are quite insane if you take a step back and think about what all has to happen in the background. Then you have to think about how much each user uses Facebook and Instagram, not to mention other services like video calls and Live. Further, Facebook also presumably spends a bunch of these resources directly making money via its ad business. And then you have all the research and prototype work Facebook does. Finally, this figure also includes office buildings and actual machine expenditure, which will presumably pay dividends in future years, and which is a factor the OP did not even include in this discussion. I'm fairly confident that for the same number of requests of comparable complexity, Facebook spends significantly less than what this person is spending.


All of this just seems to be agreeing with the parent comment's point that different websites do different amounts of work per user.


It does not just agree; it suggests explanations.

I suspect that one particular aspect of Facebook's operations - the nation-state level of spying on its users and the people they mention - accounts for a significant part of the difference.


This app appears to be dedicated to incrementing counters in a database.

I don't even understand why they need a VPS at all. It seems simple to run off of Firebase and Cloudflare for the UI.


That seems to include all the Facebook services, including Instagram.


That's intriguing. That means that they make significantly more than $8/user/year selling your personal data to advertisers.


It's interesting if they count people that have no account, whose data they also monetize.


$8, just for infrastructure? So their cost per user is even higher?

-> Facebook is making far more than $10 per user? Jesus Christ.


Facebook average revenue per user in 2019 was USD 29.25

Source: https://www.statista.com/statistics/234056/facebooks-average...


> It isn't that Facebook is wasting the vast majority of their infrastructure, though, it's that the application does a lot more for each user.

I don't think we can be sure Facebook isn't wasting money on their infrastructure. They've shown themselves to be pretty wasteful in the past.[0][1] They're just rich enough to get away with it.

Facebook's application may be doing a lot more to their users than a simpler app, but I doubt it's doing that much for users.

[0] https://www.cio.com/article/3197554/why-is-facebooks-ios-app...

[1] https://jaxenter.com/facebooks-completely-insane-dalvik-hack...


But if they can choose between saving 10% on infrastructure or adding 5% ad revenue (through more optimizations), the additional revenue is worth more.

Just running Facebook, Instagram & Co as a platform would probably be much cheaper than $8 per user.


Absolutely. Profligacy is often the rational course in a capitalist system.


In any system, there will be a tension between shrinking costs for a given benefit, and spending more to achieve a greater benefit. This has nothing to do with capitalism. It's a fundamental aspect of systems in which there is a non-fixed relationship between investment and return (i.e. almost all systems of any kind).


The notion that "[financial] investment and [financial] return" are the only two factors worth considering in "all systems of any kind" is an artefact of capitalist thinking.


That notion is absent from both the text and subtext of my comment. You are the one who inserted it. Investment and return come in many forms other than the financial.


That's assuming you have the ad inventory. At some point, reducing costs makes sense.


so true


It always makes my head spin when I read and hear how much people manage to bloat their stack because they think they need to "scale".

I have web applications running that reliably serve 50k users per day and cost me $10/month, running on a single, cheap VPS.


It always amazes me how free people are to judge others' situations.

It's great that your web app only costs $10/month, but others may have web apps that are more computationally intensive (e.g. video processing or ML inference) or that simply can't join everything they need at runtime.

And it's great that you're willing to deny those 50k users a day access to your service when that cheap VPS inevitably falls over. But others may be monetising that traffic and will want an HA solution so their revenue isn't impacted.

All of those add complexity and cost to an architecture.


TekMoi is right. I have an Alexa top 6k site which is vastly more complicated (media hosting, load balancing, multiple VMs, DDOS protection, transactional emails, automated backups) which costs $200 / month on AWS.

The fact that this person is spending nearly as much to support 50k users a day as I do to support more than 4 million cannot be hand-waved away with "people are so free to judge other's[sic] situations". The matter is worsened by the fact that the application is so simple that it doesn't even support user accounts. There is room for discussion here about efficiency in application architecture. More importantly, an article billing itself as "Costs of running a Python webapp for 55k monthly users" is silly because there is no way this is representative of anything. I'm afraid new hackers will be scared off by the high costs listed here and discouraged in their own efforts.


If you support 4 million monthly visitors on a media site, and have multiple EC2 instances running, I'd love to see a cost breakdown structure, because in my (obviously incomplete and possibly naive) calculations, the bandwidth alone would cost more than $200/mo.


CloudFlare covers the media bandwidth costs for a mere $20 / month. The uploading and media conversion is the difficult and costly (in terms of CPU) portion.


>CloudFlare covers the media bandwidth costs for a mere $20 / month.

Your media files must be extremely small.

I'm guessing they're less than 20MB on average, because a) CF hasn't shown you the door yet b) they don't even cache anything bigger than half a gigabyte.


Also to be clear I'm talking about image uploads, not video. Still more complicated than this app.


Similar situation (sub-$1k/month) to the setup I'm doing for a startup in Indonesia (top 7k in Alexa, top 150 in Indonesia), except we have to pay extra for media because we need to process the logs (long-tail pdf/data/docx hits), and we need to control the DNS and how other domains route to us, so we can't use Cloudflare. And still plenty of room to cost-optimize.

Another consultant came in and tried to do this, and made our costs go up by 10x per month… so I'm not surprised when I see stuff like this here…

The knowledge of one's tools available at one's fingertips, and the relative costs of such, seem to make the difference for these things.


If $171/month is discouraging, let me reassure them.

There’s statistically a 95-99%+ chance you’ll never get to 55k monthly users with your app so don’t worry!


Let's leave this kind of snark out of this community. It's comments like this that slippery slope a community from helpful to harmful. You see it each time a reddit community gets too large.


I don't think it's snark. It's: don't worry about optimization too soon.


No, what's harmful is "oh, just spend $50,000 on managed Kubernetes to run a Django web app". That costs real time and real money and makes young engineers think that a phpBB forum is impossible without a five-digit AWS bill.


Both you and TekMoi should probably value your own time higher. The cost savings are great, but once you spend 4 hours on configuring a database server, that'll probably be $200+ worth of your time and, thus, wipe out most of the savings.


It doesn’t cost anything to spend 4 hours of your time doing anything, so it doesn’t really wipe out any savings. Reducing a bill from $400/month to $200 is real money, not some theoretical time/value judgement.

And, as others have mentioned, there is enormous value in knowing how to operate on a fairly lean tech stack. It makes it so much simpler to scale effectively while keeping costs down.


This is only true if you do not value your free time. In your example, you've spent 4h to save $200, so your work was worth $50/hour. A freelancer with a $100/hour rate might do 2 hours of work instead, spend the money, and gain 2 hours of free time.

There are other factors of course, but in general, many people come up with a rate for their own free time (which is often higher than the actual rate they charge clients).


It’s not only true if you don’t value your free time. It’s only true if you’re a freelancer who is turning down hours at a higher rate than what you’re saving.

Many people working on products, both on their own and within a company, aren’t turning down other profitable work to optimize existing solutions.

If I watch a two hour movie instead of spending two hours saving $100/month, it doesn’t matter how much I value my time, no one is paying me $100/hour to watch a movie.


There's a difference between the initial example and watching a movie. That's what I summed up with "there are other factors (than money)". If I'm only interested in the _outcome_ and I treat the way to achieve it as work, I definitely weight the time cost vs benefit (and I am not a freelancer). If it is something I enjoy doing (which may or may not be true for the initial example), I'll take this into account as well. Time is a limited resource, and I treat it as such.


Except that now you know how to configure a database server, and know how your database server is configured. So if you ever get a problem with the database in the future, you'll know how to solve it faster. Which will save you time and money in the future.

And you're less likely to make mistakes like upgrading your server instance to try and solve a problem that can't be solved like that.

And the more you do it, the cheaper and faster it gets. That knowledge and skill has value.


I value four hours of my time at considerably more than $200 and the reason I'm able to do that is because I know things like how to configure a database server.

But your argument is nonsensical because a one-time investment of time (much less than four hours for me, but I've been doing this for 20 years) can save you several hundred dollars a month. AWS AuroraDB, for instance, (which I also know how to configure, by the way) has much higher latency than a hand-rolled instance and will cause bottlenecks throughout your code in a DB-driven application. If I hadn't experienced the difference firsthand, or had failed to profile my app's performance adequately, I might assume I need to solve the problem by spinning up more ec2 instances to distribute the load. I've had the misfortune of working with a company that had exactly that problem and knowing how to spin up a new DB server saved the company thousands of dollars a month and took considerably less than four hours. Transferring a 3TB database to a new server without downtime did take considerably longer, however, but I was being paid hourly anyway, and it was still a worthwhile investment for the company which saved considerably more than my fee.

Any tradesperson should know their tools. A programmer is no different, and if you don't know how to use your tools because you "value your own time higher" then thank you: you're the guy who ends up getting me called in to fix things at a much higher hourly fee.


the main issue i have with the DIY mentality is that, for me, it's endless, and even more so, it's an unpaved path, ad-hoc, with few well-defined, integrated paths. we've got lots of open source software, but the ops of being online is hard-fought experience.

* `apt-get install postgresql` would have worked fine for my needs on my VPS. oh but i need roles, so let's start an ansible playbook. maybe let's tweak some settings. fine, still all short, easy to do.

* ahhh i should probably have backups. how am i going to manage that storage, where is that going to live?

* then i introduce a new feature & my database is running slow. explain query helps, but i also could use some metrics for these boxes, so probably need to start thinking about prometheus & node-exporter, &c.

i am radically in favor of a) personally facing these challenges and b) open-sourcing the operational knowledge & tools for setting up AND OPERATING systems.

yet at the same time i also think spending $171/mo for a year is an exceedingly wonderful option to have on the table. running my own servers is, to me, a lifelong project, something i want to deeply invest in. there's plenty of ways to go about it that aren't so arduous (k8s+postgres-operator+rook+tbd monitoring+tbd directory-services), but that willingness to keep engaging, supporting, maintaining, scaling things can be a very serious concern that extends well past the time it takes to set a database up: it's an ongoing "giving a shit" burden even when (seemingly) working fine.

being willing and able to hack through is great, and i am all for the coalition of the willing who elect to march through, hopefully not getting bogged down along the way. but wow if you are trying to start a business, it sure is nice being able to pay someone to spin up, back up, monitor, scale some services for you.

i hope some day "we" are better at such things, systematically. i hope open source ops helps give us better paths to doing these kinds of things easily, safely, observably, resiliently. we're not there yet. but wow, this challenge to me- how we move open source from an older "software" model to an online service model, that empowers people to set up online systems as easily as opening an editor- that's the challenge at the heart of open source today. it's one that needs a lot more effort, a lot more work, such that we have good ways to stand up & keep up a database server.


Where and how did you acquire this knowledge?


>$200+ worth of your time and

Yeah, from your perspective. For others, monthly savings of $10 is a lot, and not everyone earns $200 for 4 hours of work.


For OP it might be a good idea though, given that the infrastructure is their biggest cost and they have no revenue.


Being well versed in setting up your own database server, bereft of cloud-provider hand-holding, easily pays for the initial time investment over time. I don't think most understand just how much value a mastery of the basics is capable of generating when you're essentially vendor-lock-in-proof. There is so much blindness to the voluntary hanging of one's arse out the window created by vendor overreliance.


Yes, without clear leadership, it is easy for a dev team to flounder around in a cloud provider’s offerings.


It's 50k MONTHLY users. It's just 3000 a day


It says in the article that the author gets 34k daily users, and 50k unique users/month. It would have been clearer if the author had talked about sessions (which are therefore > 1M/mo) for sure, but you're still making a very big (and invalid) assumption.

EDIT: Please disregard the above. I need an eye test, or maybe just to put my glasses on! Daily users are 3.4k (3400), not 34k. My apologies, I take it all back!


It's 3.4k (3400) daily users, not 34k (34000). Don't worry, I nearly missed the dot at first as well :)


I'm such a klutz - sorry. This is what I get for reading HN when I'm still in bed and without putting my glasses on first. This is a terrible habit that I need to break.


Username now in question


Tough to argue against that under the circumstances.


And 2.5 per minute


Or 0 per minute and 25,000 per hour for two days a month. Traffic can be bursty; don't assume that X/month means they're getting exactly X/30 per day.


What resources would you suggest to someone who has to set up his servers properly? I would really appreciate it if you could recommend some books/videos/articles. Thanks.


If you're talking about serving many requests cost-effectively, then really the problem is not the servers (except over-provisioning, which is rampant; learn to use tools like AWS's auto-scaling system instead). It's the code.

If you understand the basics of algorithmic time complexity (that's your Big-O notation) and profiling your code then you're ahead of 98% of other developers in practice. I'm constantly amazed at how many developers think adding more libraries, newer frameworks, or more layers of tooling will magically speed up their code because "it's so fast". If you actually time things you'll find out doing it the "slow" way is frequently an order of magnitude faster.
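Putting numbers on that takes a few lines of standard library; a minimal sketch (the `handler` function is a hypothetical stand-in for a slow request path, not anything from the article):

```python
import cProfile
import io
import pstats

def handler():
    # Hypothetical stand-in for a request handler you suspect is slow.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
handler()
profiler.disable()

# Print the 5 most expensive calls by cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

The profile output names the functions where time actually goes, which is usually more informative than guessing about frameworks.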


Wow! Congrats, and what is your site? Would love to read more about it.


To be fair, the post shows that about 2/3rds of that spend is from the decision to run twice the needed capacity to be able to do blue-green deployments, and to cloud-host their Metabase analytics.

And it explains there's currently zero revenue.

So it seems fair to judge the situation there based on those pieces of information we've been given.

As I posted elsewhere, the OP's choice to run dual redundant blue/green-capable instances and cloud-hosted Metabase might have good reasons, but right now those reasons are not "wanting an HA solution so revenue isn't impacted"...


That's not how green/blue deployments work. You don't keep both colors up unless you have completely failed to understand the concept.

Green/Blue is all about saving resources and costs, not keeping them around. You misread the cause here. It has nothing to do with deployment strategies.


That's not how I read what the OP's doing in the article. Sure, maybe he's not doing "proper blue/green", but that is what he uses to explain running a duplicated pair of web/app servers full time...


I have a web app that's struggling if there are more than 2 concurrent users per CPU core. It's displaying incompressible large-resolution images with <100ms latency.

EDIT: Not sure why I'm downvoted; I'm just presenting my use case. They are multi-gigabyte images encoded with custom wavelet compression that are cut into tiles (think Google Earth), and each user needs 5-10 tiles every second.


My guess would be that your webapp is serving images in a blocking fashion, meaning every time a user requests an image it will fetch the image for the user AND block the HTTP serving thread until the image is delivered. Can you provide more context (e.g. tech stack)?
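If it is blocking, the payoff from serving tiles concurrently is easy to sketch; this toy uses asyncio (an illustrative assumption, not the poster's actual stack) with `asyncio.sleep` standing in for the real tile I/O:

```python
import asyncio
import time

async def fetch_tile(tile_id: int) -> str:
    # Stand-in for ~100ms of I/O per tile (disk read, decode, send).
    await asyncio.sleep(0.1)
    return f"tile-{tile_id}"

async def serve_request() -> list:
    # Non-blocking handler: fetch all 10 tiles concurrently.
    return await asyncio.gather(*(fetch_tile(i) for i in range(10)))

start = time.perf_counter()
tiles = asyncio.run(serve_request())
elapsed = time.perf_counter() - start
print(f"{len(tiles)} tiles in {elapsed:.2f}s")  # ~0.1s concurrent vs ~1s serial
```

A blocking server would pay the 10 × 100ms serially per request, which matches the "2 concurrent users per core" ceiling described above.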


What's blocking is the number of HTTP connections between browser and server. Most browsers only allow 6, and each tile takes about 100ms, so getting 10 in under a second doesn't always happen.


But this by itself wouldn't cause the "struggling" you mention with more than 2 concurrent users per core.


Do you know if the old OpenStreetMap trick of having multiple tile servers (that are each just aliases of the same server) still works? I think this was how they tried to circumvent the 6-connection limit.


That's no longer necessary thanks to HTTP/2.

Your webserver still has to spin off a thread for each request if you want to do substantial CPU work per request, but rest assured you'll get all the requests at once from the browser, not 6 at a time like in the dark ages.

The RFC recommends at least 100 streams. See SETTINGS_MAX_CONCURRENT_STREAMS: https://tools.ietf.org/html/rfc7540#section-6.5.2


Have you tried combining the tiles on-the-fly as image sprites?

https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Images/...


That's a lot of CPU usage per user! Is it doing some super-intensive process to generate the images on the fly?


Is it doing anything weird or fancy with those images?

If you're just serving them (eg no image manipulation), that sounds like there's a problem somewhere.


What HTTP server do you use? Are they static images?


The article does not mention video processing or ML. It describes the setup for what the author calls a generic web app.


If you want to provide uninterrupted service to your clients you’ll have to spend some $. You want to have redundancy, machines hosted in different locations, backup prod servers, monitoring, analysis tools. Even if it is for 1k monthly users, if you want reliability, it will increase the costs.


I beg to differ.

In my experience, the complicated setups that are justified by the argument of "reliability" have more downtime than a single VPS. The reason is probably that there are more moving parts, so more has to be maintained / can go wrong.

These days, a single VPS in the right datacenter has excellent uptime.


Agreed. I'll probably be downvoted, but these setups strike me as coming from people who prefer to drink the Kool-Aid rather than be pragmatic and only use what they need.

I've also had very high reliability rates with a single VPS. They've actually given me less downtime than AWS services at times.


At work, I aim for four nines. (We put three nines in the legal paperwork).

I can't hit four nines reliably with single VPS platforms on my typical workloads, I need load balancers and redundant app servers. I could quite likely hit three nines using single VPSes. But if a client wants 99.9% SLAs, they'll be paying for HA and I'll deploy redundant ec2 instances, multi region RDS, and an ELB. And charge them 3 or 4 times what the OP is spending for it. (And I'll almost always deliver 99.99% availability.)

For my stuff, or friends, or people I'm doing cost-saving favours for, I'll explain how much extra it costs to guarantee less than an hour of downtime a month, plus the realistic expectations and historical experience of how much downtime a non-HA platform might have in their use case, and often choose along with them a single VPS (or even dirt-cheap cPanel hosting) while understanding and accepting the risks associated with saving upwards of a couple of hundred bucks per month.
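For anyone wondering what those SLA numbers actually translate to, the downtime budgets are simple arithmetic:

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200

def downtime_budget(availability: float) -> float:
    """Allowed downtime in minutes per month at a given availability."""
    return MINUTES_PER_MONTH * (1 - availability)

for nines in (0.999, 0.9999):
    print(f"{nines:.2%} -> {downtime_budget(nines):.1f} min/month")
# Three nines allows ~43 min/month; four nines only ~4.3 min/month.
```

That factor of ten between three and four nines is what the load balancers and redundant app servers are buying.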


I think ec2 gives 99.99% availability in their SLA, no need to scale across regions or even AZs. Multi AZ RDS is 99.95%. We have a simple ELB/EC2/RDS/S3 stack on us-east-1 and need high availability for a very small amount of users and run very cheap.


A single-VPS setup might be OK for serving content over the web, but in my experience the pain begins when your software starts doing async processing: long-running cron jobs, queue processing. If you're doing it on your web server machine, there will be downtime.

I know this, because I have gone through these issues with each of my projects. Just recently an infinite loop bug in a cron job ground my "single VPS" setup to a halt (and took the web server with it).


I have a VPS that has not gone down in 5 years.

It still has a redundant slave because I'm not going to bet my reputation on everything going right.


Me too, with the cheapest Kimsufi server from OVH (something like 3€ a month).

To be fair, I can't be sure, because a less-than-5-minutes downtime would probably go unnoticed, but the fact is I never hear about this server.


> I beg to differ.

> In my experience, the complicated setups that are justified by the argument of "reliability" have more downtime then a single VPS. The reason is probably that there are more moving parts and more has to be maintained / can go wrong.

> These days, a single VPS in the right datacenter has excellent uptime.

Again, maybe in your experience but that's not universal. There's literally no redundancy with running everything off a single VPS and if that datacenter has network or hardware problems, then your service is down.

Is redundancy necessary for the scale of OP's app, considering it provides 0 income? Most likely not, but that's a decision they've made, and there's nothing wrong with that.

What does excellent uptime mean in your book? With Digital Ocean's AMS2 region I had regular downtime every few weeks, and while I'm alright with it, if I had another VPS in another datacenter it would've had next to no effect on the customer experience. But an hour or more of downtime every two weeks isn't excellent.


If Digital Ocean is giving you an hour of downtime every two weeks (two 9s and a 7) then they're breaking their SLA (four 9s).


    What does excellent uptime
    mean in your book?
Something like less than an hour of downtime per year.

Over the last years, across multiple datacenters, I have seen maybe one or two short downtimes per year, none ever lasting more than 5 minutes.


https://aws.amazon.com/message/41926/ - this lasted hours and affected almost everyone using us-east-1; a large portion of the internet was unavailable because they had no multi-region setups.


That was a "S3 Service Disruption". A perfect example of a problem you do not have with a single VPS setup.


Two Hetzner CPX31 boxes sounds like it'd do just fine here too, providing the redundancy you mention for a fraction of the cost. Or get the boxes from different companies, for the same sort of overall price.

Yes some of the other tools could arguably be worth paying for, but if the author's concern is that he's short on money and $140 is a lot, why didn't they KISS and only use what they need? Then scale as and when needed in the future.

Y'know, do what HN regularly preaches?


My guess is he's doing resume engineering.

And $140/month is pretty good value there probably... Even if that's just being able to point potential employers/recruiters at this blog post as evidence of experience building and running an HA website with more-advanced-than-free-Google-Analytics user behaviour tracking.


I was just thinking along similar lines.

Article mentions in a few places that money could be saved - but lowering the cost doesn't seem to be the driver.

If you're building the site as a hobby/practice/demo, it makes sense to "build it properly" - and it's fun.


If your 55k MAU want uninterrupted service, they need to be paying for it (in dollars or monetisable attention and/or privacy).

On a site currently generating zero revenue, I hope the OP is happily enough paying most of that $145/month as a learning experience or for resume bullet points (which are perfectly valid ways to spend your money). They've admitted elsewhere in the comments that the two $40/month droplets are way oversized (from an attempt to solve a problem that turned out not to be droplet size/resource related), so without redundancy and without AWS-hosted Metabase, this would be about $100/month less expensive to run.

I still think that's over-provisioned or under-engineered. Like others have commented, I'd be surprised if the features you can see on the site require any more than the $15/month the FAQ claims it costs to run, plus perhaps the $10/month Disqus expenses. That seems about where a hobby/side-gig project should sit for a lot of devs before you start thinking about how to make it pay for itself... YMMV, especially if you're not already comfortably earning at least a junior dev salary in some reasonably well-paying part of the world.


I get what you're saying, but I think you're being a bit harsh.

In my mind, choosing a dual redundant prod platform so you can do blue/green deployments is _totally_ unnecessary for a 55k MAU site generating no revenue. Same with cloud hosted metabase. You could run that on your own hardware - a spare laptop or probably even a raspberry pi for effectively nothing.

On the other hand, a hobby project/side gig where you can demonstrate real world experience in those two things could _easily_ pay off in the first week of a new job it helps you land.

If it's _just_ that extra $100/month they're spending there, it seems difficult to justify. If it's commercial experience doing that which is helping him try and land a job paying 20k or more extra per year? That's totally money well spent, in my opinion...


Given that he's complaining about the price, I don't think the wasteful costs here are justified. I'll also note the article was just edited and the costs are now up to $171/month.

This app could easily be run on the AWS free tier, and even in the paid tier he could probably be managing a lot more than this workload for under $40/month. (That's for two servers, which, as has been pointed out, is wholly unnecessary for an app of this size.) The price he's paying is presently listed at $95.


Is he complaining much about the cost?

Maybe I read his conclusion differently, but he said "would be peanuts" about the cost and "The bigger issue is that on the revenue side there’s a big fat zero."

Seems to me he's acknowledging he's built a thing that requires generating revenue to support itself, but that he's neglected the revenue generation part of his project, rather than complaining too much about the price of running it?

I'm mostly agreeing with you (and tristanperry and TekMol), but I'm probably being more sympathetic and ascribing unsupported motivations for why he's happy enough building it this way and spending this much money to run it. (Probably because I've been there before myself, and sometimes that expensive hobby project has paid off, sometimes it hasn't. I've never spent so much that I've seriously regretted any of my failures, though...)


If we were on a site called "CV News" that teaches people how to get jobs in dull corporations, I might agree with you. On the other hand, I would not frequent such a site, so we would never have had this discussion.


You can apply the skills that you learn in any job, not just the dull ones. The skills that you learn by implementing this kind of setup are valuable to lots of interesting jobs I'd say.


Yeah fair cop...


Not to sound mean, but if there aren't posts/blogs/whatever explicitly telling people how to minimize costs... then they'll continue to follow the ones that make them pay $100+.


Don't do blue-green deploys when you have no revenue. Downtime costs you nothing and at 3,400 visitors a day you'll be lucky to drop a single request while deploying.

Pre-compute and cache to reduce your need for beefy servers.

When you can run something on your machine or in the cloud, choose your machine.

EDIT: It's less about saving money and more about not spending it.

EDIT2: Forgot to mention: use SQLite.
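The "pre-compute and cache" advice can be as simple as memoising hot read paths in-process; a sketch with the standard library (`render_scoreboard` is a hypothetical example function, not from the app):

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def render_scoreboard(board_id: int) -> str:
    # Hypothetical expensive render; in a real app this would hit the DB.
    return f"<table>board {board_id}</table>"

render_scoreboard(7)   # computed once (a miss)
render_scoreboard(7)   # served from cache (a hit)
print(render_scoreboard.cache_info())  # hits=1, misses=1
```

The catch is invalidation: you'd call `render_scoreboard.cache_clear()` (or key the cache on a version counter) whenever a score changes.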


Caching is, in general, a good thing. But it's worth thinking about how you do it. I recently sped up a large system by dropping the use of redis entirely - because it was being badly used.

During a single request there might be 50+ cache-lookups, each taking a round-trip to a remote redis server to fetch a single key at a time. Batching those up to a set/hash would have been more efficient, but the codebase had evolved in such a way as to make that difficult.

Instead of making 50+ redis fetches it turned out that just fetching all the stuff from the database was faster.

(There will be refactoring to batch up the key fetches, but for the moment there was a measurable increase in performance under current loads just by removing redis.)
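For anyone planning that refactor: with redis-py the fix is to collapse the N GETs into one MGET (or a pipeline). Sketched here against a dict-backed stub so it runs standalone - `FakeRedis` and the key names are stand-ins, and the stub just counts round-trips to show the difference:

```python
class FakeRedis:
    """Dict-backed stand-in for redis.Redis, enough for this sketch."""
    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1   # one network round-trip per GET
        return self.data.get(key)

    def mget(self, keys):
        self.round_trips += 1   # one round-trip for the whole batch
        return [self.data.get(k) for k in keys]

def fetch_one_by_one(r, keys):
    # The slow pattern: 50+ sequential round-trips per request.
    return {k: r.get(k) for k in keys}

def fetch_batched(r, keys):
    # The fix: a single MGET (a redis pipeline of GETs behaves the same way).
    return dict(zip(keys, r.mget(keys)))
```

Same data back, but latency is paid once instead of 50 times - which is exactly why the database round-trip ended up beating the "cache".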


Couldn't you have cached locally using something like ehcache?


Perhaps, yes. It was the latency of the network calls that was killing performance, so something local, or even redis on localhost, would have been better.


Why use SQLite?

My postgres process doesn't come close to using up enough resources to push me out of even the cheap VPS tiers and I don't have to worry about locking if there's a heavy write load.


Backing up an SQLite database is as simple as copying a single file. As long as nothing is writing in the file, you're good.
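Worth noting that "as long as nothing is writing" is the catch: a plain file copy taken mid-write can be corrupt. SQLite's online backup API (exposed in Python's stdlib since 3.7) takes a consistent snapshot even while another connection is writing - a small sketch:

```python
import sqlite3

def backup_sqlite(src_path, dest_path):
    """Snapshot a live SQLite database safely via the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # consistent even if another connection is writing
    finally:
        dest.close()
        src.close()
```

Drop that in a nightly cron entry and you keep the copy-a-single-file simplicity without the mid-write corruption risk.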


That's not a reason.

Plus setting up a nightly backup of any SQL database, regardless of creed, is like a 10-line cron job.


Some developers have never written a cron job.

I think the point here is, SQLite setup would provide satisfactory results at this scale.

If the choice is between a PaaS database offering and SQLite, you can pick SQLite. If you have the skills / are prepared to manage a DB server yourself, then do that.


Yeah, fair point. From a cost standpoint it's more about using the same server for your application and your database.


SQLite really is a bit of a hidden gem.


Not really hidden when it is one of the most widely deployed pieces of software on the planet.


Sure, but as an underlying dependency for iOS/Android, it's pretty well hidden from casual users...


That's true for even a casual website user. They have no idea or care about the underlying database.

Every app developer is aware about SQLite.


I guess not in absolute terms, but it’s mostly overlooked for web apps.


Ok, here is a quick tutorial:

Step 1:

Copy Pieter Levels' stack:

https://www.nocsdegree.com/pieter-levels-learn-coding/

If he serves hundreds of thousands of monthly users and makes a million a month with a single VPS, you certainly won't have scaling issues when you start out or just have tens of thousands of users.

Step 2:

If you really scale beyond that, resist the urge to bloat your stack. Think long and hard about every piece you add to it. Really understand each piece you add to it. Don't fall into the trap of paid services. Don't fall into the trap of "best practices".


Haha, the man is clearly a beast of a product designer and executor. I don't think I would succeed with the same tools.

Honestly, spending $200/mo is insignificant to me. And I'm pretty happy to answer the question of "Why can this guy build this thing on a single VPS and you can't?" with "Well, because he's better than me".

I can be up in 15 mins on Heroku with a Rails+React web app. And in an hour have a thing. Or for a static no-login thing, faster with Netlify.

But it doesn't matter. Because I never made a product as nice as the one he made. If the outcome is I'm -$200/mo that is irrelevant to me. If the outcome is I'm +$50k/mo that is very relevant. So I'm going to optimize for how I can do the latter.


I don’t know how many times I have seen “best practices” used as an excuse to avoid thinking. Usually by people who don’t even have the problem that the supposed “best practices” are meant to solve.


While this article has steps in the right direction, it's a lot of "use this thing". I operate a simple Flask/PostgreSQL webapp for CS education through the luxury of my university. If/when I graduate from my program, what can I do to minimize my hosting costs for the app? Of course I'll look for the best approaches, but why does that need to be forbidden knowledge known to a select few?

It's an aside, but the mentality of "let them figure it out" is a major issue in education. Foundational knowledge should be easy to acquire so I can worry about higher-level thinking issues. Literally spending hours of Googling to figure out how to set things up doesn't really help that, nor does it promote the "figuring it out" people think it does - it's just stumbling upon the right set of commands that lets me move past this particular hurdle.

From the devops perspective, what is so taxing about telling someone how to set up their own server to do/minimize X?


I run a cheap website and here is what I do

- cloudflare free tier for caching, DNS, page rules, etc

- run everything on one VPS(digital ocean, linode, etc) pick cheapest that has specs you need

- any non-trivial storage (media, big files): move it to Backblaze B2 - it's cheap (you can use free-tier Cloudflare Workers to redirect to B2 for free bandwidth thanks to the Bandwidth Alliance)

- free static page from Netlify (I can redirect to this with Cloudflare in case my VPS falls over or something, to provide info/links)

- If I want to look at logs or something, I rsync them to my local machine (if I cared, I could set up a process to push logs/backups etc. to a private B2 bucket)

You may not need exact same setup, I am optimizing for caching and cheap storage because my site stores/serves lots of media files.


>Literally spending hours of Googling to figure out how to set things up doesn't really help that, nor does it promote the "figuring it out" people think it does - it's just stumbling upon the right set of commands that lets me move past this particular hurdle.

That's basically all of software development for your entire career. I've never had a day or a week not like that.


That is true, and I recognize the value of being a proficient Googler; however, my concerns stem from the fact that many of those skills are not taught at all, or are expected to be learned in situ through programming assignments.

Here are examples of what I mean:

- An undergraduate networking/security course may not provide practice on appropriately salting passwords. It is merely discussed as part of some larger conceptual model. Students are browbeaten in earlier courses not to simply copy/paste code they find on the internet.

- Debugging practice has to come from the student's own generated code, but if they made a mistake, they already are showing they do not fully grasp the material. There have been efforts to explicitly train debugging [1] but they are still in early stages of researching their benefits.

[1] The Code Mangler - https://dl.acm.org/doi/pdf/10.1145/3017680.3017704


Yeah there should be debugging classes, googling classes, and how to orient yourself in a massive, existing codebase classes.


I don't think they need to be explicit courses, since that means adding credit hours and charging students more. Rather, my research is about providing exercises specifically targeting those lower-level skills. Since many of them are only a fraction of the "programming problem", they do not require the expected hours and can be completed quickly. My hypothesis is that doing these types of problems will help reduce the time on task for coding.


If I have a "million a month" business (which is coming from rich recruiting fees on a trivial website, not amazing tech), no way am I sweating $171/mo in server bills.


> PHP (He doesn't use any frameworks like Laravel)

I wouldn't want to read (or write) that code, but thanks for the article.


Well, aren't you special. How about staying on topic and not dropping in this, what, developer virtue signaling?


> staying on topic

A major subtopic of this thread is "different situations call for different setups."

Saying "I won't work with PHP" is like saying "Gluten will make me shit my brains out." It is not virtue signalling.


Or people could do some quick research into what options there are. Not everything needs to be spoon-fed.


Basically, most people don't do that to save costs, but just because it makes sense to them. And there are lots of tutorials explaining how to run your own webapp.


4+k daily visitors on one of the biggest fan sites for a popular mobile game, and I'm still running it with sqlite as DB backend and no caching on a shared VPS.


The additional expense in the OP has nothing to do with scale; it arises from redundancy, analytics, and off-the-shelf integrations.

It is a common misunderstanding amongst software engineers that infrastructure serves "performance". Most of the complexity comes from redundancy and analytics (realtime analytics especially).


Right, but how much do these nice-to-haves add to the revenue? Is it truly worth it to bake them all in from the very beginning? E.g. five minutes of downtime in morning hours will affect only a handful of users, and if so, why bother with blue-green deployments other than out of professional interest?


This. Before opening a link, I thought to myself, "$5/month ?", because that's what our Django websites with such load cost.


I've added this paragraph to the article:

The servers are oversized for the load we're currently seeing. The reason is that we tried to solve a production issue by increasing the server specs. It didn't solve the problem, and now we can't down-size the servers without re-provisioning them.


I have found the same with DigitalOcean - it is hard to downsize, but only because I have so much data that a move has become very painful, so I'm stuck paying 2x the storage cost.

You should really learn to provision new servers and remove the old ones. Your app seems a perfect fit for moving between servers.


But as they are redundant this should be a no-brainer?


What's the problem with reprovisioning? Don't you have two that you're switching between?


User count matters less than the amount of data and the amount of processing power required per user. Web servers are extremely efficient, so if most of that occurs within the context of requests, it's easy.


Depends on the type of web app. https://anim8.io - a site for hobby animators - could never run on a $10 VPS.

The bandwidth usage is just way too high and video compression requires GPU compute.


The videos on that site aren't hosted in-house or first-party. Definitely doable.


Dev time is more expensive than server time.

If my dev uses a slow stack or doesn't optimize v1, maybe I spent $100/mo instead of $10/mo.

Hell, let's say I'm spending $500/mo.

My mixed rate is ~$75/hr for dev time, so a week of dev time is equivalent to 6 months of hosting.

If optimizing (or using a difficult stack) takes one dev an extra week, then I'd better save $3,000 of hosting.


What's your stack?


I don’t understand some of the comments quibbling over how this person has a “bloated” stack because it costs more than whatever irrationally frugal setup they’ve created for an app that does X specific thing.

Here’s a bigger point:

If you’ve got 55,000 monthly active users, and $171 per month doesn’t feel like a rounding error to you, then what you made isn’t actually valuable.

Spending any amount of hours to reduce that already small number is a giant waste of time for anybody who has that level of traction.


This is the opening line of the article:

How much does running a webapp in production actually cost? Maybe more than you think.

So they seem to be trying to "educate" people, when they are wrong and basically giving away a lot of money to Amazon and DO.

And that's why everyone's chiming in with "lots of unnecessary YAGNI crap you got running there!".


You're assuming that the goal is to make something valuable.

There are forums with more traffic than this, which don't make money. They're hobby projects.


Making something valuable is not the same as making money. Maybe you just want to feel that you're doing something good for the community.

However, money is the universal way to measure value. And if you think that you create an enormous value for the community, and it's something that is important to you personally, paying a few hundred bucks a month for it should be a no-brainer.


You mean like how repairing roads should be a no brainer, but no-one wants to pay for it?


Even here in Russia, which is famous for its awful roads, there are plenty of private roads - both closed to the public and operated for-profit with toll booths - that are in great condition. Tragedy of the commons doesn't happen when things actually belong to someone.


If I had a penny for every time people write blanket statements of this nature, I'd be a wealthy man.

The list of privatised institutions that perform worse than their publicly owned equivalents in other nations is very long.

Passenger rail in the US and UK is privately owned and terrible compared to the French and Spanish systems. US broadband is quite poor. US healthcare is much worse value. Private prisons - I don't even know what they were thinking.

One must carefully assess whether a market-based solution to a given problem is suitable, or whether the potential for abuse and natural monopoly is just too great.


Amtrak is owned by the US government.

US broadband is run by local monopolies established by government action.

US Healthcare is heavily regulated, and grew massively expensive only after regulations were added, and preconditions were excluded.

All US prisons have problems.

You have any real examples?


> Amtrak is owned by the US government.

Irrelevant. The actual railroads are private, and Amtrak gets shafted. Look at the state of US passenger railways, and compare them to trains in France or Spain. All countries with successful passenger rail have them public.

You will find the same pattern for all the examples given.


So your only example of the ills of privatization, is a company that wasn’t ever really privatized?


I can't help you if you intentionally misconstrue everything I write.

If you wish to continue believing that privatisation can never have negative results, please do. But even a casual reading of history or economics will demonstrate that's wrong.


I’ve done far more than a casual reading of economics, and since you can’t provide a single example of the ills of privatization I suspect your reading barely reached casual levels.


If you are going to taunt me, put in some effort and bring something to the table. Present your point of view and contribute to the conversation instead of just criticising and nitpicking.

You are just asking me to google things for you.

With your great knowledge of economics I am sure you can summarise the history of privatisation far better than I ever could, so please share your wisdom with us.


You made your claims, I clearly debunked them. I decline to spend my time to continue your free education any further, unless you put the effort in.


You demonstrate hypocrisy and blatant disrespect for your interlocutor.

Your 'debunking' is low-effort criticism. You've contributed nothing factual or informative. You don't even bother making claims, or answering any of my direct questions.


You claimed Amtrak is an example of the ills of privatization, when it’s owned by the federal government. And attributed the effects of strong government regulation in the healthcare and broadband markets to “privatization”.

My criticism is only low effort because you know so little of the subject that your world view is built on obvious misinformation.


Your criticism is so low-effort you don't even bother reading - I never mentioned Amtrak. You did not bother to mention structural differences between the railway networks in France and the UK, to give any example of successful private passenger rail, to deal with my claim that all successful mass transit is public, or to comment on the fact that the UK privatised national rail but it went bankrupt and the public had to pick up the tab. And that despite their 'private' nature, we still provide subsidies and build infrastructure.

Maybe you should spend less time judging the worldviews of others and more time contributing something of value to the discussion.


Amtrak is the only significant passenger train service in the US. And you didn't mention that air travel is far faster and as cheap as rail, which makes a huge difference in a large country like the US compared to the dense little city-states of Europe. Air travel has been replacing train travel, even in Europe, for over 70 years.

And privatization isn't a guarantee against bankruptcy, which is a useful part of the free market. UK rail was bankrupt as a public entity, forcing taxpayers to pay the bill. Privatization successfully increased ridership and customer service. And it's still privatized - Railtrack was only one of the group of companies created by privatization.

So again, do you have any real example that shows privatization doesn’t work?


You have gone from claiming Amtrak is public, to private, then back and forth again. You seem to think this half-assed service proves whatever point you want to make whenever convenient.

"You didn't mention that air travel is far faster"? - Oh, really?

You've done nothing to address the question I've asked three times - show me a privately owned railway or mass transit system that performs as well as the rail networks in France or Spain. I am done here; please troll elsewhere.


People do volunteer time and materials for guerilla road repair.


> If you’ve got 55,000 monthly active users, and $171 per month doesn’t feel like a rounding error to you, then what you made isn’t actually valuable.

You heard it boys, unplug those DNS servers, delete 4chan, teardown XKCD, they are worthless!


I'm confused about their thought process on their $95 a month DO servers.

They mention using a blue-green deployment strategy and a managed database service. That implies their web servers are stateless, or at least stateless enough to switch between the servers seamlessly.

But then they go on to say they don't want to downgrade because it means provisioning new servers.

Even in the worst case scenario of not using configuration management tools, aren't we talking a few hours of work here to save yourself let's say $40 a month which is 25% of their monthly bill? That would be the cost savings of downgrading to a lesser server x 2.

With configuration management tools, spinning up a new server would typically be running 1 or 2 commands and waiting 5 to 10 minutes. That's about how long it takes me to spin up a new server on DO to run a Flask application using Ansible and most of that time is because Ansible isn't exactly well known for its speed to execute tasks.


Or you can spend that few hours writing a blog post for marketing purposes or figure out how to actually generate revenue.


The blog post earned them a lot of ridicule for poor technical management, and didn't promote their business.


Nice writeup! Always appreciate transparency with these sorts of things, as someone who writes a lot of tiny personal projects and sometimes wonders how much I'd have to invest if they ever actually gained any traction. Congrats on having so many users, and good luck on monetization.

For the server instances, I can't imagine 4 CPUs/8 gigs RAM is really needed 24/7, but can imagine it being needed in bursts, and I assume GCP has various "elastic" auto-scaling products for that. Would love to know if your metrics indicate that this is the smallest box you could comfortably run on, or if it's just a best-guess. Could potentially spin down blue-green environments after some set amount of time (once you know a deploy "succeeded"), as well.

For Metabase, I can't find any "system requirements," but I'm not _super_ shocked by the price tag. I've always liked the idea of hosting my own tools instead of relying on some SaaS's free/trial tiers (particularly for things like analytics, logging, or metrics), but a lot of the open source options out there assume you have a pretty hefty box to throw it at. At the same time, I haven't found any better alternative other than just doing ad-hoc analyzing in SQL (using a GUI like Postico).


We run Metabase on a 2GB VM on DigitalOcean, costing $15/month, and using DO's cheapest Postgres plan (still pretty expensive, IMO). It's a JVM app, so it uses quite a bit of memory - 60% at the moment. So the total cost of Metabase is $30 per month. We could definitely just run Postgres on the same VM to get the cost down.

As for web servers, you're totally right. At feeder.co we serve 25 requests/second (on Rails) per VM (we use 4 web servers and one load balancer), and we're still database-constrained. They are the $20 per month 4GB 2vCPU "basic" plan.

Hardware is insanely fast nowadays, and it feels like we sometimes forget to realise it.


For a relatively small-scale project (5K users), I am running Metabase using my production Postgres* on a free Heroku dyno (Metabase even has a 1-click deploy for this). It takes about 30 seconds to wake up, but the price is unbeatable. I don't get any of the scheduled features, of course. Would recommend for keeping costs low.

*using a heroku standard-0 db, so $50/mo with relatively good performance


I also find $15/mo to be quite expensive as a minimal configuration


Thanks for the compliments!

Yes, the server instances are definitely too large. I was previously running on 2 instances that were half the size and ran into some problems which I thought I could fix by increasing specs. Turns out the problems were not related to specs and now I'm stuck with the instance size (on DigitalOcean you can only up-size not down-size)

I use Postico too, but don't have enough SQL chops to get the answers I need.


You’re already using blue/green for deployments, so why not:

* replace the currently inactive server with a smaller instance

* swap to the new instance and test the load

* replace the second server

For such a small app I’d drop the blue/green (most deploys will likely only take a few seconds?) and host Postgres on the same server.

Also, I’d move metabase onto a DO instance.

Any reason you’re using DNSimple over a cheaper provider (or a free one, in the case of DO)?


These are all good suggestions. I would also add: look for ways to spread tasks across several smaller VMs - the general pricing is +$5 per increment ($5 per +1 vCPU, $5 per +2GB RAM), so a $5/$10 web server could handle just static asset delivery, with the apps on mid-tier app servers at $15/$20. That spreads out the load, gives better fault tolerance, and still keeps blue/green.


You can downsize - just start at a small size and don't increase storage (choose RAM- and CPU-only upgrades); then you can resize smaller later if you want.

Getting to a place where you don't care as you can rotate in new instances at whatever spec you want is even better though.


Hey, good job on your project!

The only thing I wanted to say is that the time spent learning a bit of SQL pays off massively. Perhaps something for your todo list :)


Thanks. It's on my todo list ;-)


As a data point, we're using Redash (https://github.com/getredash/redash/), an alternative to Metabase, in a VM on Digital Ocean. It's a US$15 VM (2 cpu, 2GB ram) and seems to be fine. That's using a PostgreSQL instance running on the same VM, and nothing seems to be unhappy.

The graphs it generates are used both publicly (auto-updating):

https://sqlitebrowser.org/stats/

... and we have a bunch more private graphs and dashboards for metrics.

Everything is close to instant in responsiveness, apart from the "public" stats above. Those take some time to display purely on the browser side, as they feed way too much data to a browser for easy rendering. (will be fixed at some future point). ;)

Probably the only down side to Redash is a need to understand SQL. That can start out pretty simply though. :)


I am a big fan of redash.

It is really great in situations where non-SQL people ask if you can run a query or report for them.

It is amazing how quickly you can build up instrumentation with it.


Unlike AWS, GCP has fewer options, and I would argue nothing would fit out of the box. App Engine is a PaaS with various limitations and constraints that can take significant amounts of dev time and customer-support time to get through. Spot instances require ops work, as would something like GKE combined with horizontal autoscaling via Kubernetes. Their best bet would be to identify hot paths in the codebase and move them to Cloud Functions at the cost of latency, then scale down the servers.

Often in cloud computing it’s dev time, ops time, cost savings, choose one.


> App engine is a PaaS that has various limitations and constraints that can take significant amounts of dev time and customer support time to get through

What limitations/constraints of App Engine have you run into?


Seems like DigitalOcean bill should be $10-20/month max.

Metabase is cool but you could probably use Mode Analytics free tier.



It's disappointing (and expensive) that you have no caching strategy. I would bet that 99% of your scoreboards are updated very infrequently, these pages could be re-generated into static html pages via cron job every 30 minutes / 2 hours / once per day, whatever frequency you choose. It's a free service, make people pay to have real-time updates.

My gosh, you are polling each leaderboard for changes every 15 seconds. Again, I would bet perhaps 1 in 1000 mature boards (i.e. online for more than a day) change in any 15 second period. You could severely reduce your server load by utilizing a CDN and statically-generating any board which hasn't changed in say, the past 6 hours.
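The regenerate-only-on-change job the parent describes can be tiny. A hedged sketch, assuming a `render` callable and using the output file's mtime as the staleness check (all names here are made up, not from the site's code):

```python
import os
import time

def regenerate_if_stale(board_id, updated_at, out_dir, render, max_age=6 * 3600):
    """Rewrite a board's static HTML only if its data changed since the last
    render, or the file is older than max_age (so nothing stays stale forever).
    `updated_at` is the board's last-modified unix timestamp from the DB."""
    path = os.path.join(out_dir, f"board-{board_id}.html")
    if os.path.exists(path):
        rendered_at = os.path.getmtime(path)
        if rendered_at >= updated_at and time.time() - rendered_at < max_age:
            return False  # still fresh: nginx/CDN keeps serving the file
    with open(path, "w") as f:
        f.write(render(board_id))
    return True
```

Run that from cron over all mature boards and the 15-second polling only ever hits static files for the 999-in-1000 boards that haven't changed.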


Back in 2009-2013 or so, I ran a quite popular add-on for World of Warcraft which aggregated data from about 200k active users regularly uploading it, maybe 5GB per day, into a common database of 5-10MB that was then downloaded by about 1 million users at varying intervals back into the game. (It was a sort of mapping database, telling you where in the world each NPC was, with that info gathered by users of the tool.)

It came to eat several terabytes of monthly traffic, which was huge back in those days, but I operated that infrastructure by myself for several years: a single dedicated server for about 50€ handling the data aggregation work (heavy on I/O and computation), and two virtual server instances for about 15€ each which basically only served the downloads. The included traffic package on those VPSes was large - actually, I chose that VPS offering precisely because I could economically cover all the download traffic that way. The compute power on them sucked, but they were basically a simple CDN setup, so that didn't matter much. The total was about 70€; of course domains etc. need to be added, but I never paid more than 100€ for hosting of that app.

It is really disconcerting to see that similar "do-it-yourself" hosting nowadays appears to cost twice as much (judging by the article), while the "raw materials" - core backbone traffic, compute power, memory and storage - have all gotten waaaaay cheaper in the last decade. Too many new rent-seeking middlemen, I guess...


> Note: the servers are oversized for the load we’re currently seeing. The reason for that is that we tried to solve a production issue by increasing the server specs. It didn’t solve the problem, and now we can’t down-size the servers without re-provisioning them.

Too late for them, but if you find yourself similarly needing to increase capacity to solve a production issue on DO: you can upgrade CPU/RAM without increasing disk size. This allows you to downgrade later if you end up over-provisioned.


> The webapp runs on two identical DigitalOcean servers... The database is a hosted Postgres instance also on DigitalOcean.

They're only spending $69/mo total at DigitalOcean, and it sounds like their scaling isn't limited by computational power but rather by operational concerns. So, I'm not entirely sure what the backend language has to do with the costs here.

If you're trying to draw a cost comparison between languages, then the take-home from this writeup is that 55k monthly users is small enough that the backend language doesn't make a difference in costs.

Side note, this kind of transparency is always really nice to see!


I think this article is frankly PR to get their users to generate revenue. It has nothing to do with the cost of hosting a python webapp.

I mean, why are you even doing analytics if you’re not generating revenue?


This is not clear at all - 55k monthly users means nothing. What's the average number of requests per second? The 99th percentile? What are the same stats for CPU and memory usage? 3.4k users per day is actually not that large a number; even assuming each user bombards your server with thousands of requests in 20 minutes, it's well within what optimised Flask with WSGI can handle on a very mediocre compute instance.

Further, are you absolutely sure that you have optimised your configuration and code for performance without changing the stack? (Serving static assets directly via nginx, caching responses in Flask, caching other resources using memcached, optimizing gunicorn worker counts for the instance type, profiling which endpoints take the most time and trying to optimize them, checking whether the bottleneck is the webserver or the DB, considering PyPy, and probably 50 other ideas.)
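On the gunicorn point specifically: worker count is one line of config, and the starting heuristic suggested in the gunicorn docs is (2 x cores) + 1. A minimal `gunicorn.conf.py` sketch - treat the numbers as starting points to tune against profiling, not as a definitive setup:

```python
# gunicorn.conf.py -- minimal sketch; start from the documented heuristic
# and adjust based on load testing, not guesswork.
import multiprocessing

bind = "127.0.0.1:8000"           # nginx proxies here and serves static files itself
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "sync"             # consider "gthread"/"gevent" for I/O-bound endpoints
timeout = 30                      # kill workers stuck longer than this (seconds)
```

Then `gunicorn -c gunicorn.conf.py app:app` instead of guessing flags on the command line.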

Are you also absolutely sure that blue-green deployments are what you want, given your cost constraint? Further, if you have blue-green, it should be straightforward to downgrade your instances to smaller ones fairly easily - why not do that first before discussing costs?

We run production webservers on t2.small ec2 instances in elastic beanstalk and (with some assumptions) handle comparable loads during work hours. We have two redundant instances and can easily scale up or down even on schedule with zero extra code. There's not even a need to minimize costs here because our application costs 100x more on data lake costs than the webservers, but it really helps keep the engineers grounded and ensuring they don't write outrageously inefficient code that's covered up by excessive server sizes.


The author says he wants to switch from Disqus to Commento, but I wouldn't just do that. I didn't have a great experience with Commento. I like the idea and mostly the execution of it (including the OAuth), but I found some major bugs (Firefox just didn't work at the time) and got no response when reporting them. I asked a few more questions by mail - also no response. After that, I'm gone pretty quick.

Not that I'm saying you should use Disqus since they have some privacy issues


Ditto.

I was excited for Commento and signed up for a paid plan. I even contributed code.[0] The author sent me an onboarding email which included the line, "please reply to this email (I'm a real person)." I replied with some questions about the service and billing, but he never responded. Four months later, nobody has responded to my merge request, either.

I canceled my service at the end of the month.

[0] https://gitlab.com/commento/docs/-/merge_requests/41


Nice, we need more of these "real world expenses" posts on HN. I wonder how much it would cost to run the same app powered with Java. More? or Less?


We use Google Cloud Run for this. They bill per request, and you can run whatever Docker container you want. I currently have a small setup with two separate Kotlin-based server applications running there (one based on Spring Boot and one on Ktor). We use Firestore as a cheap database (billed per read & write). Both have a free tier, so it's costing us next to nothing currently because neither container is getting a lot of traffic. Basically, the bill for last month was only around $2 for one of these applications (the other one I only deployed a few days ago).

Basically, at some point it gets expensive of course but when you are simply testing out stuff and don't expect a lot of users to show up, it's actually not bad.


How does the Kotlin server perform on Cloud Run? Is the startup time slow? Does Spring take a lot of memory, etc.?


I bumped the memory limit from 256MB to 1GB. 2GB is the maximum on cloudrun. You pay a combination of CPU/Memory per request obviously. Beyond that, it scales by firing up more containers as your request load increases to a default maximum of 1000 (we set it at 2).

Obviously don't use this if you have a lot of cpu/memory requirements but otherwise this should be fine for a well crafted stateless server.

Startup is fine with both ktor and spring-boot but obviously takes a bit of time. I noticed there's startup overhead if your container gets shut down because there is no traffic. In that case, it takes around 30 seconds for the first request to go through. That includes everything from booting the container and then starting the process. I imagine simply pinging the server regularly should keep it running. Alternatively, you can configure a minimum amount of servers.


As with every language it depends on the framework.

Based on the Techempower benchmarks: https://www.techempower.com/benchmarks/

You can find Java frameworks that are significantly faster than any of the Python ones. Or you can find ones that are significantly slower.


A $30 dedicated server from Hetzner can do this reliably enough for a non-revenue-generating blog, including blue-green deploys, Postgres, certs, cache, edge caches (Cloudflare, I guess) and even Metabase (which admittedly can be a memory hog if you don't set limits in the JVM).


I put some VMs in Hetzner for a project I was working on, and found out you get what you pay for.

They mistakenly deleted all of my virtual machines and all of my backups.

If you host anything with them, or anyone I guess for that matter, make sure you have your backups hosted somewhere else.


Did they ever give you an explanation? I've used Hetzner for 6-7-ish years, and _never_ had any problems with them. I have mostly used their dedicated servers service, though.


Their security team was concerned that I was impersonating myself.

They sent me a boilerplate e-mail that there was a problem with my account and to contact them to resolve it. I called them within 10 minutes, and they told me 'your account doesn't exist'

They restored my 'account', which meant I had a user with the same login and password, but all of the assets on the account were tossed to the wind.

I can only guess that someone jumped the gun, (assuming they had an internal process at all).


I don't have experience with their VMs and have only used their dedicated servers to host personal projects and non-critical infrastructure.

However, you should be treating your VMs as disposable and not as pets.


Imagine running a database in the cloud and someone deletes all of the database nodes in the cluster, all of your data, and your backups of the data.

VMs can be disposable, but if you have data you want to keep, you should have a second copy hosted by a separate organization .. well that was my lesson anyway.


Thanks for that warning. Was planning to migrate there. Do you still host there and if so, where do you backup to?


They were rather standoffish in their communication despite me attempting to be polite as possible to try and get my stuff back, so I did not want to continue with them.

I'm mostly just using digital ocean these days. Their API is fantastic.

These days I do a better job of keeping code in github even if I'm the only user, and backups can go anywhere as long as you're not relying on the same provider.


I'm not familiar with Python as a web stack, but to me it looks like the DO instances are a bit over-provisioned for 55k monthly users.

But what caught my eye was DNS hosting at $5/month. Maybe it's a typo; shouldn't it be more like $10 per year to register a domain and host its DNS?

Anyway, thanks for sharing, very interesting.


I charge £1/zone/month for DNS-hosting, and while most of my clients only have a single zone there are people with 50+

DNS hosting can be cheap (free if you use cloudflare, etc), or very expensive. My prices give me a 50% margin over the raw AWS Route53 costs, but the appeal is that I store records under revision control and let you make changes using git:

https://dns-api.com/
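The git-managed records idea can be sketched in a few lines of Python. The record-file format below is hypothetical (not dns-api.com's actual format), and the output mimics the shape of a Route53 change batch; a git hook could then feed it to boto3's `change_resource_record_sets`:

```python
# Sketch of "DNS records in git": keep records in a plain-text file
# under version control, parse it, and build a Route53-style change
# batch. File format here is hypothetical: 'name TTL type value'.

import json

def parse_records(text):
    """Parse lines of 'name TTL type value' into record dicts."""
    records = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        name, ttl, rtype, value = line.split(None, 3)
        records.append({"name": name, "ttl": int(ttl), "type": rtype, "value": value})
    return records

def to_route53_change_batch(records):
    """Build an UPSERT change batch in the shape Route53's API expects."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": r["name"],
                    "Type": r["type"],
                    "TTL": r["ttl"],
                    "ResourceRecords": [{"Value": r["value"]}],
                },
            }
            for r in records
        ]
    }

zone = """
# example.com zone -- edited in git, applied on push
www.example.com.  300  A   192.0.2.10
example.com.      300  MX  10 mail.example.com.
"""
batch = to_route53_change_batch(parse_records(zone))
print(json.dumps(batch, indent=2))
```

The appeal is exactly what the parent describes: every change is a reviewable, revertible commit.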


DNS over Git, it's a very interesting concept. Thanks for sharing.


Cool concept! I will check it out.


Interesting... is the price flat even if there are billions of lookups?


Yes.

AWS charges you a flat-fee per zone, and then additional-fees based on volume of queries. In my case I don't host any domains that go over the threshold where it is billed for - but if I did those costs would be offset by the profit of the other low-volume zones.
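That pricing structure is easy to model. The figures below are assumptions based on AWS's published Route53 list prices (a flat zone fee plus a per-million-queries fee), so verify them against the current pricing page:

```python
# Rough Route53 cost model matching the parent's description:
# a flat per-zone fee plus a per-query fee. Prices are assumed.

ZONE_FEE = 0.50              # $ per hosted zone per month (first 25 zones)
PER_MILLION_QUERIES = 0.40   # $ per million standard queries (first 1B)

def route53_monthly_cost(zones, queries):
    return zones * ZONE_FEE + queries / 1_000_000 * PER_MILLION_QUERIES

# 50 low-traffic zones cost $25/month in flat fees alone, which is why
# a reseller margin on top of the zone fee can still be profitable:
print(route53_monthly_cost(50, 500_000))  # 25.2
```

At low query volumes the flat zone fee dominates, which is consistent with the parent's point that a few high-volume zones can be absorbed by the margin on the rest.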


First of all kudos for putting something out there.

That said, are you making the right tradeoffs? Most of your spend is going to 'luxury': dual over-capacity deployments, managed hosted Postgres, hosted Metabase.

It does sound like you could very significantly reduce your spend almost instantly with just a few minor changes, and dramatically if you would do a rethink of your stack.


The cost of the servers and DB doesn't match DO's current pricing.

4 vCPUs, 8GB RAM and an 80GB disk cost $40/month now, so two servers alone cost $80, which is higher than the $69 listed.

Either DO raised pricing or they got some special discount.


Also, the pricing of DigitalOcean seems quite steep for larger machines. E.g., Hetzner offers AMD Ryzen 3600 dedicated servers with 64GB RAM and 2x 512GB SSD for €42.16 per month including VAT.

Or if you really insist on a VPS, you can get the same configuration with double the disk space for €15.38 per month at Hetzner.

I have a Hetzner VPS and it has been rock-solid.

(I know that they are in Europe, but I assume that there are similarly-priced competitive US counterparts.)


Well spotted. I just actually looked at my bills because of this comment. Turns out that I resized my droplets in mid-august so this month's bill will reflect the full/true cost.

I'm going to edit the article.


Five bucks a month for DNS hosting is also quite expensive, since most domain hosts offer it for free. Definitely look into autoscaling with AWS for a simple 'poor man's cluster'.


That caught my eye too. Not only that, the service being used here somehow charges more for wildcard Let's Encrypt certs.

I'm very confused what the value add is for this service.


I note the author said: "Note: the servers are oversized for the load we’re currently seeing. The reason for that is that we tried to solve a production issue by increasing the server specs. It didn’t solve the problem, and now we can’t down-size the servers without re-provisioning them"

One small advantage I've noticed with Linode over DigitalOcean is that you _can_ perform these down-sizing actions (without some messy rsync scripts or snapshot-and-recreate approach that DigitalOcean suggest). You first need to resize the disk down to the lower Linode's limits, and then you can down-size to a smaller instance.


I wonder how much it would cost for the same app if developed in Java or .NET Core.

They could save a good chunk of the 69 USD they spend on Digital Ocean servers.

Edit: To clarify, what I meant was that to serve 55k users, using more efficient languages like C# or Java would perhaps allow them to run on lower-spec machines. And as those users scale up, to say a million users a month, savings would perhaps also increase with more performant languages.


The language has nothing to do with it; this site could be run on a $5/mo droplet (or a pair of them for their deployment strategy).


Wouldn't the same app, written in Python & C#, have different throughput on the same hardware?


Yes, but a python app can be efficient enough to serve that many users on the smallest size droplet. Any additional speed gains would only be useful for scaling and MAYBE latency.


So you agree that throughput would be different. Sure, a Python app can be efficient to serve 55k users.

But scale it to, say, a million users, and a Python app would consume way more resources than an app in C#.


I don't think anyone would argue that Python is the fastest language. The more pertinent/practical point for most, though, is not to needlessly overbuild, whether in dev time or servers.

Realistically, the author could probably serve 10x-100x more users (your million users) and still stay on the cheapest droplet with python. As you can't go smaller than the smallest, changing language can't be justified on a cost basis.

Finally, the performance differences between a well written python app and a poorly written c# app are probably not that large. The developer probably started with python, because that's what they know, and wouldn't be able to match the code quality in c# without significant effort.
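A quick sanity check of the "10x-100x more users" claim, with assumed numbers: a small Python app sustaining ~50 req/s on the cheapest droplet, and ~100 requests per active user per month. Both are guesses, and real traffic is bursty, so divide the result by a peak-to-average factor:

```python
# Back-of-envelope capacity check for a small Python app on the
# cheapest droplet. Both constants below are assumptions, not data.

REQS_PER_SECOND = 50                 # assumed sustained throughput
REQS_PER_USER_PER_MONTH = 100        # assumed per-user activity
SECONDS_PER_MONTH = 30 * 24 * 3600

capacity = REQS_PER_SECOND * SECONDS_PER_MONTH          # requests/month
supportable_users = capacity // REQS_PER_USER_PER_MONTH

print(f"{capacity:,} requests/month -> ~{supportable_users:,} users")
```

Even after dividing by a generous peak factor of 10, that leaves room for far more than 55k monthly users, which is why a language switch is hard to justify on hosting cost alone here.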


Use whichever language you're most comfortable with. You can always add more hardware to the problem; and if you get really big then hire people to start re-writing parts of it.


What are they going to host it on when they've finished the full rewrite to save $69?


What I meant was that to serve 55k users, using more efficient languages like C# or Java would perhaps allow them to run on lower-spec machines.


I don't disagree that another language/runtime could reduce the cost. But why do you value developer time at 0 USD? Unless this is a hobby, the hourly rate of a developer is higher than that monthly Digital Ocean bill.


I never said anything about developer value.


Metabase is written in Java and normally takes 1-2GB of RAM and uses a lot of CPU ;)


Specifically, Metabase is written in Clojure


I hate such titles with a passion. I wonder if people will ever realize that "X monthly users" means absolutely nothing. That's by far the most useless metric people have come up with. This is the exact equivalent of "costs of living healthy":

1. How do you define "healthy"? 2. Genetics. 3. Overall health. 4. Eating habits. 5. Your geographical location. 6. Your daily routine... X^60000. {Factor number X^60000}

There are an infinite number of buttons and dials that determine how much "X monthly users" will set you back. Back in 2011 I owned a blog that had something along the lines of 25k daily visitors. My annual bill was less than 100 bucks, domain included. And keep in mind that this was before AWS, GCP and Azure matured, and costs were much higher than they are now. But I had spent months investigating the cheapest and most efficient ways to cut down costs. And it's the same with cloud providers: they can be brutally expensive or dirt cheap for the exact same thing, depending on whether you take the time to work out the best solution for your use case.


During the dotcom days, we hosted sites like this on a single desktop machine over a home DSL connection, and got people to pay for it.


Hey, I still do this today for non-revenue-generating services.

Pretty much every bigcorp I've ever worked for has some crucial service on a box under someone's desk...


These are fun to read, but the data is hard to use for anything unless you are building the exact same thing.

(plug) Regarding Disqus, I run a competitor which you might like: https://FastComments.com


I believe that most of the comments complaining about the over-specced setup mentioned in this article (which, if anything, I find reasonable) are extrapolating from their own experience and referring to a general problem of over-provisioning in the industry.

1. Whenever I see a new company's or project's infrastructure, I have come to expect a massively overcomplicated system that is "built to scale", but that ends up costing a lot and taking far too much time away from product development. A real issue with this is that when evaluating whether the project can feasibly scale, the future runtime cost will be estimated as a multiple of the current runtime cost, which will appear larger than if the systems had been built to handle just the current load.

2. To me, this seems like a failure of cloud computing, which has a very compelling promise of letting you start with small servers, then easily switch up to bigger servers when needed. This was, in fact, hard before cloud.

3. The biggest issue I have with over-specced and overly complex deployments is that (novice) developers appear to be led to believe that they have to do the job in a complicated way. Look at a typical tutorial from the big, trend-setting players on how something should be deployed. The first hit on my favourite search engine for a "Hashicorp Vault deployment" [1] recommends using nine hosts. I know from experience that it runs fine on ONE host of the smallest kind I could find, for our non-trivial use-case. Also, in that use-case it doesn't matter that it is not HA, because it has turned out to be more stable than any of our other stuff, and can be restarted in less than a minute. (I wouldn't mind at all if the actual motivation for a large deployment was "it is fun this way", "we do this for our own training and experience", "we choose to do it this way because of specific requirements", or anything of the sort.)

4. It seems to me that, what is needed is enough experience and courage to say: let's do our deployment and setup in a simple way that will work, because we know that we are competent to solve scale issues when they appear and we are not building ourselves into a corner. Also, we can afford to take responsibility if scaling problems do occur because we did not follow industry "recommendations" (i.e. trends).

[1] https://learn.hashicorp.com/tutorials/vault/deployment-guide

Edit: - clarified HA needs in point 3


OP, it looks like you can save some money on Metabase, one of your larger expenses, if you're able to move to render.com (with whom I have no affiliation, but I am a customer). They have a Metabase add-on: https://render.com/docs/deploy-metabase


-- QUOTE -- Note: the servers are oversized for the load we’re currently seeing. The reason for that is that we tried to solve a production issue by increasing the server specs. It didn’t solve the problem, and now we can’t down-size the servers without re-provisioning them ️. -- END QUOTE --

Does anyone know the minimal provisions they could potentially use in this case?


Why does the author want to migrate to google cloud? Simply to be closer to their firebase installation?


I've grown fond of Caprover for this kind of small project. I would probably shove the metabase instance and both of the production instances into a single smaller instance (but I have been known to increase swap and put off upsizing resources in the past because cheapskate)


Can fully agree with this. Caprover has been a joy and makes it easy to run a small scale setup where you can deploy those service dependencies easily yourself without paying for a cloud provider for the hosted alternative.


I get 55k unique users to my personal site, per DAY. I’ve got a number of very popular tools I host for Minecraft and Dev work.

It’s written in PHP/Go/MySQL

I am on a $10 a month Digital Ocean Droplet hosting like 7 other sites as well, and it’s probably overkill.


Can we not use unreadable fonts on websites?

I read the top item as 6g, i.e. $6,000, and the next (AWS) as $60, and they were like "we can likely cut here"... yo, you're spending $6k on the other thing, cut there first... reading it again, yeah, true.


Looks very readable to me: https://imgur.com/aWURtju

Are you blocking fonts.gstatic.com? It's loading the font (Raleway) from there.


This is one of the most difficult things to get right when building your startup. At the beginning you have no way of getting accurate costs, and clearly there's no one-size-fits-all approach for this.

It's just trial and error.


"How much does running a webapp in production actually cost? Maybe more than you think. [...] In total that’s around $145 USD per month."

I thought that was going a really different direction than it ended up going.


Care to elaborate?


The FAQ [0] still says it costs 15 EUR per month, you should update it.

[0] https://keepthescore.co/about/


Running my websites costs whatever electricity my server draws, but it also serves as my file storage so it'd be on anyway, so I'm going with "the price of the domains".


Almost half of that is an EC2 instance to run a reporting tool. Why not a $5 DO droplet instead? Also, $5 a month for DNS hosting when DO will do that for free?


The webservers seem very overpowered for 55k monthly... seems like an unnecessary amount of money and could run on much cheaper VMs.


The majority of HN seems very heated about $171/month. My company pays 100x that just for our RDS instance (granted, we store a lot of data). If you told me 5 of our users ended up costing us $10k/month in operational costs, I wouldn't even blink.

55k users is very valuable. If you make a single cent per user per month, you’re in the black operationally (not counting dev time). If you make that into a dollar, you get to quit your job and be upper-middle class indefinitely.
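Spelling out that arithmetic, using the article's ~$145/month figure:

```python
# Break-even revenue per user for the article's setup.

monthly_cost = 145.0     # from the article; the comment above rounds up to $171
monthly_users = 55_000

breakeven_per_user = monthly_cost / monthly_users
print(f"break-even: ${breakeven_per_user:.4f} per user per month")

# At one cent per user per month you are comfortably in the black:
revenue_at_one_cent = monthly_users * 0.01
print(revenue_at_one_cent - monthly_cost)  # 405.0
```

So the break-even point is a fraction of a cent per user per month, which is the parent's point: the infrastructure bill is tiny relative to the value of the audience.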


The cost this month will be higher: the YC hug of death.


One thing I do not understand—what is the point of paying for DNS instead of using Cloudflare or similar?


I have seen a web app for 1,000 monthly users spend $1000/month for it.

Those were all paid users and the company was profitable with over 70% net margins after all costs including labor and research were factored in.

I’m all for $5/month digital ocean hosting when it makes sense but you also need to be realistic if there are revenues. Hosting should not be a major cost for the business.


And most of the time it isn't, right? I have seen project managers keep a whole development team of 10+ developers busy for entire sprints to "cut down the infra costs", without realizing that the dev team costs 500+ kEUR/quarter, while the hosting costs were something like 7-8 kEUR/month.

Also, related article: https://m.signalvnoise.com/only-15-of-the-basecamp-operation...
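The point above as arithmetic, using the figures given in the comment:

```python
# A quarter of "cost cutting" work by a 10-developer team, versus the
# infra bill it is trying to shrink. Figures are from the comment above.

dev_cost_per_quarter_eur = 500_000
infra_cost_per_month_eur = 7_500   # midpoint of "7-8 kEUR/month"

# Even if the team eliminated the ENTIRE infra bill, paying back one
# quarter of dev time would take:
months_to_break_even = dev_cost_per_quarter_eur / infra_cost_per_month_eur
print(f"{months_to_break_even:.1f} months")  # 66.7 months, about 5.5 years
```

And realistically the team only shaves a fraction of the bill, which stretches the payback period even further.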


That seems like quite an expensive mix. Almost a third of the bill goes towards insights?


Looking forward to the monetization post.

Has the developer thought about using donations instead of ads?


Cost of running it myself on my $30/mo colocation server: $30

Oh yeah, plus the $10 a year domain fee.


Contabo VPS: 4 cores, 8GB RAM for €5/mo. That should be able to handle the whole thing with plenty of room to scale. Maybe I am old fashioned, but I don't see what all those much more expensive cloud providers bring to the table.


Damn, this seems so cheap. I didn't know about Contabo before. What's your experience with them?


They're great. I've had less downtime with them than DO and I have several VPSs there. I'd highly recommend them.


Thanks, gonna check them out when the need arises :)


I'd also really be interested in what kind of domain costs USD 10 PER MONTH. With my registrar, even .co and .sh cost at most USD 60 per year...


You should check out https://tld-list.com -- plenty of cheap domains / renewals. I really like Porkbun as a registrar, but there are others that are just as good.


$5 a month for DNS?! I MUST be missing something. Why the hell would you need to pay monthly for DNS?

Edit: Guys, if you downvote, at least let me know what I'm clearly missing... Does DNSimple do something Google Domains can't?


Does it generate revenue?



