Even among startups, web scale data requirements are the exception, not the rule. Facebook and Google are ginormous. There are many, many very impressive applications whose database wouldn't tax a single commodity server. (Similarly, there are applications that make terrible businesses but which consume computing resources like losing a byte of information would doom humanity.)
I mean, go through a list of YC companies or other startups you respect, winnow it down to the ones that exited or otherwise achieved some level of success, and play guess-the-size. How many terabytes of storage do you think e.g. Airbnb needs?
There are a ton of high-traffic websites out there that don't need an architecture any more complex than a standalone DB server + PHP + Varnish (or the equivalent).
Moreover, if devs spent as much time tuning the performance of their apps as they did fantasizing about "web scale" architectural pivots, they would typically be farther ahead. StackOverflow.com is a perfect example of this. They run on a tiny handful of Windows machines, support gobs and gobs of traffic, and have absolutely fantastic performance. And as much of that is due to paying attention to performance, and making sure to find and remove the bottlenecks where they exist, as it is to using cutting-edge architectures like database sharding, map+reduce, eventual consistency models, etc.
One thing I've found is that scaling rarely means solving difficult problems. Rather, it means putting more time into finding optimal solutions to problems that are trivial at smaller scale. For example, should your startup use Apache, nginx, or HAProxy as a load balancer? If you're just launching, the answer is "Who cares, just ship the fucking thing!" If you reach the point where you start measuring page views in the billions (and yes, there are startups at this point), it matters a great deal. Or should you use Postgres, MySQL, or some shiny NoSQL thing? Again, it probably doesn't matter for small websites. But for larger services, it matters.
Also, don't underestimate how large log files can grow in a data-driven business (like Airbnb seems to be). I could easily believe that they have many terabytes of data just from logging actions their customers have taken.
> Also, don't underestimate how large log files can grow in a data-driven business (like Airbnb seems to be). I could easily believe that they have many terabytes of data just from logging actions their customers have taken.
Logs don't have remotely the same access requirements as the databases used to serve a product.
Indeed, but it's worth pointing out that in this case "different" doesn't necessarily imply "easier". Instead of having to access the data across many concurrent connections, you have to store the data efficiently so that it doesn't take up too much space, and you have to be able to run jobs on it that don't take 3 weeks to complete. And let's not get into how you collect and merge the logs. There are open source tools to do these things, but you're still looking at a decent amount of infrastructure to make it work.
Perhaps for the applications of yesterday (like Basecamp) this is the case, but the real innovation taking place is around collecting massive amounts of data and processing it in interesting ways. These systems are used every day to make quantified business decisions rather than guessing based on someone's hunch. 37signals builds questionably good UIs on top of a database, something people have been doing for decades now. The future is in augmenting intelligence by gathering massive amounts of data and reducing it for human consumption.
I don't disagree with anything you've stated. Explosive growth and requiring massive amounts of data storage are surely the exception, not the rule.
That said, the blog post talks about enormous growth, yet that growth still fits inside Moore's Law. I guess my gut is just saying it's not really that enormous in terms of startup scaling if it's still within those limits. Not to take anything away from 37signals' success, but it feels like nothing of value was really added by this post. As supplementary evidence, I present the post near the top of HN right now: a picture of 864GB of RAM.
I think most "other startups" would LOVE to be in the ballpark with Basecamp in terms of usage. No, Basecamp is not big like Facebook, but you've heard of them, right? Most "ordinary" business people who do project planning work have probably heard of Basecamp too. Most people never hear about most startups, fewer try their services, and fewer still become regular users.
The point of the post is that in many/most cases, it's still easier and cheaper to throw hardware at a performance problem than to devote scarce engineering effort to optimization. And it's only getting more and more so. If you are a startup and you can throw $10,000 of hardware at a problem you can then keep your $100K engineers working on things that hardware alone can't solve.
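To put rough numbers on that trade-off, here is a back-of-envelope sketch. It assumes "$100K engineer" means $100,000 per year in salary alone (no overhead) and 52 paid weeks per year; the function name `engineer_weeks` is made up for illustration.

```python
# Back-of-envelope: how much engineer time costs the same as a hardware purchase?
# Assumptions (not from the post): salary only, no overhead, 52 paid weeks a year.

def engineer_weeks(hardware_cost: float,
                   annual_salary: float = 100_000.0,
                   weeks_per_year: int = 52) -> float:
    """Return the number of engineer-weeks that cost the same as the hardware."""
    weekly_cost = annual_salary / weeks_per_year
    return hardware_cost / weekly_cost


if __name__ == "__main__":
    weeks = engineer_weeks(10_000)
    # Prints: $10,000 of hardware costs about 5.2 engineer-weeks
    print(f"$10,000 of hardware costs about {weeks:.1f} engineer-weeks")
```

Under those assumptions, the hardware pays for itself if it saves more than about five weeks of one engineer's time. Real fully loaded costs are higher than salary, which would tilt the comparison further toward hardware.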
> Most "ordinary" business people who do project planning work have probably heard of Basecamp too.
I think this might be the tech bubble showing. I've never heard of any non-tech, non-startup companies using Basecamp. I'm not saying that they don't, of course, but I'd be interested to hear some case studies in its use outside of tech-savvy crowds.
Last I checked, I believe something like 20% of our customers were from the tech/startup scene. The vast majority of our customers are regular businesses.
But yes, we're still tiny compared to behemoths like SharePoint. All the more reason to be excited about the next 20 years!
And there was a great press quote for you in the comments: "$50 is peanuts for what you get in Basecamp. Any business doing $1000 a month should find huge leverage from it."
Exactly. As horrible as it is, SharePoint rules this area and probably has an install base that dwarfs Basecamp's. When the average corporate person thinks of project and file management, they think SharePoint.
Most programmers probably work at either BigCo enterprises (banks, insurance companies, telcos, etc.) or at "startup" type tech businesses or freelance agencies.
There are other industries that contain a lot of small businesses and probably employ very few programmers: think restaurants, local shops, small law or accountancy practices, etc.
These guys probably aren't using SharePoint; most of them are probably using Excel spreadsheets combined with paper.
I'm not sure how many of these guys are using things like Basecamp, but a lot of them probably should be.
I thought the 37signals website had videos of "satisfied" users, and some of them don't seem to be "tech" or "startup". They're probably small businesses, but I don't think we can say all "small businesses" == "startups".
> Most "ordinary" business people who do project planning work have probably heard of Basecamp too.
I don't think so. Maybe we just have a different opinion on what "ordinary" means, but I come across people all the time who would think Excel is the normal thing to use for this (and pretty much any other task...). Using web applications for day-to-day workflow is still alien to a lot of "ordinary" business types.
It could very well be that their growth just happens to scale in line with Moore's law, so it works out well.
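For anyone who wants to check the "within Moore's law" claim against their own numbers, here is a minimal sketch. It assumes Moore's law is read as "capacity doubles every two years", which is a common paraphrase and not an exact figure. The function names and example growth figures are hypothetical and not taken from the blog post.

```python
import math

# Assumption: hardware capacity doubles every `doubling_period_years` years.
# Two years is a common paraphrase of Moore's law, not an exact figure.

def hardware_factor(elapsed_years: float, doubling_period_years: float = 2.0) -> float:
    """How many times more capacity the same money buys after elapsed_years."""
    return 2 ** (elapsed_years / doubling_period_years)


def keeps_pace(growth_factor: float, elapsed_years: float,
               doubling_period_years: float = 2.0) -> bool:
    """True if hardware improvement alone covers the given load growth."""
    return growth_factor <= hardware_factor(elapsed_years, doubling_period_years)


def years_to_catch_up(growth_factor: float, doubling_period_years: float = 2.0) -> float:
    """Years of hardware improvement needed to absorb the given growth."""
    return math.log2(growth_factor) * doubling_period_years


if __name__ == "__main__":
    # Hypothetical example: load grows 8x.
    print(keeps_pace(8, elapsed_years=6))  # 8x over six years
    print(keeps_pace(8, elapsed_years=4))  # 8x over four years
    print(years_to_catch_up(8))
```

With a two-year doubling period, 8x growth over six years is exactly covered by hardware, while 8x over four years is not. The conclusion is sensitive to the doubling period you assume.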