
If you can't hash join, then you can't join over large datasets anyway, so sharding costs you nothing in that respect.
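(For anyone unfamiliar: a hash join builds an in-memory hash table on the smaller relation and streams the larger one past it. A minimal sketch in Python - the tables and keys are made up for illustration:)

    def hash_join(small, large, small_key, large_key):
        # Build phase: index the smaller relation by its join key.
        index = {}
        for row in small:
            index.setdefault(row[small_key], []).append(row)
        # Probe phase: stream the larger relation and emit matches.
        for row in large:
            for match in index.get(row[large_key], []):
                yield {**match, **row}

    users = [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]
    orders = [{"order_id": 10, "user_id": 1}, {"order_id": 11, "user_id": 2}]
    print(list(hash_join(users, orders, "user_id", "user_id")))

The build side has to fit in memory (or be spilled in partitions), which is why a database that can't hash join struggles with big joins whether it's sharded or not.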

Right now, using off-the-shelf kit and doing nothing particularly clever, a major commercial RDBMS could do 10,000 commits/sec and handle 100TB of data on a single instance. Sure, it would cost you a pretty penny, but unless running a database is the one competitive advantage your company has, you're better off keeping your people focused on the thing that does make you money.




I'm not too sure if you are arguing for or against the commercial RDBMS.

To me, that sounds like a pretty good argument against it - the great thing about MySQL/Postgres plus sharding is that it scales down as well as up. You can start out with a single server, then gradually add extra servers as you need them. With the "single big DB server" model it doesn't work like that - you have to make a pretty decent investment in the software licence early on, build your software around it, and the pricing isn't linear either.
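For what it's worth, the routing layer for that kind of setup can start trivially small. A sketch of key-hash sharding in Python (the hostnames are made up):

    import hashlib

    def shard_for(key, shards):
        # Pick a shard for a key via a stable hash (simple modulo scheme).
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return shards[h % len(shards)]

    shards = ["db1.example.com"]  # day one: a single cheap box
    print(shard_for("user:42", shards))
    shards += ["db2.example.com", "db3.example.com"]  # scale out later
    print(shard_for("user:42", shards))

One caveat with plain modulo: growing the shard list remaps most keys, so in practice you'd want consistent hashing or a lookup table before adding servers.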

Plus, the backup/redundancy story sucks too - with sharded MySQL you just need a couple of cheap spare servers, but with Oracle et al. you have to pay for a second licence on top of the second server.

OTOH, in a corporate environment it's much easier to predict your usage, and the pricing is easier to justify. Also, your priority there is safety and justifiability first, rather than price or even price/performance.


Which commercial database? Any idea which web scale applications are using it, and why the other ones aren't?



Yes, and it's partitioned out the wazoo.



