Hacker News new | past | comments | ask | show | jobs | submit login

Why don't you get rid of the fault-tolerance part in a filesystem system (remove recovery, remove concurrency control). You are not too far away from interacting directly with harddrives' firmwares.

SQL (and RDBMS) is good for three things:

- The relational model can model the nature of our data well.

- Transaction control, recovery, etc are notoriously hard to implement. Most people (I bet 99.99% of the programmers) would fail to design these systems. RDBMS takes care of these for you.

- Physical data independence: The declarative nature of SQL enables you to specify what data you want, rather than how you want them. The optimizer picks a logical and physical execution plan (that doesn't suck) for you. The same query from the 70s running on IBM mainframes can still be running today on say 1000 Amazon EC2 nodes, although the underlying computer and storage architecture is poles apart. (There are political reasons why this is not as true as it should be, but that's another story.)

Now of course, the features are double-edged swords. Some engineers dislike RDBMSes because they don't understand what is going on in the system. If you are Google, you have the engineering power and the business needs to design systems that precisely tailor to the nature of your data.




Well, there's 2 questions here (posting again to give whatever horribly lacking-in-perspective fanboys are out there another chance to downmod a discussion of frickin storage layers):

1) Are RDBMS useful at all, ever? Sure. If I don't have an extreme scaling need, I'm the first one to say throw MySQL on that box and call it a day, it's tested, it works, gives you a bunch of functionality for 5 minutes of installation, developers are used to it, easy.

2) Should you be putting application logic in the database? I argue no. I understand that this isn't unanimous, but you're sacrificing long term flexibility and scalability for some short-term convienence that isn't even that convenient. I mean, PL/SQL sucks. I'm not saying you shouldn't ever do a join in the database, or shouldn't ever normalize your data, but if you're writing a lot of PL/SQL code, it's probably gonna come back to bite you if you ever need to scale or change architectures.

RE: that query from the 70s -- that query from the 70s that will actually run across different implementations contains exactly the subset of SQL that I consider acceptable and worth using (minus limit and a couple other small fry, I guess). If it's more complicated than that, it should be app code IMO.


If you write your code in Java (or Ruby or Python or C++) it will come back and byte if you even need to scale or change architectures.

Actually, if you have to change architectures, you are probably bitten anyway.

If you need performance: 1. Do as much as you can in one SQL statement (unless you are using MySQL where the optimizer sucks) 2. If you can't do it in one SQL statement, use PL/SQL to do data processing as close as possible to the data - this will save on network load and PL/SQL is highly optimized to reduce overhead for embedded SQL statements.

With all the talk about "horizontal scalability" we sometimes forget the benefits of writing fast code in first place.


That network load canard is a little dated. Gigabit ethernet gets you 125MB/s, a couple bonded NICs will get you 250MB/s. That's almost definitely way more bandwidth than your database can push by doing a bunch of random reads (500 IOPS/sec * 4kb = not very much), unless you've got a really expensive SAN (in which case buy a $500 10GBE card for you DB machine), so reducing network load from the DB is sort of a nonfactor. Reducing round trips still makes sense, but if your DB is CPU and/or disk saturated, it doesn't actually save you anything.

Performance really comes down to what work are you doing, where does it take place, and how many disks and CPUs can you get involved to spread the load. For small amounts of load, you could say there's some gain from pushing more work in the database but it doesn't matter anyways. Large load, you're making the DB a bottleneck.


With logic in the apps, every time you change the business rules, you have to somehow stop everyone from running any app that has the old logic until you have ported the new logic to whatever language it's in. Logic in the database always works and can't be accidentally (or deliberately) bypassed.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: