
- availability means access to the data, not just uptime

- I have never seen schemas changed on the fly programmatically in an RDBMS (except in admin functions). I have seen column types overloaded, however.

- "super column" data stores have a distinct design advantage that you cannot get from relational DBs.

- everyone who uses relational does the same thing when they get big data: denormalize. From materialized views all the way to sharding and replication.

- banks use transactions, but not as a two phase commit. it is a single atomic record - not partial updates.

- we still have some species from the age of the dinosaurs, but the land is ruled by those that have evolved.

- I agree that banks should be consistent. Not sure what the author was saying here.

- CAP is relevant to ALL datastores. Consistency, Availability, Partition tolerance: pick any two.

- if your post does not get committed to Facebook, it may be important to YOU, but the application is designed more for availability




Of course you can change schemas on the fly. Just because ALTER TABLE goes ka-thunk on MySQL doesn't mean it can't be done in more robust databases.
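To make "on the fly" concrete, here is a minimal sketch of a schema change issued from application code while the table already holds rows. SQLite is used only because it ships with Python; it is a stand-in, not a claim about how MySQL or any other database behaves under load, and the table and column names are made up for illustration.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# Schema change at runtime, with data already in the table.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
conn.execute("UPDATE users SET email = 'alice@example.com' WHERE name = 'alice'")

# Inspect the resulting schema and data.
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
rows = conn.execute("SELECT id, name, email FROM users").fetchall()
```

Existing rows simply get NULL in the new column until something fills it in.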

I'm not sure what you think two-phase commits are, but they ARE atomic transactions. They are essentially "Everyone cool with this transaction?" "Yep!" "GO."
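A rough sketch of that handshake, assuming in-memory participants. The `Participant` class and `two_phase_commit` function are hypothetical names for illustration, not any database's API, and the sketch ignores coordinator crashes and lost messages.

```python
class Participant:
    """Hypothetical resource manager that can stage, commit, or discard a change."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.committed = {}   # durable state
        self.pending = None   # staged but not yet visible

    def prepare(self, change):
        # Phase 1: "Everyone cool with this transaction?"
        if not self.will_vote_yes:
            return False
        self.pending = dict(change)
        return True

    def commit(self):
        # Phase 2: "GO." Make the staged change visible.
        self.committed.update(self.pending)
        self.pending = None

    def rollback(self):
        # Discard the staged change; committed state is untouched.
        self.pending = None


def two_phase_commit(participants, change):
    """Commit on all participants, or on none of them."""
    prepared = []
    for p in participants:
        if p.prepare(change):
            prepared.append(p)
        else:
            # One "no" vote aborts everyone who already said yes.
            for q in prepared:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True
```

Usage: `two_phase_commit([Participant("a"), Participant("b")], {"balance": 100})` returns `True` and both participants hold the change; if any participant is built with `will_vote_yes=False`, it returns `False` and nobody's committed state changes.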

The CAP gotcha is that you already have partition tolerance - if you didn't, your database would go "boom" when a node went offline. So, really, it's "pick one".
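A toy illustration of the "pick one" choice, assuming two replicas and a manually toggled partition flag. The `Replica` class and the "CP"/"AP" mode labels are hypothetical and heavily simplified; no real datastore works exactly like this.

```python
class Replica:
    """Hypothetical replica that copies every write to a single peer."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.value = None
        self.peer = None
        self.partitioned = False    # True when the peer is unreachable

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Keep consistency: refuse the write, giving up availability.
                return False
            # Keep availability: accept locally, so replicas may now disagree.
            self.value = value
            return True
        # Healthy network: apply the write on both sides.
        self.value = value
        self.peer.value = value
        return True


def make_pair(mode):
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b
```

With a "CP" pair, a write during the partition is rejected and both replicas keep the old value. With an "AP" pair, the write succeeds on one side and the two replicas diverge until something reconciles them.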

And believe it or not, Facebook moved some stuff from Cassandra to HBase in part because of its stronger consistency model.


You can change schemas on the fly. But no one does it in their application except as part of admin functions.

Two-phase commits are when you write a partial record to the DB and then clean up the transaction. When they complete successfully, they are a complete transaction, but during a failure they are not atomic through the database because it takes extra application logic to clean up.

Partition tolerance is not when one node goes offline, it is when critical nodes (master) or up to n/2 fail.

For CAP, you can optimize for just Consistency, Availability, or Partition tolerance. But you can also pick two - you just cannot do all three.

I did not know Facebook went to HBase, or why it did so - can you provide a link?


It looks like different people use different definitions of partition tolerance; the original meaning was:

"The network will be allowed to lose arbitrarily many messages sent from one node to another."

Which means that, if you have a network, you have partition tolerance, period.

But Stonebraker defines it differently:

"If there is a network failure that splits the processing nodes into two groups that cannot talk to each other, then the goal would be to allow processing to continue in both subgroups."

Great article here:

http://www.cloudera.com/blog/2010/04/cap-confusion-problems-...

As for two-phase commits, I'm not sure what system you've used that requires app knowledge, but (for instance) PostgreSQL does them automagically; you just do PREPARE TRANSACTION and COMMIT PREPARED, and if the transaction fails you need app-level error handling - but presumably you have that error handling even without two-phase commits.

Here's an article on Facebook's move to HBase:

http://www.facebook.com/note.php?note_id=454991608919



