This is a much more recent presentation by Kyle (the author of the linked article) with a more mature version of his Jepsen tool. I imagine he'll get around to a written version of his findings soon; until then, it's worth the time to watch this and learn a bit about distributed systems, databases, and testing. https://www.youtube.com/watch?v=XiXZOF6dZuE
From the point of view of a "devops" developer - who has done hard time in QA and as a DBA as well - the requirement to never lose data changes drastically when performance and scaling enter the equation. The scale slides quickly from "lose no data, ever" to "we can lose a few seconds of transactions if it speeds up the web page" to "we can lose a lot of data and still be OK, as long as we're still online".
Seconds, or even minutes, of lost data do cost the company, but not nearly as much as poor performance. And unfortunately, developers tend to over-value data in that equation, leading to decisions that cause a company problems when it comes time to grow.
Ultimately, the best tool for ensuring business continuity (with few exceptions) is redundancy coupled with a set of proper backups.
Wouldn't message queues help you with this? They can be scaled horizontally fairly easily to handle almost any write load, and then the data gets inserted into the database safely over time.
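To make the queue idea concrete, here's a minimal sketch of buffering writes through a queue and draining them into the database in batches. It uses an in-process queue.Queue and SQLite purely as stand-ins for a real broker and a real database; handle_request and writer_loop are illustrative names, not anything from the article.

```python
import queue
import sqlite3
import threading

# In-process queue standing in for a real message broker (RabbitMQ, Kafka, ...).
write_queue = queue.Queue()

def handle_request(key, payload):
    """Web tier: enqueue the write and return to the client immediately."""
    write_queue.put((key, payload))

def writer_loop(db_path):
    """Background consumer: drain the queue and persist rows in batches."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS events (key TEXT, payload TEXT)")
    while True:
        # Block for the first item, then grab up to 100 more that are already waiting.
        batch = [write_queue.get()]
        while not write_queue.empty() and len(batch) < 100:
            batch.append(write_queue.get_nowait())
        db.executemany("INSERT INTO events VALUES (?, ?)", batch)
        db.commit()

threading.Thread(target=writer_loop, args=("events.db",), daemon=True).start()
handle_request("player-42", '{"score": 100}')
```

Whether the broker itself persists enqueued messages durably then becomes its own question, of course.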
You can also shard your database to distribute write load.
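And a rough sketch of the sharding idea: route each write to one of several databases by hashing the record's key, so no single node sees the whole write load. The three SQLite files here are stand-ins for separate shard servers, and shard_for / insert_event are just illustrative names.

```python
import hashlib
import sqlite3

# Three stand-in shards; in production these would be separate database servers.
SHARDS = [sqlite3.connect(f"shard_{i}.db") for i in range(3)]
for db in SHARDS:
    db.execute("CREATE TABLE IF NOT EXISTS events (key TEXT, payload TEXT)")

def shard_for(key):
    """Deterministically map a record key to one shard."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest, "big") % len(SHARDS)]

def insert_event(key, payload):
    """The write goes only to the shard that owns this key."""
    db = shard_for(key)
    db.execute("INSERT INTO events (key, payload) VALUES (?, ?)", (key, payload))
    db.commit()

insert_event("player-42", '{"score": 100}')
```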
All of those absolutely help, at the cost of added complexity, additional points of failure, and extra hardware/VPS spend. And you still risk losing data, or at least data integrity.
Not to mention there are still hard limits on how quickly you can insert data into a database with 100% durability (which is, of course, impossible, but that's another topic entirely), and there are scales where even these mitigation tactics can't help you anymore - online casino games in particular have this problem, since they persist the state of multiple players very frequently.
They've been shipping a closed-source black box and advertising their own demos and tests for a while now. Just like this one. Not sure how much more I would trust that than a marketing statement saying "oh yeah, we're the most stable, fastest, etc."
That and the quite low limit on transaction duration (5 seconds!) make FoundationDB a no-go for me.
Closed source + very narrow limits due to system design makes this a very hard sell.
It's a pity, because they do seem to have some decent ideas in there. I like the layers that build more complex data models on top of a transactional, distributed key-value store.
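For anyone who hasn't seen the layer concept, here's a toy sketch of the shape of it: a tiny table-like abstraction whose rows and columns are packed into keys of a plain key-value store. The KVStore class below is an in-memory stand-in, not FoundationDB's actual API, and the key encoding is deliberately simplistic.

```python
# Toy in-memory key-value store standing in for a transactional,
# distributed KV store of the kind FoundationDB exposes.
class KVStore:
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

# A "table layer": maps (table, row_id, column) onto flat keys, so a richer
# data model rides entirely on top of plain key-value operations.
class TableLayer:
    def __init__(self, store, table):
        self.store = store
        self.prefix = table.encode("utf-8")

    def _key(self, row_id, column):
        return b"/".join([self.prefix, row_id.encode(), column.encode()])

    def set_cell(self, row_id, column, value):
        self.store.set(self._key(row_id, column), value.encode("utf-8"))

    def get_cell(self, row_id, column):
        value = self.store.get(self._key(row_id, column))
        return value.decode("utf-8") if value is not None else None

# Usage: a "users" table built entirely from key-value pairs.
store = KVStore()
users = TableLayer(store, "users")
users.set_cell("42", "name", "alice")
print(users.get_cell("42", "name"))  # -> alice
```

The appeal is that the store only has to get transactions and distribution right once, and every higher-level model (tables, indexes, queues) is just a different key encoding on top.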
Great read! I think this is the first time I've seen a really good, structured failover test that contrasts Postgres, Redis, and MongoDB (I don't know much about Riak). It would have been interesting to see MySQL in there as well.
This is an awesome article. I would love to see the tests packaged in such a way that we could easily port them to other DBs, and perhaps do something like the web framework shootout, but for database consistency.