I like this article: while it is surely not the definitive guide to NoSQL, it is a short, mostly factual overview that people new to the field can use to get an idea of what a good candidate for initial experimentation might be, given a defined problem to solve.
That said, I think picking the right database is something you can only do with a lot of work. Picking good technologies for your project is hard: you have to try one, then another, and so forth, and even reconsider the state of things after a few years (or months?), given the speed at which the DB landscape has evolved recently.
While I'm at it, I'd like to share that these very days I'm working on a Redis disk back end. I already have a prototype working after a few days of full immersion (I like to use vacation time to work on completely new ideas for Redis).
The idea is that everything is stored on disk, in what is a plain key-value database (complex values are serialized when on disk), and the memory is instead used as an object cache.
It is like taking the current Redis Virtual Memory and inverting the logic completely. The result is the same (working set in memory, the rest on disk), but this implementation means that there is no limit on the data you can put into a single instance, that you don't have slow restarts (data is not loaded into memory unless demanded), and that there is no need to fork() to save. Keys marked as "dirty" (modified) are transferred to disk asynchronously, as needed, by I/O threads.
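To make the pattern concrete, here is a minimal, hypothetical sketch of that kind of write-behind design in Python; the class and its internals are illustrative stand-ins, not Redis code, and eviction of clean keys from the cache is omitted:

    import queue
    import shelve
    import threading

    class DiskBackedCache:
        # Toy write-behind store: disk (a shelve file here) is the
        # authoritative key-value database, memory is an object cache,
        # and a background I/O thread flushes dirty keys.
        def __init__(self, path):
            self.disk = shelve.open(path)   # stand-in for the on-disk KV store
            self.cache = {}                 # in-memory object cache
            self.dirty = queue.Queue()      # keys waiting to be persisted
            threading.Thread(target=self._flush, daemon=True).start()

        def set(self, key, value):
            self.cache[key] = value         # writes hit memory first...
            self.dirty.put(key)             # ...and reach disk asynchronously

        def get(self, key):
            if key not in self.cache:       # cache miss: demand-load from disk,
                self.cache[key] = self.disk[key]  # so restarts are instant
            return self.cache[key]

        def _flush(self):
            while True:                     # the I/O thread
                key = self.dirty.get()
                self.disk[key] = self.cache[key]
                self.disk.sync()

Nothing here needs a fork() to save: persistence happens continuously from the I/O thread rather than from a copy-on-write snapshot of the whole process.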
If everything works as I expect (and initial tests are really encouraging), this means that Redis 2.4 will ship in a few months, completely killing the current Virtual Memory implementation in favor of the new "two back ends" design, where you can choose whether to run an in-memory DB or an on-disk DB where memory is just an LRU cache for the working set.
antirez, it's an honor that you commented, thanks! :)
The new inverted VM logic you describe seems very interesting; I'm very much looking forward to seeing 2.4!
Redis is already more than perfect for what we use it for -- keeping track of stock price data, and distributing it. The size of the DB is known in advance (the number of stocks does not grow very fast), and the performance is perfect.
I think the main business of Redis is still as an in-memory DB / cache / messaging system and so forth. We have a decent implementation from this point of view, so the next logical step is making it work in a cluster.
On the other hand, it's really interesting to see what people can do with the Redis data model when much larger datasets can be used without problems (at the cost of performance, of course... it can't be as fast as memory). VM was my first idea, but I have to admit I don't like the design at this point. This new design can be much better, and we can have it production-ready in a few months. So I'm curious about what will happen in 2011! :)
Much of the below is stolen from their overview page (all needs to be confirmed): http://hbase.apache.org/
WRITTEN IN: Java
MAIN POINT: Hadoop Database
LICENSE: Apache
PROTOCOL: A REST-ful Web service gateway
This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware.
HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase includes:
Convenient base classes for backing Hadoop MapReduce jobs with HBase tables
Query predicate push down via server side scan and get filters (see the sketch after this list)
Optimizations for real time queries
A high performance Thrift gateway
A REST-ful Web service gateway that supports XML, Protobuf, and binary data encoding options
Cascading, Hive, and Pig source and sink modules
Extensible JRuby-based (JIRB) shell
Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
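As a concrete illustration of the predicate push down and Thrift gateway items above, here is a minimal sketch using the third-party happybase Python client; the host, table, and column names are made up:

    import happybase

    # Connect through HBase's Thrift gateway (hostname is an assumption).
    connection = happybase.Connection('hbase-host')
    table = connection.table('stocks')

    # The filter string is evaluated server side, inside the region
    # servers, so only matching rows cross the wire (predicate push down).
    for row_key, columns in table.scan(
            row_prefix=b'AAPL',
            filter="SingleColumnValueFilter('d', 'price', >, 'binary:100')"):
        print(row_key, columns)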
HBase 0.20 has greatly improved on its predecessors:
No HBase single point of failure
Rolling restart for configuration changes and minor upgrades
Random access performance on par with open source relational databases such as MySQL
FOR EXAMPLE: Facebook Messaging Database
BEST USE: Use it when you need random, realtime read/write access to your Big Data.
I disagree with the notion of "no single point of failure" with HBase. While it's true that they got rid of the old hard SPOFs, if a data ("region") node fails, there's a decent chance you've lost some data (there's a short period of time where data hasn't been replicated to HDFS and exists only on the master for that region), at least until you can bring that node back up (hope you've got reliable RAID/backups, or that it wasn't destroyed). HBase is a CP system.
There's also a region master re-election/recovery period that depends on the size of the database, network bandwidth, load, etc. It can be anywhere from 30 seconds to tens of minutes. An outage of a region node makes its key range inaccessible. While that might not be a problem for some, especially in read-only situations, I can think of many applications where that would effectively translate into a total outage.
I'll question you on one point: "Use it when you need random, realtime read/write access to your Big Data."
Does HBase now do a good job of random access? I was always under the impression that it did random access adequately, but that its real strength was scans (based on the ordering of keys).
I did benchmarks on a previous version, and the results were pretty miserable (600ms/lookup on a table with several million columns). It certainly sounds like they've improved.
Thanks, Gordon, I put it in (shortened some lines).
It would be great to have a "more general" for-example, since no one outside Facebook faces the problem of "let's build Facebook's messaging database" :) Any suggestions?
Actually, this would be useful and not superficial to someone who has seen pictures of these fruits but has never seen one in real life. Obviously it can be refined, but the idea is not bad. For example, try this with some quite different but seemingly similar fruits that most people are not as familiar with:
Pineapple Guava (Acca sellowiana) -- Small green fruit. Seeds soft and edible, skin optional. Turpentine flavor signals overripe. Cold hardy and grown in many parts of the US as an evergreen ornamental. Delicious eaten raw.
Strawberry Guava (Psidium cattleianum) -- Tasty small soft red fruit with very fragrant aroma and many small hard seeds. Skin edible, but seeds best avoided. Can be eaten out of hand, but low commercial use. Frost tolerant in mild climates.
White Guava (Psidium guajava) -- True tropical guava, thus barely if at all frost-tolerant. Large fragrant fruit with inedible hard seeds. Usually used for juice or puree, rarely eaten out of hand. Wonderful strong aroma increases with ripeness.
While obviously not of use to a producer of guavas, this sort of cheat sheet might be helpful to someone who happens to encounter one of these varieties in a grocery store or tree nursery. At the least, it might keep someone from breaking their teeth on the inedible hard seeds!
I agree. And I think superficiality is different from generality or summary. It's a summary. And I appreciate the effort of someone taking the time to put their work online.
This article is mostly marketing phrases from the websites of the various projects. Sadly, much of it is inaccurate, extremely skewed, or otherwise not useful for the stated purpose of comparing the listed databases.
For example, CouchDB having a "Main Point" of "DB consistency" might be the case when there is no replication, as it is for Redis. In replicated configurations, it is definitely not true. Further, its MVCC is weaker in many ways than in a Dynamo system like Riak, as you have no way to influence or discover consistency between replicas.
I'm sure folks expert in other systems can identify similar errors in the rest of the post. Can someone explain to me who the target audience is for all these NoSQL comparison articles? They are universally poor, yet universally popular.
My understanding is that in CouchDB you can't guarantee that older versions of documents will still exist (they might be there, but they could have been removed by compaction or simply not replicated).
However, there is a fairly nice way of storing older versions of documents - hold older versions as file attachments on the document. See:
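In rough outline, that trick looks something like this with the python-couchdb client; the database, document, and attachment names are all made up for illustration:

    import json
    import couchdb

    db = couchdb.Server('http://localhost:5984/')['mydb']

    doc = db['some-doc-id']
    old_body = dict(doc)          # snapshot the version we're about to replace

    doc['price'] = 42             # make the actual change
    db.save(doc)

    # Attach the previous version to the document itself: attachments
    # survive compaction and travel with replication, unlike old _revs.
    db.put_attachment(doc, json.dumps(old_body),
                      filename='v-' + old_body['_rev'],
                      content_type='application/json')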
This is probably the biggest misunderstanding of CouchDB, imo. The versioning system in CouchDB is only there to make the seamless replication possible. There's no guarantee that previous versions will still exist at a future time, the way they do in git.
Where CouchDB has some immense possibilities is in distributed applications, not only server side, but also on mobile phones and in browsers. Since you can write and contain an entire webapp inside of CouchDB, you can technically replicate the entire app to your mobile phone, and it'd work offline or online. And if you need your app on another platform -- as long as it has CouchDB, you can just replicate it there.
I never see this mentioned in any overview or comparison of CouchDB.
The sticking point right now, though, is that CouchDB isn't on very many mobile platforms. There have only been experiments with writing CouchDB on top of HTML5's localStorage, and jChris et al. are working on CouchDB for Android.
The "versioning" is really just there to support their optimistic concurrency model, if I recall. The idea is that you know you need to retry your operation if the version hash of the file has gone up since you last read the data and thus you know your local file is out of date.
As I recall, the id field is just a string. It's just common to let it do the automatic "#-hash" representation.
It's been a while since I played with CouchDB though, so I could be off.
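For reference, that retry-on-conflict pattern looks roughly like this with the python-couchdb client; the function, database, and field names are hypothetical:

    import couchdb
    from couchdb.http import ResourceConflict

    db = couchdb.Server('http://localhost:5984/')['mydb']

    def update_with_retry(doc_id, mutate, retries=5):
        # Optimistic concurrency: read, mutate, try to save. If someone
        # else bumped the _rev first, CouchDB rejects the write and we
        # re-read and try again.
        for _ in range(retries):
            doc = db[doc_id]        # fresh read with the current _rev
            mutate(doc)
            try:
                db.save(doc)        # fails if our _rev is stale
                return doc
            except ResourceConflict:
                continue            # lost the race; retry
        raise RuntimeError('too much contention on ' + doc_id)

    update_with_retry('some-doc-id',
                      lambda d: d.update(clicks=d.get('clicks', 0) + 1))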
I didn't mention that (it does not really matter when choosing one right now), but I too consider that one of the most exciting explorations in this area!
Thanks for the solid comparison, it's a breath of fresh air. We're working on the mobile platform SDKs right now, should be rolling them out all through 2011.
Trying to use CouchDB's versioning system seems like depending on a very leaky abstraction. That you can see its versioning is a side effect of how it behaves, not a feature it is providing.
If you need versioned records, you are likely better off identifying your versioning requirements and building to those than trying to piggyback off of something else poorly suited.
Yes, if versioning is the main point, then surely one should take special measures to keep the versions, as in the case of invoices (reversals, reversals of reversals, etc.). For a CMS, for example, it's nice to have most of the history, but nobody gets hurt if some of it gets lost during replication outages, etc.
With AppEngine at Google, MongoDB at Disqus, Cassandra at Facebook, and Redis at GitHub, you can definitely say that SQL databases are one of many options available today and don't dominate like they did 5 years ago.
If I'm not mistaken, the majority of those organizations still rely heavily on relational datastores, except for exceptional workloads. In addition, I believe Facebook has since migrated away from Cassandra to the Hadoop stack for their messaging platform, though they primarily use MySQL (or its successors).
SQL is being replaced in niches that strain its model. Elsewhere, it remains steadfast.
I agree; ultimately it comes down to everyone's own definition of "tyranny". :) I meant it as "not really having any defensible other choice". (While, of course, CDB and BerkeleyDB, for example, have "always" existed.)
I think nobody expects SQL's "market share" to fall to low levels, especially with NoSQL requiring a much deeper understanding of the data and its planned use. NoSQL practically operates on a lower layer than SQL does.
Still, it's nice to see people thinking about data storage choices and not going blindly to MySQL/Oracle/etc!
After using MongoDB with Mongoid, I strongly disagree with the premise that over the next 5 years SQL databases will be the default and NoSQL will be limited to certain niches. And I know SQL better than most. No more ALTER TABLE ... or db:migrate for me.
I know that Facebook created Cassandra, but do they use it for anything substantial? I read that Cassandra was created for the Facebook inbox, but I more recently read that Facebook is now using HBase for the messaging platform.
Yea, I knew the post would be vacuous when I hit "tyranny", but I guess such flourishes make good HN bait, speaking in sweeping terms like that. There's little about on-disk consistency, data-loss risks, or scale characteristics, yet there's coverage of the wire protocol? Meh. Sure, I'm glad to have a lot of these tools available -- the bad old days of rolling your own stuff with Berkeley DB or NDBM seem safely behind us -- but the reality is that there are many classes of problems for which SQL is still, and will remain, the most sensible solution. Get over it.
I don't understand the 15-year figure either; is that a reference to when MySQL was initially released? I hope the original poster understands that SQL is older than that.
Not that they get thrown away, they are still very useful. But at least there's an alternative now.
I mean, seriously, Zabbix keeping monitoring data in MySQL? Also Piwik? That's a sick solution, IMHO. :)
(BTW, I love Zabbix and Piwik. I use them both. It's just that, with no good alternative data store available at the time they were written, their data storage is suboptimal.)
What we're missing are similar articles that go into the disadvantages and the implications for deployment.
E.g., I have found that deploying Tokyo Tyrant in a Rails project requires you to write some scripts to ensure that things run properly. Also, the DB size has to be set in the configuration in advance.
MongoDB, OTOH, is not designed for a single-server environment, has a very small max document size, easily gets corrupted if the process is stopped, etc.
CouchDB & MongoDB both share one property that this comparison misses (or mentions only in passing).
Both are schema-free datastores. For me, this is the biggest, most useful difference between them and traditional SQL databases, because it makes things easy that are very, very hard (or inefficient) in an SQL database.
It's probably also worth noting that other NoSQL solutions don't share this advantage. For example, Cassandra requires all nodes to be restarted to apply a schema change, which can be quite a big deal.
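For instance, a schema-free store happily takes documents of different shapes in the same collection, with no migration step. A minimal sketch with pymongo (database, collection, and field names are made up):

    from pymongo import MongoClient

    products = MongoClient().shop.products

    # Two documents with different fields in the same collection:
    # no ALTER TABLE, no migration, no restart.
    products.insert_one({'name': 'book', 'author': 'Knuth'})
    products.insert_one({'name': 'guava', 'variety': 'strawberry'})

    # Queries simply skip documents that lack the field.
    for p in products.find({'variety': 'strawberry'}):
        print(p['name'])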
Not in the same sense as in an SQL database. You can freely add columns and rows, just not Column Families or Keyspaces. This is because a KS+CF combo is stored in its own file, in a certain order, so that it can be efficiently traversed using the natural ordering. If you don't need this and just want a flat K/V database, you can use a single KS+CF for everything.
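In other words, within a fixed column family each row can carry its own set of columns. A sketch with the pycassa client; the keyspace and column family names are made up and would have to exist already:

    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace')   # keyspace: fixed up front
    users = pycassa.ColumnFamily(pool, 'Users')   # column family: fixed up front

    # Columns inside a row are free-form; rows need not match each other.
    users.insert('alice', {'email': 'alice@example.com', 'lang': 'en'})
    users.insert('bob',   {'email': 'bob@example.com', 'twitter': '@bob'})

    print(users.get('bob'))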
Under protocols you may want to specify MongoDB's as BSON and Cassandra's as Thrift. That would be more helpful than "binary/custom".
Updated:
Also, Redis's main selling point is its extensive data structure/operations support. "Blazingly fast" really depends on what your workload is and what you're comparing it against.
I've had both MongoDB and Cassandra perform nearly as well as Memcache when getting a single document/row when the document was in memory in MongoDB and the row was cached (row cache, not just key cache, so again: fully in memory) in Cassandra.
In-memory operations are fast in many databases. Redis's default configuration (vm-enabled no) just does all operations in memory (with an occasional sync to disk). That's terrible durability but fantastic performance. Most databases, including Redis, can be configured for either that sort of high-performance/low-durability setup or the opposite. It's just that their default settings/behaviors vary widely.
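For example, Redis can be pushed in either direction through a handful of config directives; a sketch via the redis-py client, with illustrative values:

    import redis

    r = redis.Redis()

    # Performance-leaning (roughly the default): occasional RDB snapshots.
    r.config_set('save', '900 1 300 10')     # snapshot after 900s/1 change, 300s/10 changes

    # Durability-leaning: append-only file, fsync'd every second
    # ('always' trades more performance for maximum durability).
    r.config_set('appendonly', 'yes')
    r.config_set('appendfsync', 'everysec')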
There's also VertexDB, a small graph database. It's written in C and uses Tokyo Cabinet to store data, behind a simple HTTP, filesystem-like interface. Its main advantage is links, which let you build graph structures at the database level.
You mention that some of these solutions could be used in the financial industry. I would be cautious about that, especially since some are eventually consistent. If you are just tracking data, though, they may be fine.
Most of the financial sector is eventually consistent. I'm not just talking about traditional banks; I'm talking about the markets as well. Most revolve around batch settlement processes where consistency essentially occurs on a schedule. While at the micro level the components of these systems are largely built on ACID databases, that becomes irrelevant when, as a system, the whole functions as eventually consistent. Dynamo itself is based on the fully ACID BerkeleyDB at the node level.
But Facebook recently dumped it for the inbox feature and began using HBase instead. Not sure if Sequoia backing Riptano is supposed to be a bug or a feature?
One major feature differentiator is something it doesn't really talk about, though - how conducive is each system to Massive Data?
For example, he kind of has a bone to pick with Cassandra, which is probably justified. But from what little I know, one of the features of Cassandra is that it's designed to scale pretty much to infinity. That may be true of a couple of the others, but for some (like CouchDB) it isn't a design goal at all.
Does anyone have adoption numbers for the different NoSQL databases? Or even just for the two most popular ones? I guess some of them will rise above the others in the coming years and some will drop. Adoption numbers would indicate which ones have the most potential to stay around and be accepted as standard NoSQL databases.
So if Cassandra writes are much faster than reads, why would Reddit go that route? Their comment server is consistently breaking on them, and it would seem that a sub-optimal choice of db might be partly to blame.
It's not as lopsided as this article might have you believe, and it has largely been mitigated of late. This is because Cassandra uses read repair, which is a big component of its strategy to make both reads and writes scale linearly while also ensuring durability.
What is your suggestion otherwise? Any distributed database that is going to be inexpensive, performant, scalable, and durable will need to use some kind of quorum read repair system. Riak, Voldemort, and Dynamo all use read repair with high levels of production success.
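For the unfamiliar, here is a toy sketch of the read repair idea in Python; real systems track timestamps or vector clocks per column and combine this with quorum accounting, so everything here is simplified and illustrative:

    class Replica:
        # A toy replica storing (value, timestamp) per key.
        def __init__(self):
            self.data = {}
        def get(self, key):
            return self.data.get(key, (None, 0))
        def put(self, key, value, ts):
            self.data[key] = (value, ts)

    def read_with_repair(replicas, key):
        # Read from every replica, pick the newest value by timestamp,
        # then write it back to any replica that was behind.
        answers = [(r, r.get(key)) for r in replicas]
        value, ts = max((a for _, a in answers), key=lambda a: a[1])
        for replica, (_, t) in answers:
            if t < ts:
                replica.put(key, value, ts)   # the repair
        return value

    # One replica missed a write; a single read repairs it.
    a, b, c = Replica(), Replica(), Replica()
    a.put('k', 'new', 2); b.put('k', 'new', 2); c.put('k', 'old', 1)
    print(read_with_repair([a, b, c], 'k'))   # 'new', and c is fixed up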
I meant database consistency in the sense of never needing "repair tables"; although as far as I know, CouchDB's automatic conflict resolution is also consistent (based on UUIDs), isn't it?
They're not guaranteed to be the primary revision for a document (i.e., the last write could be the one listed as the conflict), but the data is still there so that client code can resolve the conflict.
Also of note, if the edits that caused the conflict are replicated to other nodes, each node will independently choose the same revision to use as the 'primary' document response.
Bottom line, the choice is deterministic and is guaranteed to be preserved, but the choice may not be the last written revision.
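If I understand CouchDB's rule correctly, the deterministic choice can be sketched roughly like this (a simplification: the real algorithm prefers the longer revision history and breaks ties by comparing the rev strings):

    def pick_winner(revs):
        # revs look like '3-a1b2c3': a generation number, a dash, a hash.
        # Higher generation wins; ties break on the hash part, so every
        # node picks the same winner without any coordination.
        def sort_key(rev):
            gen, _, digest = rev.partition('-')
            return (int(gen), digest)
        return max(revs, key=sort_key)

    print(pick_winner(['3-a1b2c3', '3-f9e8d7']))   # '3-f9e8d7' on every node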
Also, bugs that result in corrupt DBs are treated as major bugs, as opposed to a part of the design. I've also not seen reports of index corruption under load; if you have logs or any more information, we'd definitely appreciate it if you could put that info into a ticket on JIRA [1], or even just mail dev@couchdb.apache.org with the details.
Good article; it's a good starting point to help people decide where to begin with a NoSQL solution. But what about OrientDB? Do you plan to add it to this feature comparison?