I am not a MongoDB expert by any stretch, so please correct me where I am wrong....

aloneinkyoto · on Oct 1, 2010

Ease of use is subjective. I don't think writing a MapReduce job needs to be more complex than writing an equivalent SQL query. What really matters is elegance, flexibility, and power.

My personal experience is that the MongoDB model seems to win in most cases, especially when it comes to flexibility and ad-hoc querying. Having a real language (JavaScript) and a flexible schema tends to make most business problems easier to express.
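To make the comparison concrete, here is a rough sketch of a map/reduce job next to the SQL it would replace. The `orders` data, its field names, and the `runMapReduce` helper are all made up for illustration; the helper is a tiny in-memory stand-in, not MongoDB's actual engine or API.

```javascript
// Hypothetical sample data; the field names are assumptions for this sketch.
const orders = [
  { customer: "alice", total: 30 },
  { customer: "bob", total: 12 },
  { customer: "alice", total: 8 },
];

// Rough SQL equivalent:
//   SELECT customer, SUM(total) FROM orders GROUP BY customer;

// map is called once per document and emits (key, value) pairs.
function map(doc, emit) {
  emit(doc.customer, doc.total);
}

// reduce folds all values emitted for one key into a single value.
function reduce(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Minimal in-memory stand-in for a map/reduce engine (hypothetical helper).
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  for (const doc of docs) {
    mapFn(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceFn(key, values);
  }
  return result;
}

const totals = runMapReduce(orders, map, reduce);
console.log(totals);
```

The map and reduce functions are ordinary JavaScript, so anything the language can express (conditionals, string handling, nested fields) is available inside the job.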

Locke1689 · on Oct 1, 2010

> Ease of use is subjective. I don't think writing a MapReduce job needs to be more complex than writing an equivalent SQL query. What really matters is elegance, flexibility, and power.

Unfortunately, it seems you have completely misunderstood the nature of both SQL and MapReduce. MapReduce is a distributed computation engine. While it can be used in that way, it was never meant to be a database system. BigTable is proof enough of that.

In general, SQL is the syntactical representation of relational algebra with some hacky additions for programmer convenience. Comparing just "SQL" to the MongoDB language model is misguided since you then break down to a question of algebraic expressivity and relational power.

I'm not going to try to build a proof here, but we do know that a formal relational algebra system is equivalent to first-order logic. As far as MongoDB's relational language goes, one would probably have to make an argument that it is equivalent to either tuple or domain relational calculus, but I know of no theoretical work that has attempted this. If anyone has any more information on the theoretical expressiveness of the MongoDB relational system, I would love to read it.

aloneinkyoto · on Oct 1, 2010

I was not arguing about relational algebra or theoretical expressivity or logical equivalence or anything like that. I was simply stating that in practice most business problems are easier to model and more flexible to query in the MongoDB model.

Of course you need some time to get used to thinking in terms of documents rather than tables and rows. But once you get used to the idea, you can easily model most domains that occur in practice.
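As a small illustration of what I mean by thinking in documents, here is a sketch with a made-up blog `posts` collection where tags and comments live inside each post instead of in separate joined tables. The data and field names are assumptions, and the queries are plain JavaScript over an in-memory array rather than real database calls.

```javascript
// Hypothetical documents: each post embeds its tags and comments,
// where a relational schema would typically use posts, tags, and comments tables.
const posts = [
  {
    title: "Why documents",
    tags: ["mongodb", "modeling"],
    comments: [
      { author: "ann", text: "Nice overview" },
      { author: "raj", text: "What about joins?" },
    ],
  },
  {
    title: "SQL still matters",
    tags: ["sql"],
    comments: [],
  },
];

// Titles of posts tagged "mongodb": no join needed, the tags are in the document.
const taggedMongo = posts
  .filter((p) => p.tags.includes("mongodb"))
  .map((p) => p.title);

// Titles of posts that have at least one comment by "ann".
const commentedByAnn = posts
  .filter((p) => p.comments.some((c) => c.author === "ann"))
  .map((p) => p.title);

console.log(taggedMongo, commentedByAnn);
```

In the mongo shell the same two lookups would be roughly `db.posts.find({tags: "mongodb"})` and `db.posts.find({"comments.author": "ann"})`, assuming a collection shaped like the one above.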

> MapReduce is a distributed computation engine. While it can be used in that way it was never meant to be a database system. BigTable is proof enough of that.

Yes, MapReduce in the Google and Hadoop sense is designed for massive batch processing. That's why BigTable and HBase exist. MapReduce in the CouchDB and MongoDB sense is a Turing-complete query and processing layer built on top of a document store. In the CouchDB case, MapReduce is the only way you can query the database.

http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
http://www.mongodb.org/display/DOCS/MapReduce
http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-...