I am not a MongoDB expert by any stretch, so please correct me where I am wrong.
The way I read your link it sounds like I need to store the data in a particular way in order to run a parent/child query. That is great if I know that I need that query at design time. What happens if I have tens of millions of records and need to run that report on an ad-hoc basis? Where the relationship may or may not be important?
What if I want to sum the "cost of goods sold" one day and the "items per transaction" the next? Does that not require someone to write code more complex than SQL? Because on Oracle a business analyst can open up Toad and run that query.
If I am wrong then it very well may be that the problem is one of the enterprise not being aware.
Ease of use is subjective. I don't think writing a MapReduce job needs to be more complex than writing an equivalent SQL query. What really matters is elegance, flexibility and power.
My personal experience is that the MongoDB model seems to win in most cases, especially when it comes to flexibility and ad-hoc querying. Having a real language (JavaScript) and a flexible schema tends to make most business problems easier to express.
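To make the comparison concrete, here is a rough sketch of the two ad-hoc questions from above ("cost of goods sold" one day, "items per transaction" the next), answered once in map/reduce style and once in SQL. This is plain Python with an in-memory SQLite table, not MongoDB's actual API; the `map_reduce` helper, the table, and the field names (`store`, `cogs`, `items`) are all made up for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sales records; the field names are invented for this sketch.
transactions = [
    {"txn_id": 1, "store": "north", "cogs": 12.5, "items": 3},
    {"txn_id": 2, "store": "north", "cogs": 7.5, "items": 1},
    {"txn_id": 3, "store": "south", "cogs": 20.0, "items": 4},
]


def map_reduce(docs, map_fn, reduce_fn):
    """Toy in-process map/reduce: group emitted (key, value) pairs, then reduce."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Day one: total cost of goods sold per store.
cogs_by_store = map_reduce(
    transactions,
    lambda doc: [(doc["store"], doc["cogs"])],
    lambda key, values: sum(values),
)

# Day two: average items per transaction per store.
items_per_txn = map_reduce(
    transactions,
    lambda doc: [(doc["store"], doc["items"])],
    lambda key, values: sum(values) / len(values),
)

# The same two questions as SQL, against an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (txn_id INTEGER, store TEXT, cogs REAL, items INTEGER)")
conn.executemany(
    "INSERT INTO txn VALUES (:txn_id, :store, :cogs, :items)", transactions
)
sql_cogs = dict(conn.execute("SELECT store, SUM(cogs) FROM txn GROUP BY store"))
sql_items = dict(conn.execute("SELECT store, AVG(items) FROM txn GROUP BY store"))

print(cogs_by_store, sql_cogs)
print(items_per_txn, sql_items)
```

Both routes give the same numbers here. Which one is "easier" is exactly the point under dispute: the SQL is one line per question, while the map/reduce version needs two small functions per question but can run arbitrary code inside them.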
> Ease of use is subjective. I don't think writing a MapReduce job needs to be more complex than writing an equivalent SQL query. What really matters is elegance, flexibility and power.
Unfortunately it seems you have completely misunderstood the nature of both SQL and MapReduce. MapReduce is a distributed computation engine. While it can be used in that way, it was never meant to be a database system. BigTable is proof enough of that.
In general, SQL is the syntactical representation of relational algebra with some hacky additions for programmer convenience. Comparing just "SQL" to the MongoDB language model is misguided since you then break down to a question of algebraic expressivity and relational power.
I'm not going to try and build a proof here, but we do know that a formal relational algebra system is equivalent to first-order logic. As far as MongoDB's relational language goes, one would probably have to make an argument that it is equivalent to either tuple or domain relational calculus, but I know of no theoretical work that has attempted this. If anyone has any more information on the theoretical expressiveness of the MongoDB relational system, I would love to read it.
I was not arguing about relational algebra or theoretical expressivity or logical equivalence or anything like that. I was simply stating that in practice most business problems are easier to model and more flexible to query in the MongoDB model.
Of course you need some time to get used to thinking in terms of documents rather than tables and rows. But once you get used to the idea, you can easily model most domains that occur in practice.
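As a small illustration of what "documents rather than tables and rows" can look like, here is a sketch where each order carries its line items inside it, so the parent/child relationship lives in the record itself instead of in a join. It is plain Python dictionaries, not a real MongoDB collection, and the field names (`lines`, `sku`, `qty`, `unit_cost`) are hypothetical.

```python
from collections import Counter

# Each order embeds its own line items (the "child" rows in a relational design).
orders = [
    {"_id": 1, "customer": "acme", "lines": [
        {"sku": "A1", "qty": 2, "unit_cost": 4.0},
        {"sku": "B7", "qty": 1, "unit_cost": 10.0},
    ]},
    {"_id": 2, "customer": "globex", "lines": [
        {"sku": "A1", "qty": 5, "unit_cost": 4.0},
    ]},
]


def items_per_transaction(order):
    """Total quantity across the line items embedded in one order."""
    return sum(line["qty"] for line in order["lines"])


def cost_of_goods_sold(order):
    """Sum of qty * unit_cost across the embedded line items."""
    return sum(line["qty"] * line["unit_cost"] for line in order["lines"])


# Per-order questions need no join: the children travel with the parent.
items = {order["_id"]: items_per_transaction(order) for order in orders}
cogs = {order["_id"]: cost_of_goods_sold(order) for order in orders}

# A question that cuts across orders still works, but has to unwind the lists.
qty_by_sku = Counter()
for order in orders:
    for line in order["lines"]:
        qty_by_sku[line["sku"]] += line["qty"]

print(items)
print(cogs)
print(dict(qty_by_sku))
```

The per-order questions fall out of the shape of the data. The last one, which groups by something buried inside the documents, is the kind of query the earlier comment was worried about: it is still possible, but it goes against the grain of how the data was stored.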
> MapReduce is a distributed computation engine. While it can be used in that way, it was never meant to be a database system. BigTable is proof enough of that.
Yes, MapReduce in the Google and Hadoop sense is designed for massive batch processing. That's why BigTable and HBase exist. MapReduce in the CouchDB and MongoDB sense is a Turing-complete query and processing layer built on top of a document store. In the CouchDB case MapReduce is the only way you can query the database.