Hacker News new | past | comments | ask | show | jobs | submit login

An optimally optimized query by Postgres would be effectively mapping and reducing.



And the point is that the converse is definitely not true.

Postgres knows about the structure of your data and where it's located, and can do something reasonably optimal. A generic map/reduce algorithm will have to calculate the same thing as Postgres eventually, but it'll have tons of overhead.

(Also, what is with the fad for running map/reduce in the core of the database? Why would this be a good idea? It was a terrible, performance-killing idea on both Mongo and Riak. Is RethinkDB just participating in this fad to be buzzword-compliant?)


While there have been some truly misguided mapreduce implementations, mapreduce is just a computation model that isn't inherently slower than others: A relational aggregation of the type you get with SQL like:

  select foo, count(*) from bar group by foo
...is essentially a mapreduce, although most databases probably don't use a reduce buffer larger than 2. (But they would benefit from it if they could use hardware vectorization, I believe.)

Mapreduce works great if you are already sequentially churning through a large subset of a table, which is typically the case with aggregations such as "count" and "sum". Where mapreduce is foolish is when you try using mapreduce for real-time queries that only seek to extract a tiny subset of the dataset.


There is no relevant knowledge that Postgres has that RethinkDB lacks that lets it evaluate the query more efficiently (besides maybe a row layout with fixed offsets so that it doesn't haven't parse documents, but that's not relevant to the reported problem). A generic map reduce certainly would have more overhead, obviously, but not running-out-of-memory overhead reported above, just the overhead of merging big documents.

The reason you run queries in "the core" of a database is because copying all the data outside the database and doing computations there would be far worse.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: