This sounds like something you could just do in SQL and have it all done in mill...

lclarkmichalek · on Nov 10, 2013

For our highly unstructured, untyped, non-relational artefacts? It would be close to impossible to use a SQL database as the datastore in this particular case, and regardless, it would provide little to no speed increase over our current application, as the limiting factor is the CPU cost of the map function.

alecco · on Nov 10, 2013

Maybe you should consider transforming the data to structure it a bit. Give each object a unique identifier, keep an array of (value, id) pairs for each characteristic you have seen, and keep its reverse index. Then decompose the processing of each object into sub-problems that match the characteristics n-way. It doesn't have to be in SQL, though. I'd try a columnar RDBMS. YMMV
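
A rough sketch of what that layout could look like, in plain Python rather than a real columnar store. The names `build_columns`, `ids_matching` and `match_all` are made up for illustration, and the sketch assumes each object can be read as a dict of characteristic name to value, which may not hold for the artefacts discussed above.

```python
from collections import defaultdict


def build_columns(objects):
    """Split dict-like objects into per-characteristic columns.

    Returns (columns, reverse_index):
      columns[name]        -> list of (value, object_id) pairs
      reverse_index[oid]   -> list of characteristic names seen on that object
    The object id is simply the position in the input list (an assumption).
    """
    columns = defaultdict(list)
    reverse_index = defaultdict(list)
    for object_id, obj in enumerate(objects):
        for name, value in obj.items():
            columns[name].append((value, object_id))
            reverse_index[object_id].append(name)
    return dict(columns), dict(reverse_index)


def ids_matching(columns, name, value):
    """Ids of objects whose characteristic `name` equals `value`."""
    return {oid for v, oid in columns.get(name, []) if v == value}


def match_all(columns, criteria):
    """n-way match: intersect the id sets of every (name, value) criterion."""
    id_sets = [ids_matching(columns, n, v) for n, v in criteria.items()]
    return set.intersection(*id_sets) if id_sets else set()


objects = [
    {"kind": "log", "size": 3},
    {"kind": "log"},
    {"kind": "image", "size": 3},
]
columns, reverse_index = build_columns(objects)
matches = match_all(columns, {"kind": "log", "size": 3})
```

Each criterion is answered by scanning a single column, so the per-object work shrinks to intersecting small id sets. Whether that helps when the cost is dominated by a CPU-heavy map function is a separate question.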

baudehlo · on Nov 10, 2013

I have extremely strong doubts about that. Everything can be modelled in modern databases, and SQL is probably much more powerful than you understand here.

collyw · on Nov 10, 2013

Most of the NoSQL use cases I have heard about seem like they could be done at least as well in SQL.

I have asked a lot of questions on this topic, and no one has yet convinced me (please do if you have a legitimate NoSQL case).

msellout · on Nov 10, 2013

Execution time, after a point, is less important than development time. NoSQL is often faster for development and refactoring, because the schema is easier to change in code than in the database.

Also there's the scale issue. Append-only is helpful. Why have SQL if you can't use all the features?