How to Architect a Query Compiler, Revisited [pdf]

rwhaling · on Aug 27, 2018

It's always nice to see the papers that come out of Tiark Rompf's group. For background, here are some other great ones:

Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs https://infoscience.epfl.ch/record/150347/files/gpce63-rompf...

Collapsing Towers of Interpreters: https://www.cs.purdue.edu/homes/rompf/papers/amin-popl18.pdf

LMS-Verify: Abstraction Without Regret for Verified Systems Programming: http://lampwww.epfl.ch/~amin/pub/lms-verify.pdf

There's a ton more. Nada Amin's talks on the last two papers are excellent also:

https://www.youtube.com/watch?v=QuJ-cEvH_oI https://www.youtube.com/watch?v=Ywy_eSzCLi8

Oh, and I forgot my favorite: Functional Pearl: A SQL to C Compiler in 500 Lines of Code https://www.cs.purdue.edu/homes/rompf/papers/rompf-icfp15.pd...

Some really mind-blowing stuff.

dman · on Aug 27, 2018

Thanks for taking the time to share pointers to other papers! If there is one thing I like better than a post linking to a paper I want to read, its a post linking to a paper I want to read and with comments linking to other related work.

zokier · on Aug 27, 2018

Worth pointing out that "Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs" is also co-authored by Martin Odersky of Scala fame. Definitely looks like good stuff.

tiarkrompf · on Aug 28, 2018

No surprise here, as Martin was my PhD advisor :)

tiarkrompf · on Aug 28, 2018

Thank you -- made my day!

tiarkrompf · on Aug 28, 2018

Hi all,

Author here, great to see our paper on the front page today. Happy to answer any questions!

Since Spark was already mentioned in the comments, you might be interesting in the parallel/follow-on work we did applying similar techniques to accelerate Spark:

"Flare: Optimizing Apache Spark for Scale-Up Architectures and Medium-Size Data" (to appear at OSDI, preprint here: https://www.cs.purdue.edu/homes/rompf/papers/essertel-osdi18...)

If you're running Spark in production and are interested in trying it out, please leave your contact data here: https://flaredata.github.io. We're a small team and have a long waitlist, but should have some open slots in our private beta program this Fall.

Cheers, - Tiark

chrisaycock · on Aug 28, 2018

I'm glad to see the paper specifically contrast iterator approaches (pull model) versus a "data-centric" approach (push model). This was indeed the issue I've seen with Apache Spark's RDD.

Imagine a common pipeline:

  - input data
  - process data
  - output data

Using iterators, the output logic loops over an abstract next() interface provided by the process logic, which in turn loops over another abstract next() interface provided by the input logic. Given a long enough chain of dependent processes, the CPU's cache will be obliterated.

If we were to write code in a regular programming language by hand, we would never use such an inverted model. Instead, we would directly input the data, process it, and then output it. That's exactly what the paper proposes; the authors supply a callback function at the major stages of query evaluation. They use this model in the context of writing a compiler to speed query execution.

It's worth noting that the Spark maintainers also realized that iterators have bad performance and that newer versions of Spark follow the compiler approach:

https://databricks.com/blog/2016/05/23/apache-spark-as-a-com...

atombender · on Aug 28, 2018

As someone working on exactly such a query engine, this is very interesting, thanks.

It's worth noting that Rust uses iterators extensively, yet is able to get around much of the issues with the pull model by being able to optimize and inline iteration loops.

I believe its starting point for getting good performance here is by making iterators — i.e., the std::iter::Iterator trait — a first-class trait. For example, "for v in iter" knows about how to loop over iterators, and "for v in values" sees if "values" implements the IntoIter trait to turn it into an iterator. Rust also avoids vtable lookup when the value is not a polymorphic boxed trait. Together, this means it can aggressively inline and unroll iteration loops. I'm sure they use other tricks, as well.

voidmain · on Aug 28, 2018

Not all kinds of query plans can be expressed as nested loops, though. Index intersections and other ordered merge operations are very useful, and how do you do them with only nested loop plans?