From what I understand, the purpose of the project was to amass an enormous database of code patterns along with SMT-proven, more optimal replacements. Is there also a project to apply the harvested optimizations to code?
Also, couldn't the same principle apply to compilers that emit LLVM bitcode, like GHC and rustc?
The immediate goal is simply to inspire LLVM hackers to improve the instruction combiner.
But yes, these results can be made persistent and used directly to optimize code. We are working on it.
Applying this tool to code emitted by GHC or rustc is trivial; it's all LLVM bitcode. One of my ideas is that these languages are a good use case for a superoptimizer, since the LLVM passes aren't tuned for them, but we'll have to see how that plays out.
What's neat is that not all of these end up needing the InstCombine pass to get optimized; some of them can be handled using InstSimplify. InstSimplify isn't a pass, it's an analysis that gets called many times in the middle of other passes, strengthening them.
> Finally C-Reduce has missed some opportunities to fold constants, so for example we see ~1 instead of -2 in the 2nd example from the top.
Do you compile the code using clang -O<level> or optimize using opt? I would think that InstCombine would canonicalize that for you so Souper would have less work to do.
Sorry if I was unclear. The missing folds are at the C level. At the LLVM level we definitely run InstCombine before Souper (otherwise we would find lots of optimizations that are not actually missing).
I remember a paper on "Superoptimization" back in the '80s. It worked by starting with an algorithm and doing an exhaustive search of every instruction sequence to see whether the sequence implemented the algorithm. Of course, it only worked for very small algorithms, but it produced some very interesting results that were promptly incorporated into just about every compiler.
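Something like this toy Python sketch captures the flavour of that exhaustive search (not the original system: the instruction set, test inputs, and helper names here are made up for illustration, and it only tests a handful of inputs rather than proving anything):

    # Toy brute-force superoptimizer: enumerate short straight-line
    # programs over a tiny instruction set and keep the first one whose
    # behaviour matches a reference function on some test inputs.
    from itertools import product

    MASK = 0xFFFFFFFF  # model 32-bit wrap-around arithmetic

    # Tiny instruction set: each op maps the accumulator to a new value.
    OPS = {
        "neg":  lambda a: (-a) & MASK,
        "not":  lambda a: (~a) & MASK,
        "inc":  lambda a: (a + 1) & MASK,
        "dec":  lambda a: (a - 1) & MASK,
        "shl1": lambda a: (a << 1) & MASK,
    }

    def run(program, x):
        acc = x & MASK
        for op in program:
            acc = OPS[op](acc)
        return acc

    def superoptimize(spec, max_len=3, tests=(0, 1, 2, 7, 0xFFFFFFFF)):
        """Return the shortest op sequence matching `spec` on the tests."""
        for length in range(1, max_len + 1):
            for program in product(OPS, repeat=length):
                if all(run(program, t) == (spec(t) & MASK) for t in tests):
                    return program
        return None

    # Example: look for a sequence computing -x - 1 (i.e. bitwise NOT).
    print(superoptimize(lambda x: -x - 1))   # -> ('not',)

A real superoptimizer would verify each candidate exhaustively or with a solver instead of trusting a few test inputs.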
SMT solvers sound interesting and this Python example looks very approachable.
Does anyone know if there are other cookbook-type "here's how you solve practical problem X with SMT" recipes that might serve as a cargo-cult-style (theory-oblivious) dive into SMT?
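For what it's worth, here is a minimal sketch of the kind of query a superoptimizer poses, written with the Z3 Python bindings (my choice of library and rewrite; the article's example may differ): ask the solver whether the rewrite "~x + 1 -> -x" holds for every 32-bit input.

    # pip install z3-solver
    from z3 import BitVec, prove

    x = BitVec('x', 32)          # a symbolic 32-bit value

    # Original expression and its proposed replacement.
    original    = ~x + 1
    replacement = -x

    # prove() searches for a counterexample to the equivalence;
    # "proved" means no 32-bit input distinguishes the two expressions.
    prove(original == replacement)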
So, I find compiler construction and optimisation a fascinating topic, but have nowhere near enough knowledge to understand this beyond the surface... However, "Superoptimiser" is such a cool name.
Would it be possible to use these techniques on the various JITs that are used in Firefox/Chromium? How hard would that be? The work you folks are doing is wonderful.
I did my Master's dissertation on superoptimization and the JVM. TL;DR: I managed to find shorter versions of simple Java math functions fairly easily, but didn't prove they were any faster or better.