C-Reduce, a C program reducer

brson · on May 21, 2014

John Regehr gave a presentation to the Bay Area Rust user's group recently about his work. Since then we've started applying C-Reduce to Rust bugs and it has been working surprisingly well. If you've got a curly-brace language then give it a try.

steveklabnik · on May 22, 2014

John's talk was amazing. Watch it here: https://air.mozilla.org/rust-meetup-may-2014/

bjz_ · on May 22, 2014

Indeed! That talk was extremely eye-opening. Combined with a fuzzer, C-Reduce can find some delightfully wacky bugs that few humans would think to check. Here is a great one found by Regehr himself: https://github.com/mozilla/rust/issues/13847

nullc · on May 22, 2014

I've tried using this to test my tests— take some code that has (presumably comprehensive tests) and C-reduce it to the smallest program that passes the test... Then go see what important code it removed and fix the tests until it doesn't pass.

Kinda fun, but tedious because it ruthlessly trashes the formatting, so matching up the before and after code is hard.

dbaupp · on May 22, 2014

That sounds like you're effectively using it as a test coverage tool to find code paths that aren't being tested, would that be an accurate summary?

nullc · on May 22, 2014

Not quite, because code can still get run by a test but not be tested.

This isn't all that interesting to do except on code that already has ~100% branch coverage. (Since obviously any unexecuted code is going to get removed, and lcov is a lot less work to use :) )

There aren't really any good mutation testing tools for C— e.g. tools to test your tests by breaking your code and making sure your tests fail. C-reduce can be used as a limited kind of mutation test— one that only tests reductions.

One nice thing is that for code with 100% coverage any successful reduction is quite often a shortcoming in the tests or a missed optimization in the code, which isn't something you can say for most other kinds of mutations. (except for the fact that c-reduce does a lot of formatting garbling that doesn't really change the code)

dbaupp · on May 22, 2014

Ah, nice. So it's more like a "semantic coverage" tool, i.e. finding the code paths that don't have any effect on the output.

jevinskie · on May 22, 2014

Try running the reduced code through clang-format.

rurounijones · on May 22, 2014

In the ruby world there is the "mutant" ( https://github.com/mbj/mutant ) gem that does stuff like this automatically.

It runs your tests and makes sure they all pass then it mutates the code under test (changing a == b to a != b or a <= 10 to a < 10) and makes sure that at least one of your tests fails.

If your tests still pass after the mutation then you know you are not covering something.

Tis a great idea but making a C version with the equivalent functionality would be a lot harder I think.

michaelmior · on May 22, 2014

Curious why you're thinking doing this in C would be a lot harder? All you really need to do is define what some valid transformations are. There are existing mutation testing tools for C[1].

[1] http://stackoverflow.com/questions/4715890/what-mutation-tes...

lilyball · on May 22, 2014

I wonder how hard it is to write a program that will parse two C source files into AST's, diff the AST's, and then map the AST back to the original files and highlight where all the differences are?

JoshTriplett · on May 22, 2014

Quite difficult if you need to handle arbitrary C, primarily because of the preprocessor; preprocessing is a textual step, and before preprocessing, C code can't necessarily be turned into a sensible AST.

You'd likely get better results by running both source files through an indenter and diffing the results. That's a strong argument for enforcing a canonical style via tools like "go fmt" or "indent".

userbinator · on May 22, 2014

FYI the first bug mentioned in their paper has to do with bitfield signedness/sign extension, or in this particular example, the lack thereof. (In my experience, bitfields in C are more trouble than they're worth...)

The demogroup Farbrausch wrote a code-reduction tool for a slightly different purpose, but with a similar motivation: http://fgiesen.wordpress.com/2012/04/08/metaprogramming-for-...

This also reminds me of the joke http://github.com/mattdiamond/fuckitjs which does almost the exact opposite --- reduces code until it does not have any errors.

alec · on May 22, 2014

See also Delta Debugging for a line-based (non-syntax-aware) version of this. Very useful for cutting down test cases to size. I'm enthusiastic that c-reduce is threaded.

https://www.st.cs.uni-saarland.de/dd/

Moral_ · on May 22, 2014

Here is one of the papers relating to the project: http://www.cs.utah.edu/~regehr/papers/pldi12-preprint.pdf

cubancigar11 · on May 22, 2014

Interesting. I was working on something quite similar when I was in PayPal. Here is a prototype: https://github.com/anupamsr/ut-test

Then the powers that be came and took all the credit. Since then I haven't touched it.

sitkack · on May 22, 2014

While I enjoy the sentiment, ;) one cannot have riders on GPLd code, it either is or isn't.

cubancigar11 · on May 22, 2014

Why? The code is GPLd for those who are allowed to use it. It is not free as in beer. But once you get the beer, you will get to know the recipe.

But if you think there is some other license that better expresses my... sentiment... you are welcome :)

(I know, it sounds kiddish. It was a long time ago and I felt really burned because PayPal management, from lower to top, is seriously fscked ethically. It took me more than a year to get interested in creating something new.)

simias · on May 22, 2014

I believe you cannot add restrictions on top of the GPL, which is one of the reasons OpenSSL is not compatible with the GPL for instance.

As such, if you license your code under the GPL anybody at paypal can use and modify it under the terms of the GPL, as per the terms of the license.

What you can do is have a dual license "as long as the combination of restrictions on the work as a whole does not put any additional restrictions beyond what GPL allows"[1].

That being said, I'm not a lawyer, so I might have misinterpreted the wording of the license.

[1] https://en.wikipedia.org/wiki/GNU_General_Public_License#Com...

sitkack · on May 22, 2014

Correct. One cannot have GPL + some other stuff because it isn't GPL at that point and becomes a huge mess. Imagine trying to track all of the exceptions!

sitkack · on May 22, 2014

I am not sure of a license, and I am not skilled in this matter. You could make your own everybody but ... (list of parties not allowed to use), as the poster below said, it could be dual licensed but I don't think the GPL is compatible with exceptions either above or below.

Safest would be everyone except PayPal.