Hacker News new | past | comments | ask | show | jobs | submit login

The major omission in this article is fuzzing. Not only is it practical and in wide (and growing use), it’s also far more advanced than QuickCheck’s approach of generating random inputs, because fuzzing can be _coverage-driven_. Property-based testing came out of academia and fuzzing came out of security research, initially they were not connected. But with the advent of in-process fuzzing (through libFuzzer), which encourages writing small fuzz tests rather than testing entire programs; and structure-aware fuzzing, which enables testing more than just functions that take a bytestring as input, in my view the two techniques have converged. It’s just that the two separate communities haven’t fully realized this yet.

One pitfall with non-coverage-driven randomized testing like QuickCheck, is that how good your tests are depends a lot on the generator. It may be very rarely generating interesting inputs because you biased the generator in the wrong way, and depending on how you do the generation, you need to be careful to ensure the generator halts. With coverage-driven fuzzing all of these problems go away; you don’t have to be smart to choose distributions so that interesting cases are more common, coverage instrumentation will automatically discover new paths in your program and drill down on them.

But isn’t fuzzing about feeding a large program or function random bytestrings as inputs, whereas property-based testing is about testing properties about data structures? It is true that fuzzers operate on bytestrings, but there is no rule that says we can’t use that bytestring to generate a data structure (in a sense, replacing the role of the random seed). And indeed this is what the Arbitrary crate [1] in Rust does, it gives tools and even derive macros to automatically generate values of your data types in the same way that QuickCheck can. The fuzzing community calls this Structure-Aware Fuzzing and there is a chapter about it in the Rust fuzzing book [2]. There are also tools like libprotobuf-mutator [3] that substitute fuzzers’ naive mutation strategies, but even with naive strategies fuzzers can usually get to 100% coverage with appropriate measures (e.g. recomputing checksums after mutation, if the data structure contains checksums).

I am using this extensively in my own projects. For example, RCL (a configuration language that is a superset of json) contains multiple fuzzers that test various properties [4], such as idempotency of the formatter. In the beginning it used the raw source files as inputs but I also added a more advanced generator that wastes less cycles on inputs that get rejected by the parser. The fuzzer has caught serveral bugs, and most of them would have had no hope of being found with naive randomized testing, because they required a cascade of unlikely events.

Structure-aware fuzzing is not limited to generating data structures either, you can use it to generate reified commands to test a stateful API, as you describe in the _Stateful property-based testing_ section. The Rust fuzzing book has an example of this [5], and I use this approach to fuzz a tree implementation in Noblit [6].

[1]: https://docs.rs/arbitrary/latest/arbitrary/ [2]: https://rust-fuzz.github.io/book/cargo-fuzz/structure-aware-... [3]: https://github.com/google/libprotobuf-mutator [4]: https://docs.ruuda.nl/rcl/testing/#fuzz-tests [5]: https://rust-fuzz.github.io/book/cargo-fuzz/structure-aware-... [6]: https://github.com/ruuda/noblit/blob/a0fd1342c4aa6e05f2b1c4e...




Agreed. A while back I played around with fuzzcheck [1], which let's you write coverage-guided, structure-aware property tests, but the generation is smarter than just slamming a fuzzer's `&[u8]` input into `Arbitrary`. It also supports shrinking, which is nice. Don't know that I would recommend it though. It seemed difficult to write your own `Mutator`s. It also looks somewhat unmaintained nowadays, but I think the direction is worth exploring.

[1]: https://github.com/loiclec/fuzzcheck-rs/


I spent some time looking at the arbitrary crate in Rust and was left unsatisfied at the shrinking story, which I think is 90% of the value of PBT.


Do you have a model problem which is tricky to shrink?

I implemented a stupid simple shrinker for arbitrary, and I’d love to know a specific example where it fails to shrink in a good way:

https://github.com/matklad/arbtest/blob/0191f93846e9f7e38254...

I know at lest two interesting approaches for making that way smarter, but I don’t yet have a problem where my dumb approach isn’t sufficient.


I think I'll have to try it out! I'm just dubious about approaches that don't put shrinking front and center.


Shrinking is built in to libFuzzer! (`-minimize_crash` in libFuzzer, `cargo fuzz tmin` with cargo-fuzz)


I remember libFuzzer's shrinking not doing a great job for sophisticated structured inputs, but maybe it's worth another look!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: