If I am going to plaster extra asserts and predicates in my code, why not just use a logic programming language or something like SPARK/TLA+ anyway? Not looking for an argument, just asking a question.
I've tried replacing asserts with Mercury code in my C++ pet projects. It felt much easier to insert my thoughts into the source.
By the way, what do you guys think of predicate-based formal verification? As I understand it, C++ Concepts are checked for type compatibility via a logic engine, since Concepts are predicates.
The biggest win of predicate-based formal verification (or really any formal verification at all) is forcing you to carefully think about invariants and properties of your code (independently of whether you can prove them). Even if you can't prove them, these serve as useful formal documentation for maintenance purposes and help guide and maximize the efficiency of other techniques such as traditional testing. If you can characterize your code with useful invariants and properties, then you can write useful unit tests and property tests, rather than writing happy path tests that fail to actually catch many bugs.
Assertions are one piece of the puzzle, but often you want to reason about how something fits into a larger context (the canonical example is the associativity of an operator, which isn't something you can check with an assert statement inside the operator itself), which is where independent predicates come in handy.
In general, I think of fuzzing and property-based testing as a lightweight, practical stand-in for the verification part of formal verification. If your formal properties can be proved quickly (e.g., by a short manual proof or some sort of automated engine), then great! If they can't, you can often still convert them into a fuzzing-friendly format or a property test and get a fairly high degree of confidence in correctness that way.
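To make the associativity example concrete, here's a minimal hand-rolled property test in C++. merge() is a hypothetical operation under test, and the randomized loop is a poor man's stand-in for a property-testing framework or a fuzzer driver.

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Hypothetical operation under test: merging two inclusive ranges.
struct Range { int lo, hi; };

Range merge(Range a, Range b) {
    return { std::min(a.lo, b.lo), std::max(a.hi, b.hi) };
}

bool same(Range a, Range b) { return a.lo == b.lo && a.hi == b.hi; }

int main() {
    std::mt19937 rng(42);  // fixed seed keeps failures reproducible
    std::uniform_int_distribution<int> d(-1000, 1000);
    auto rand_range = [&] {
        int x = d(rng), y = d(rng);
        return Range{std::min(x, y), std::max(x, y)};
    };
    for (int i = 0; i < 100000; ++i) {
        Range a = rand_range(), b = rand_range(), c = rand_range();
        // The property: merge is associative, for any operands.
        assert(same(merge(merge(a, b), c), merge(a, merge(b, c))));
    }
}
```

A real property-based testing library would add input shrinking and smarter generators, but the shape is the same.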
Because they work at different levels and provide different strengths.
Model-checking a TLA+ spec is fully automatic, but the specification is probably at a high level; fuzzing can find low-level implementation problems.
SPARK and languages/tools like it specify at the code level, but verifying at that level, where a fuzzer operates, can be very laborious if it's feasible at all (the same goes for using the proof assistant with TLA+).
In short, software verification comes with huge tradeoffs among labor, soundness, and scale. There is no single solution that subsumes the others.
Wouldn't it require a set of business rules instead of a comprehensive design model? It's logic programming. Based on those rules it can determine whether the input I passed in should produce the output I received.
The logic programming suggestion would be much less work but is usually overlooked. Probably because that kind of subtle, concise code is the polar opposite of brute force fuzzing.
> Wouldn't it require a set of business rules instead of a comprehensive design model?
Well, I was imagining something like a library being tested/verified rather than an application or system. But either way, you have to model the features of the code-being-verified and the model must be comprehensive. An incomplete reference model usually will result in false defects.
It's a lot easier to sprinkle some assertions throughout an existing legacy application than it is to reimplement that application in something like TLA+
What's a good tutorial for fuzzing? I work with many code bases in my consulting, none of which have come close to being "fuzzed". I've googled around, and the learning curve is steeper than expected. I am approaching it like learning a new unit-test framework. Perhaps that's wrong? Is it like integrating a code-coverage tool or a memory analyzer like Valgrind? This article seems like a bunch of high-level entreaties to go fuzz, etc. Maybe I missed something.
It depends really a lot on what your target looks like, both in terms of what interfaces it has and what programming language you're using.
A very simple case that's a good start is if you have C or C++ code and already have some form of command line tool that parses an input file and does something with it noninteractively. Then you can easily use afl, ideally coupled with address sanitizer (a compiler feature to catch memory safety bugs).
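To give a flavor, a harness of the shape afl expects can be this small. parse_image() below is a hypothetical stand-in for whatever your tool does with the file; build with afl's compiler wrapper (afl-gcc or afl-clang-fast) plus -fsanitize=address, then run something like afl-fuzz -i seeds -o findings ./harness @@.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Stand-in for the real code under test; imagine a PNG or font parser here.
bool parse_image(const uint8_t *data, size_t size) {
    return size > 0 && data[0] == 0x89;
}

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    FILE *f = std::fopen(argv[1], "rb");
    if (!f) return 1;
    std::fseek(f, 0, SEEK_END);
    long n = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    std::vector<uint8_t> buf(n > 0 ? static_cast<size_t>(n) : 0);
    size_t got = buf.empty() ? 0 : std::fread(buf.data(), 1, buf.size(), f);
    std::fclose(f);
    // A crash or an AddressSanitizer report here is what afl counts as a
    // finding; an ordinary "couldn't parse" result is not.
    parse_image(buf.data(), got);
    return 0;
}
```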
If you want to up your game, don't target applications, target functions. The go-to tool is libFuzzer.
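With libFuzzer the harness shrinks to a single function. The entry point below is libFuzzer's actual interface; parse_image() is again a hypothetical function under test, and the target builds with clang -fsanitize=fuzzer,address.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical function under test, linked in from the library being fuzzed.
bool parse_image(const uint8_t *data, size_t size);

// libFuzzer calls this once per generated input, in-process, which is far
// faster than spawning a new process per input as classic afl does.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_image(data, size);
    return 0;  // non-zero return values are reserved by libFuzzer
}
```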
If you leave C/C++ then the answer depends really on what programming language and what you even want to test for. (Looking for memory corruption bugs is just the most common thing to do with fuzzing, so if you don't have those, you need something else to look for.)
> I am approaching it like learning a new unit-test framework.
That's reasonable.
If you just take a step back from afl or libfuzzer and consider how you would test your product with arbitrary inputs, the critical steps that you have to take are pretty similar regardless of the framework.
It has to be noninteractive, and ideally its running time shouldn't vary by many orders of magnitude (so you can set sane timeouts beyond which you consider a run a failure). It shouldn't have any nondeterminism or depend on any inputs other than the ones the fuzzing framework generates. It's great if you can avoid filesystem interaction.
"Generating Software Tests" (https://www.fuzzingbook.org/) is pretty great (independent of your programming language) - arguably a must read for anyone interested in software testing.
John Regehr (the author of the blog post) has written more great posts on fuzzing.
The short version is that you strip down the interesting main loop of your program to accept an item of input, process it, and exit, and then the fuzzer takes over.
I was in the same place for a long time. Recently I found the bar to entry in fuzzing rust code amazingly low. The docs aren't great yet but the tools are super easy to use and well put together. I started with the rust fuzz book: https://rust-fuzz.github.io/book/
One thing I do occasionally is to push /dev/urandom into functions in various tests. Read n bytes, use it as input. Not quite as extensive as fuzzing libraries, but it works.
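In code it's something like this sketch (parse_image() is a toy placeholder for whatever function you're feeding):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Toy stand-in for the function under test.
bool parse_image(const uint8_t *data, size_t size) {
    return size > 0 && data[0] == 0x89;
}

// Read n bytes from /dev/urandom and throw them at the function under test.
// No coverage feedback, so most inputs die at the first sanity check, but
// it's a five-minute smoke test you can drop into an existing suite.
void urandom_smoke_test(size_t n, int iterations) {
    FILE *f = std::fopen("/dev/urandom", "rb");
    if (!f) return;
    std::vector<uint8_t> buf(n);
    for (int i = 0; i < iterations; ++i) {
        if (std::fread(buf.data(), 1, n, f) == n)
            parse_image(buf.data(), n);  // pass == didn't crash
    }
    std::fclose(f);
}

int main() { urandom_smoke_test(1024, 10000); }
```

One tip: log the bytes on failure (or swap /dev/urandom for a seeded PRNG) so a failing input is reproducible.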
Speaking of fuzzing, does anybody know of a solution for fuzzing multi-step processes? Suppose I was fuzzing a network application, which requires an entire session for a bug to be discovered. I can't do that with vanilla afl-fuzz; what tool would enable me to fuzz, say, an SSL/TLS library?
> Suppose I was fuzzing a network application, which requires an entire session for a bug to be discovered
I/O is a general problem for fuzzing and IMO the simplest/most general approach is to try and decompose the code under test to find a part that is able to accept a single input stream.
For something like an SSL/TLS library, you could have a fuzzing mode that sets the initial randomness to a fixed value. That lets you create a pre-baked input file that can successfully establish a session, and you'd start the fuzzing from there.
You can do the same thing with session IDs or similar. In general when you're fuzzing you want to fix your random seeds (and date/time inputs etc, as well as reset any external data store modified by the program back to initial conditions) - ideally your program being fuzzed is a pure function of its input.
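A hedged sketch of what that seam can look like (the function name is made up; the #ifdef follows the FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION convention documented for libFuzzer/OSS-Fuzz builds):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

// All randomness in the program flows through this one seam. Under fuzzing
// we substitute a fixed pattern, so a session transcript captured once will
// replay byte-for-byte as a pre-baked input file.
void get_entropy(uint8_t *out, size_t n) {
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    std::memset(out, 0x41, n);  // deterministic "randomness" while fuzzing
#else
    FILE *f = std::fopen("/dev/urandom", "rb");
    if (f) {
        if (std::fread(out, 1, n, f) != n) { /* handle short read for real */ }
        std::fclose(f);
    }
#endif
}
```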
If you want to fuzz a library by making a series of calls, DeepState has special support for that, but that won't handle some other issues with network applications of course.
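Even without DeepState you can hand-roll a series-of-calls target in plain libFuzzer by treating the input bytes as a script of operations. FuzzedDataProvider ships with LLVM; the Connection API here is a hypothetical stand-in for the library under test.

```cpp
#include <fuzzer/FuzzedDataProvider.h>

// Hypothetical API under test; the real library's calls go here.
struct Connection {
    void handshake(const uint8_t *, size_t) {}
    void send_record(const uint8_t *, size_t) {}
    void renegotiate() {}
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fdp(data, size);
    Connection conn;
    // Interpret the fuzzer's bytes as a sequence of operations, so a single
    // input can drive a whole multi-step session.
    while (fdp.remaining_bytes() > 0) {
        switch (fdp.ConsumeIntegralInRange<int>(0, 2)) {
        case 0: {
            auto bytes = fdp.ConsumeBytes<uint8_t>(64);
            conn.handshake(bytes.data(), bytes.size());
            break;
        }
        case 1: {
            auto bytes = fdp.ConsumeBytes<uint8_t>(256);
            conn.send_record(bytes.data(), bytes.size());
            break;
        }
        case 2:
            conn.renegotiate();
            break;
        }
    }
    return 0;
}
```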
> In other words, it detects only crashes. We can do much better than this. Assertions and their compiler-inserted friends — sanitizer checks — are another excellent kind of oracle.
I'm trying to understand this one. Isn't a fuzzer useful for catching unpredicted exceptions? It seems like a non-zero or premature exit of a program is one of the best signals to catch.
Fuzzing can feed all kinds of bad input into a program where the correct result is to print an error message and/or return an error code. E.g. if I feed a malformed PNG into a texture conversion tool, it probably should print an error and exit nonzero, so we can't automatically mark those as test failures.
On the other hand, if my same texture conversion tool hits a debug assertion about a corrupted heap, that should be a test failure. On the one hand, it was kinda predicted - someone wrote a debug assert to catch it after all - but on the other hand, it certainly wasn't supposed to happen, no matter what kind of bad input we were fed. It indicates we're corrupting memory, quite possibly in a way that won't always do the right thing of exiting non-zero, especially if the debug assertions are disabled in release builds.
Even when you encounter assertions in more predictable code, it's often the case that a non-zero premature exit actually wasn't the desired response. Maybe you wanted to exit gracefully with a better error message. Maybe you wanted to log an error and continue execution when your HTTP server receives bad client requests, instead of creating a denial-of-service CVE. Maybe you had a bug, and your code is mishandling entirely valid input.
If you're using assertions to indicate "I believe this should never happen" instead of a general error reporting tool, it can be very nice to make them test failures when fuzzing.
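In miniature, the split looks like this toy sketch (not any real format): expected bad input is reported through the return value, while the assert is the "should never happen" oracle the fuzzer listens for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy parser. Malformed input is an *expected* outcome, reported via the
// return value; the assert states an invariant no input should ever violate.
bool decode_header(const uint8_t *data, size_t size, size_t *body_len) {
    if (size < 4) return false;      // expected: reject truncated input
    size_t n = static_cast<size_t>(data[0]) |
               (static_cast<size_t>(data[1]) << 8);
    if (n > size - 4) return false;  // expected: the length field lies
    *body_len = n;
    assert(*body_len + 4 <= size);   // if a fuzzer ever trips this, the
    return true;                     // validation above has a real bug
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    size_t body_len;
    decode_header(data, size, &body_len);  // returning false is fine;
    return 0;                              // an assert abort is the finding
}
```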
> If you're using assertions to indicate "I believe this should never happen" instead of a general error reporting tool, it can be very nice to make them test failures when fuzzing.
I used to work for a company that used asserts as general error checking and it was awful. It was for embedded firmware, and I always ran tests (including security tests) with debug firmware. Non-debug firmware had asserts disabled. On debug firmware, a failed assert would cause a complete crash with the display showing the source code file and line number of the failed assert.
Some developers were using asserts on simple user input. For example, this firmware had a web server with a REST API. One of the APIs would expect a number as a parameter, and the person doing development added an assert() statement to make sure the API received a number and not letters or other characters.
It made security testing a nightmare and most of the assertion failures I ran into were completely pointless.
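For the record, the anti-pattern versus what it should have been, boiled down (the handler names are made up):

```cpp
#include <cassert>
#include <cctype>
#include <string>

static bool is_number(const std::string &s) {
    if (s.empty()) return false;
    for (char c : s)
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    return true;
}

// Anti-pattern: any client can crash a debug build with one bad request,
// and release builds (asserts compiled out) fall through unchecked.
void set_volume_bad(const std::string &param) {
    assert(is_number(param));
    // ... apply std::stoi(param) ...
}

// Asserts are for programmer errors; user input gets real error handling.
bool set_volume_good(const std::string &param, std::string *error) {
    if (!is_number(param)) {
        *error = "400 Bad Request: parameter must be a number";
        return false;
    }
    // ... apply std::stoi(param) ...
    return true;
}
```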
I'm currently fuzzing some ActiveX controls. Fortunately the top-level code is in JS, so I was able to write a shim to record all the calls and arguments. Then I can replay them to get the control into a valid state before fuzzing individual methods.