- Start Date: 2018-04-16
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
We can now fuzz rust code with 3 different fuzzers (AFL, libFuzzer, and honggfuzz) through 3 different projects (afl.rs, libfuzzer-sys, and honggfuzz-rs respectively).
Even though the general principles of usage are the same in those projects, they use different conventions and APIs making it unnecessarily difficult to fuzz one project with all available fuzzers.
The minimal goal of this RFC is to harmonize the conventions, APIs and as much as possible the user interfaces.
A logical extension of this goal would be to build a new project abstracting all 3 fuzzers behind a common denominator API for an even easier use. (to be more specific, I'm not thinking about cargo-fuzz but something lower level, which could then be used by cargo-fuzz)
Differents fuzzers have different sets of strengths and weaknesses and therefore can be used in complementary ways to fuzz one codebase.
Being able to easily use all 3 fuzzers on one code base will help uncover more the rust ecosystem.
to complete...
As a minimum, we should harmonize the code API and general project structure.
libfuzzer-sys (persistent fuzzing) : executable entry point provided by libfuzzer.a
fuzz_target!(|data: &[u8]| { ... }
afl.rs (forking fuzzing): executable entry point provided by rust
afl::read_stdio_bytes(|string| { ... }
honggfuzz-rs (persistent fuzzing): executable entry point provided by rust
loop {
fuzz!(|data: &[u8]| { ... }
}
or
loop {
fuzz(|data| { ... }
}
- libfuzzer only works in persistent mode (by design)
- AFL supports forking and persistent (https://lcamtuf.blogspot.fr/2015/06/new-in-afl-persistent-mode.html) but persistent mode in not implemented in AFL.rs.
- honggfuzz supports both and honggfuzz-rs uses persistent.
- persistent mode is (a lot)faster and the advantages of forking mode are not relevant for our use case.
I think we should port afl.rs
to use persistent fuzzing for performance and consistency with the others.
-
targets from
afl
andhonggfuzz
uses a file descriptor to read instructions from an external parent process.afl.rs
andhonggfuzz-rs
let the fuzzed target define their ownmain
and do some setup work and then call a function or a macro (defined by the rust fuzzing project) that will then call a function (defined in a C static library from the upstream project) that will read and interpret the instructions from the FD and pass a vector of bytes to a closure provided by the end-user. -
libfuzzer
takes the form of a static C library linked to the fuzzed code. Thelibfuzzer
static library provides the entry point of the final executable.libfuzzer
then internally calls the function to fuzz in the executable. This is very much incompatible with the other fuzzers and makes use of setup code difficult.
I think we should try to find a way to start libfuzzer
from rust code.
- the macro has closure-like syntax allowing the user to choose the data type thanks to the
Arbitrary
crate. - if you only want a slice of bytes, using a macro is unnecessary, a function suffice.
Maybe, we could provide both.
When the user provides a closure to fuzz to our function or macro, this closure needs to be called ad vitam aeternam (assuming persistent fuzzing).
The question is where to put this infinite loop?
interior loop:
// library code
pub fn fuzz<F>(closure: F) where F: Fn(&[u8]) {
loop {
let data = get_bytes_from_parent_fuzzer();
closure(data);
}
}
// user code
fn main() {
fuzz(|data| {
api_to_fuzz(data);
})
}
exterior loop:
// library code
pub fn fuzz<F>(closure: F) where F: Fn(&[u8]) {
let data = get_bytes_from_parent_fuzzer();
closure(data);
}
// user code
fn main() {
loop {
fuzz(|data| {
api_to_fuzz(data);
})
}
}
Advantages of the interior loop
- less user code
- less error-prone (user cannot forget to write the loop)
Advantages of the exterior loop
- more explicit behavior
- can work in more use cases, see example below.
An example that can only be implemented when using exterior loop:
fn main() {
// this function provides an infinite stream of objects and we want to fuzz
// each instance only once.
some_function_from_user_api(|object_from_user_api|{
fuzz!(|data| {
object_from_user_api.method(data);
})
})
}
}
fuzz
/ fuzzing
/ fuzz_target
/ fuzzer::read_bytes
/ fuzzer::fuzz_with_bytes
...
It has also be proposed to use a syntax similar to cargo bench
#[fuzz]
fn test_fuzz(bytes: Vec<u8>) {
...
}
AFL
andlibFuzzer
can be launched many times to use many cores but it requires some setup and the multiple processes needs to be started manually.honggfuzz
takes care of everything.
We should try to do something about AFL
and libFuzzer
as nowadays CPUs often have >8 logic cores.
example:
Cargo.toml
src/
...
target/
debug/
release/
fuzz/
afl/
artifacts/
target/
libfuzzer/
artifacts/
target/
honggfuzz/
artifacts/
target/
instrumented/
not-instrumented/
debug/
seeds/
There are many questions:
- at the root crate directory, should we crate only one fuzzing directory (like
fuzz
orfuzzing
) or more (fuzz_targets
+fuzz_workspace
)?- or even put it/them in the
target
directory?
- or even put it/them in the
- each fuzzer will create one or many builds, should we put them somewhere in the
fuzz
directory or in the top leveltarget
directory alongsidedebug
andrelease
?- the latter solution will need some cooperation with the
cargo
team.
- the latter solution will need some cooperation with the
- should all fuzzers share their generated artifacts? (they are not 100% compatible)
fuzzer_name/{artifacts/target}
versus{artifacts/target}/fuzzer_name
versus something else?-
- a lot of bikeshedding
- cargo tasks ...
- There is a risk of losing functionalities that are specific to some fuzzers.
cargo-fuzz
provides a wrapper to easily crate fuzz targets by taking care of most of the boilerplate code creation. It could be possible to teach cargo-fuzz
to generate boilerplate code compatible with all 3 fuzzers but I don't think it's the best solution.
I see providing a "high level wrapper" and a "low level generic API" as two orthogonal goals but both could be provided by cargo-fuzz
.
cargo-fuzz
is working in a similar but IMO different design space.
see alternatives
Some of the proposed solutions will probably require some help/assistance from the upstream fuzzers and from the cargo team.