write fuzz inputs to a shared memory region before running a task #20803

andrewrk · 2024-07-26T05:37:55Z

Extracted from #20773.

Currently, a fuzz test failure looks like this:

andy@bark ~/t/abc> zig build test --fuzz 
test
└─ run test failure
/home/andy/local/lib/zig/std/testing.zig:546:14: 0x11575c9 in expect (test)
    if (!ok) return error.TestUnexpectedResult;
             ^
/home/andy/tmp/abc/src/main.zig:28:5: 0x1157691 in test.fuzz example (test)
    try std.testing.expect(!std.mem.eql(u8, "canyoufindme", input_bytes));
    ^
failed with error.TestUnexpectedResult
error: the following command exited with error code 1:
/home/andy/tmp/abc/.zig-cache/o/eea1979fed4d51bc1ca0d161af979e22/test --seed=0x48fe2aeb --listen=- 
error: all fuzz workers crashed
error: the following build command failed with exit code 1:
/home/andy/tmp/abc/.zig-cache/o/bc5565bbec3a56db01acb2ab6b348742/build /home/andy/local/bin/zig /home/andy/local/lib/zig /home/andy/tmp/abc /home/andy/tmp/abc/.zig-cache /home/andy/.cache/zig --seed 0x48fe2aeb -Zb2ede88d1c7627c9 test --fuzz

If you rerun the command that it printed, it does not in fact reproduce the issue:

andy@bark ~/t/abc [1]> /home/andy/tmp/abc/.zig-cache/o/eea1979fed4d51bc1ca0d161af979e22/test --seed=0x48fe2aeb
All 2 tests passed.
1 fuzz tests found.

This is due to a lack of communication between the parent process (build runner) and the fuzzing process (test runner).

However, for performance purposes, we don't want any communication between those processes in the hot path. That means we cannot send a message containing the current input before trying it.

Options are:

Follow the lead of other fuzzers by having a "corpus" directory: a set of files memory-mapped into the fuzzer process, one per "interesting" input, with filenames corresponding to the run IDs. The advantages of this approach are that it makes recovery easy and that it could be used to share state across processes. The disadvantage is that it writes to the filesystem in a hot path. Maybe that's OK in practice? I'll have to check.
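A minimal sketch of the corpus-directory option, in Python for brevity rather than Zig. The function names, the hex filename format, and the `.tmp` suffix are all assumptions made for illustration; nothing here reflects what the fuzzer actually does.

```python
import mmap
import tempfile
from pathlib import Path


def save_input(corpus_dir: Path, run_id: int, data: bytes) -> Path:
    """Write one interesting input to its own file, named after the run ID."""
    corpus_dir.mkdir(parents=True, exist_ok=True)
    path = corpus_dir / f"{run_id:016x}"  # hypothetical naming scheme
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(data)
    tmp.replace(path)  # rename into place so readers never see a partial file
    return path


def load_corpus(corpus_dir: Path) -> dict[str, bytes]:
    """Map every corpus file read-only and return its contents by filename."""
    corpus = {}
    for path in sorted(corpus_dir.iterdir()):
        if not path.is_file() or path.suffix == ".tmp":
            continue  # skip subdirectories and in-progress writes
        if path.stat().st_size == 0:
            continue  # an empty file cannot be memory-mapped
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                corpus[path.name] = m[:]
    return corpus


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        save_input(Path(d) / "corpus", 1, b"canyoufindme")
        print(load_corpus(Path(d) / "corpus"))
```

The filesystem write in `save_input` is the hot-path cost mentioned above; it only happens for inputs judged interesting, not for every run.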

Another idea that I had is to have the parent process (build runner) create and share a memory mapping with the fuzzing process (test runner). The fuzzer would use this memory to store its most recent input(s) as well as some metadata (for example stats to display on the UI). The parent process can then read from this shared mapping to display the stats in real time as well as to recover inputs when the fuzzer process crashes.
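To make the shared-mapping idea concrete, here is a rough sketch in Python (not Zig) for Unix-like systems. The parent creates an anonymous shared mapping, the child writes the input into it before running the test body, and the parent reads it back after the child exits. The region layout (a 4-byte length followed by the bytes), the region size, and every function name are assumptions for illustration, and `os._exit(1)` merely stands in for a real crash.

```python
import mmap
import os
import struct

HEADER = struct.Struct("<I")  # assumed layout: 4-byte little-endian length
REGION_SIZE = 4096            # assumed size of the shared region


def write_input(region: mmap.mmap, data: bytes) -> None:
    """Fuzzer side: publish the input before running the test body."""
    if HEADER.size + len(data) > len(region):
        raise ValueError("input does not fit in the shared region")
    region[HEADER.size:HEADER.size + len(data)] = data
    region[0:HEADER.size] = HEADER.pack(len(data))


def read_input(region: mmap.mmap) -> bytes:
    """Parent side: recover the most recent input, e.g. after a crash."""
    (length,) = HEADER.unpack(region[0:HEADER.size])
    return region[HEADER.size:HEADER.size + length]


def example_test(input_bytes: bytes) -> None:
    """Stand-in for the fuzz test in this issue: fails on one magic input."""
    if input_bytes == b"canyoufindme":
        raise AssertionError("TestUnexpectedResult")


def run_and_recover(data: bytes, test) -> tuple[int, bytes]:
    """Fork a child that records `data` and runs `test`; return exit code and input."""
    region = mmap.mmap(-1, REGION_SIZE)  # anonymous shared mapping on Unix
    pid = os.fork()
    if pid == 0:  # child: the "fuzzing process"
        write_input(region, data)
        try:
            test(data)
        except BaseException:
            os._exit(1)  # simulated crash
        os._exit(0)
    _, status = os.waitpid(pid, 0)  # parent: the "build runner"
    recovered = read_input(region)
    region.close()
    return os.waitstatus_to_exitcode(status), recovered
```

No message passes between the two processes while the test runs; the parent only touches the mapping after the child has exited, which keeps the hot path free of communication.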

It might not be such a bad idea to send a message when an "interesting" input is found. This message would perhaps be forwarded to other fuzzing processes, perhaps on the same system or perhaps even on other systems. Then again, using a file system directory as a "corpus" directory would also allow other processes, including peers and parents, to notice and pick up interesting inputs.

This issue is a bit open-ended, but at a minimum, to close it, interesting inputs that are found should be displayed in a reproducible manner, where re-running a particular command will in fact reproduce the crash.


andrewrk · 2024-08-08T00:20:56Z

I'm thinking the next step here is to use .zig-cache/f as the corpus directory, keyed on the fully-qualified unit test names, and then implement AFL's strategy of maintaining a minimal set of inputs that trigger unique execution paths, stored as memory-mapped files in this directory. At some point users may then decide to minimize the inputs and copy them into the source tree, switching over to providing them via the std.testing API.
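As a rough illustration of "a minimal set of inputs that trigger unique execution paths", the Python sketch below keeps one input per distinct coverage signature and prefers the shortest. Treating the set of hit program counters as the path signature is a simplifying assumption (AFL itself tracks bucketed edge hit counts), and the `Corpus` class and `offer` method are hypothetical names.

```python
class Corpus:
    """Keep one input per distinct coverage signature, preferring the shortest."""

    def __init__(self) -> None:
        # Maps a coverage signature to the best input found for it so far.
        self.by_path: dict[frozenset[int], bytes] = {}

    def offer(self, data: bytes, hit_pcs: set[int]) -> bool:
        """Store `data` if its signature is new or it is shorter; return True if stored."""
        key = frozenset(hit_pcs)
        current = self.by_path.get(key)
        if current is not None and len(current) <= len(data):
            return False  # an equal or shorter input already covers this path
        self.by_path[key] = data
        return True
```

Each entry in `by_path` would correspond to one file in the corpus directory for the test in question.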

gcoakes · 2024-08-11T14:44:05Z

> fuzzer would use this memory to store its most recent input(s)

The current implementation as of today has a shared, memory-mapped file at .zig-cache/v/<program_counter_digest>. I don't think that is the appropriate place to map the current input, since that would prohibit parallel processes from fuzzing the same set of program counters. Also, there is currently an assumption that only a single test function will be fuzzed within a given test process. Would it be a good idea for each process to use a shared, memory-mapped file located at .zig-cache/f/<test_fqn>/<pid> as the backing for Fuzzer.input? It could be renamed by the parent process when a crash occurs, or by the fuzzing process when an error occurs.
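A small Python sketch of the rename step, just to pin down what is being proposed. The `f/<test_fqn>/<pid>` layout comes from the paragraph above; the `crash-<pid>` destination name and both function names are invented for illustration.

```python
import os
import tempfile
from pathlib import Path


def input_path(cache_root: Path, test_fqn: str, pid: int) -> Path:
    """Location of the per-process input file: <cache_root>/f/<test_fqn>/<pid>."""
    return cache_root / "f" / test_fqn / str(pid)


def keep_crash_input(cache_root: Path, test_fqn: str, pid: int) -> Path:
    """Rename the input file of a crashed process so it is not overwritten."""
    src = input_path(cache_root, test_fqn, pid)
    dst = src.with_name(f"crash-{pid}")  # hypothetical naming scheme
    os.replace(src, dst)  # a rename within one directory, no data is copied
    return dst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = input_path(Path(d), "main.test.fuzz example", os.getpid())
        p.parent.mkdir(parents=True)
        p.write_bytes(b"canyoufindme")
        print(keep_crash_input(Path(d), "main.test.fuzz example", os.getpid()))
```

Renaming also frees the `<pid>` name, so a later process that happens to receive a recycled PID would not clobber the saved input.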

> keyed on the fully-qualified unit test names

Does fully-qualified additionally include a build ID for that build of the test? Or, would you want subsequent builds to retain the same cached "interesting" inputs? If the latter, I think we would need to add a phase in which it reanalyzes the cached inputs according to the current program counters.

breaking change to the fuzz testing API; it now passes a type-safe context parameter to the fuzz function. libfuzzer is reworked to select inputs from the entire corpus. I tested that it's roughly as good as it was before in that it can find the panics in the simple examples, as well as achieve decent coverage on the tokenizer fuzz test. however I think the next step here will be figuring out why so many points of interest are missing from the tokenizer in both Debug and ReleaseSafe modes. does not quite close #20803 yet since there are some more important things to be done, such as opening the previous corpus, continuing fuzzing after finding bugs, storing the length of the inputs, etc.

andrewrk · 2025-02-12T07:37:10Z

commit message says

> does not quite close #20803 yet

andrewrk added the enhancement and fuzzing labels Jul 26, 2024

andrewrk added this to the 0.14.0 milestone Jul 26, 2024

andrewrk mentioned this issue Jul 26, 2024

integrate fuzz testing into the build system #20773

Merged

andrewrk mentioned this issue Aug 6, 2024

introduce a fuzz testing web interface #20958

Merged

andrewrk mentioned this issue Aug 8, 2024

enhance the fuzzing algorithm to be competitive with other mainstream fuzzers #20804

Open

gcoakes mentioned this issue Aug 15, 2024

Add cross-platform memory map abstraction and use it in libfuzzer. #21083

Open

ProkopRandacek mentioned this issue Aug 29, 2024

Less basic fuzzer #21246

Closed

andrewrk mentioned this issue Sep 14, 2024

Fuzz tests are not discovered by test runner when re-running with same --seed argument #21410

Open

andrewrk modified the milestones: 0.14.0, 0.15.0 Feb 10, 2025

andrewrk mentioned this issue Feb 11, 2025

fuzzer: write inputs to shared memory before running #22862

Merged

andrewrk closed this as completed in #22862 Feb 12, 2025

andrewrk reopened this Feb 12, 2025
