Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more fuzzers and libFuzzer harness #553

Merged
merged 32 commits into from
Jan 17, 2022

Conversation

isaacbrodsky
Copy link
Collaborator

@isaacbrodsky isaacbrodsky commented Dec 22, 2021

This adds additional fuzzing harnesses (consult README.md in the PR for which functions should be covered and uncovered - a few are left uncovered since they are trivial) and also adds an adapter from AFL++ to libFuzzer. The idea being we can move the fuzzer harness from https://github.com/google/oss-fuzz/blob/master/projects/h3/h3_fuzzer.c to in repo (replaces #448) and cover all functions.

Possible improvements include testing the build using afl++ itself and hooking this up to OSS-Fuzz.

cc @AdamKorcz

@coveralls
Copy link

coveralls commented Dec 22, 2021

Coverage Status

Coverage decreased (-0.04%) to 98.141% when pulling d501e51 on isaacbrodsky:llvm-fuzzer-harness into a1157d8 on uber:master.

@isaacbrodsky
Copy link
Collaborator Author

The windows failure appears to be related to a problem fixed in #550 about those functions not having DECLSPEC.

Comment on lines +95 to +98
if(ENABLE_LIBFUZZER)
list(APPEND H3_COMPILE_FLAGS -fsanitize=fuzzer,address,undefined)
list(APPEND H3_LINK_FLAGS -fsanitize=fuzzer,address,undefined)
endif()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these settings don't really work for the other binaries since main gets redefined.

Copy link
Collaborator

@nrabinowitz nrabinowitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great overall - I think my confusion here is mostly my own lack of understanding of fuzzers, not problems with the code, but more comments might be helpful overall.

@@ -44,6 +44,9 @@ option(BUILD_BENCHMARKS "Build benchmarking applications." ON)
option(BUILD_FUZZERS "Build fuzzer applications (for use with afl)." ON)
option(BUILD_FILTERS "Build filter applications." ON)
option(BUILD_GENERATORS "Build code generation applications." ON)
# If ON, libfuzzer settings are used to build the fuzzer harnesses. If OFF, a frontend
# for afl++ is provided instead.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the frontend part of this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

frontend in this context means an implementation of main that accepts arguments in a way that AFL can integrate with. Should I expand on the comments for this option?

H3Index *compacted = calloc(inputSize, sizeof(H3Index));
H3_EXPORT(compactCells)(input, compacted, inputSize);

// fuzz uncompactCells using output of above
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make sense? Would we miss cases for uncompactCells that can't be generated by the output of compact?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A second run of uncompactCells is done below, using the original input data. Not sure if this additional run of uncompact on the compact output is needed but it seemed like a reasonable check to append.


// fuzz compactCells
H3Index *compacted = calloc(inputSize, sizeof(H3Index));
H3_EXPORT(compactCells)(input, compacted, inputSize);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the error here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I consider failing and returning the appropriate error code acceptable behavior. Unacceptable behavior would be undefined behavior, faulting/crashing/out of bounds reads, etc.

Edit: Oh, I see we use the output of this function later. In this particular case I don't think it's too bad since we are guaranteed that the memory pointed to by compacted has been zeroed by calloc. If this were data on the stack using it in a potentially unintialized state could be bad. I will update this one to check the error code.

}
const inputArgs *args = (const inputArgs *)data;
if (args->sz >= 1024) {
return 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my info - it seems like there's a class of issue here where the user passes in an array and a size that don't match, and we get an invalid array access. Since we can't deal with this (we don't know the array size), are we assuming this is outside the bounds of testing, and is user error? Since this is a pretty serious error, do we need to indicate where users might need to be careful in their own code not to hit this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a serious issue with memory management in C. I considered this case to be out of scope for the fuzzers since I don't know of any way to make H3 resilient to this kind of error.

We should be careful to document the expected array sizes when users pass memory into H3 so this can be correctly implemented on the user side.

return 0;
}
inputArgs args;
memcpy(&args, data, sizeof(inputArgs));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my info - why memcpy here and casts in the other tests?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed because we modify args.str on the next line to ensure it meets the contract that it is null terminated. This cannot be done on the original buffer because of const and because libfuzzer will complain.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a comment in that case I think

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add. Come to think of it we may wish to address this in the API of stringToH3 by passing the buffer length.

H3Error err = H3_EXPORT(maxPolygonToCellsSize)(geoPolygon, res, &sz);
if (!err && sz < MAX_SZ) {
if (sz < 0) {
printf("Oh no - sz is negative\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused here - should this return 0 and exit? If not, why print the error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is it will crash on line 57. Not sure if this will be relevant after #551.

return 0;
}
geoPolygon.holes = calloc(geoPolygon.numHoles, sizeof(GeoLoop));
size_t offset = sizeof(inputArgs);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm quite confused here - isn't the fuzzer only going to provide sizeof(inputArgs) worth of data? Where does the rest of the data after offset come from in that case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need to check if this will work with AFL (might need to adjust the seed file so it has data after inputArgs). For libFuzzer it does not know about inputArgs so it will generate larger test cases.

src/apps/fuzzers/fuzzerResolutions.c Outdated Show resolved Hide resolved
Comment on lines +34 to +44
double out;
H3_EXPORT(getHexagonAreaAvgKm2)(args->res, &out);
H3_EXPORT(getHexagonAreaAvgM2)(args->res, &out);
H3_EXPORT(getHexagonEdgeLengthAvgKm)(args->res, &out);
H3_EXPORT(getHexagonEdgeLengthAvgM)(args->res, &out);

int64_t outInt;
H3_EXPORT(getNumCells)(args->res, &outInt);

H3Index pentagons[12];
H3_EXPORT(getPentagons)(args->res, pentagons);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably just a reflection of me not knowing how fuzzers work, but when you're doing multiple sequential tests like this, does it become harder to test the last function because you need to "get past" all the previous ones? Is there any value is splitting out each function separately? But that's also a lot of annoying boilerplate...

Copy link
Collaborator Author

@isaacbrodsky isaacbrodsky Jan 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's correct that if any of the previous functions called were to crash, this function wouldn't be exercised. But in that case we'd still have found a crash warranting investigation anyways. If we explicitly terminate the fuzzing run before reaching this function (i.e. if getNumCells were to return an error then we return 0), then that would not be a good test because this function's invocation would be dependent on the previous function. (And that is not part of the contract of calling this function, as with the memory allocation size functions.) That doesn't happen here so I believe this is a valid exercise of all the functions in this file.

@isaacbrodsky isaacbrodsky merged commit 9330ee1 into uber:master Jan 17, 2022
@isaacbrodsky isaacbrodsky deleted the llvm-fuzzer-harness branch January 17, 2022 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants