-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
better fuzzing #68
better fuzzing #68
Conversation
I'd like to keep the fuzzing code running as similar to a release build as possible. My rationale is that memory safety issues are often subtle, and although it's unlikely that different tree constants would cause any issues, it's not impossible. And as I mentioned in #66, the randomized tests for the |
The reason I submitted this as a PR at all was because it was able to catch #67 which can cause memory issues in release builds too, the fuzzer was just unable to find the case (or did not run long enough). It still took a while even with this fuzzer so a proptest would not have found that bug. Maybe we could put this behind a feature flag so we can fuzz both ways? |
I think I need to come back to this PR later. My covid brain + jet lag is leaving me in a weird emotional state, where I just really really want to keep the separation between fuzz testing and property testing very strict. But I don't think I'm evaluating things rationally right now. I'll revisit this when I have my head on straight again. |
Okay, so now that my head is on straight again, I agree that this is fine as a fuzz test. However, I do still want the main fuzz testing to use release tree constants, for the reasons I outlined above.
I think something along these lines makes sense. I'm wondering if the fuzz testing framework allows using different feature flags for different fuzz executables? That would be ideal, because then we wouldn't even have to think about it: the fuzz tests that need the smaller tree constants will always use them, and the rest will always use the release constants. Also: it looks like this PR includes the changes from #67? Can you split that out, so we can handle them separately? Edit: I mean, make this the PR that adds the fuzz testing, and leave #67 to just add the fix. |
9ef71cb
to
eebe674
Compare
Yeah I added the change from #67 originally because I assumed that you required fuzzing to pass for merging a PR (and fuzzing fails almost imminently with that fix with the smaller chunk size). I assumed that fix would get merged first. I have dropped the commit now. I have now put the smaller chunk size behind a feature flag and restored the old fuzz target. I added my modified versin as a new fuzz target that requires the
Its a bit ugly but at least cargo will refuse to run the fuzzer if you don't specify this feature flag (while keeping the old fuzz target unchanged). |
Cargo.toml
Outdated
@@ -17,6 +17,11 @@ cr_lines = [] # Enable recognizing carriage returns as line breaks. | |||
unicode_lines = ["cr_lines"] # Enable recognizing all Unicode line breaks. | |||
simd = ["str_indices/simd"] | |||
|
|||
# Internal feature: Not part of public stable API | |||
# enables a much smaller chunk size that makes it | |||
# esier to catch bugs without requiring huge text sizes during fuzzing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typos:
- esier - > easier
- Missing period at the end of the sentence.
Once the typos are fixed, this looks ready to merge. Thanks! |
eebe674
to
9be20bf
Compare
I have fixed the typo and added the period. Should be ready to now :) |
Awesome, thanks! I think that leaves just the lines iterator PR. That one's going to take a bit, because I'd like to make sure I fully understand the implementation. |
I am currently in the process of fixing documentation there and moving the lines iterator back. I get that the lines iterator is super hard to review. |
Ah, thanks for the offer! I'll let you know if I think that'll be helpful. But I haven't properly dived into the code yet, so I'll wait until at least trying first. :-) |
While working on #66 I fuzzed the hash implementation.
I noticed that the fuzzer took extremely long to find any results, because chunk boundaries are quite rare (so huge strings are required).
By using
cfg(fuzzing)
in addition tocfg(test)
for enabling a much smaller fuzz size, chunk sizes are much more common and thefuzzer finds problems much quicker.
For example with the change the bug from #67 is quickly detected.
I have removed the
medium.txt
case from the fuzzer because it slows the fuzzing to a crawl due to the huge amount of chunks.I think the main point of the file was to test chunk boundaries anyway which now also occur for
small.txt
and randomly generated text.