Skip to content

Conversation

@BrennanConroy
Copy link
Member

No description provided.

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new fuzzer called Utf8JsonReaderFuzzer to the .NET runtime fuzzing infrastructure. The fuzzer is designed to comprehensively test JSON deserialization functionality by comparing behavior between span-based and sequence-based Utf8JsonReader instances.

Key changes:

  • Implements a comprehensive JSON reader fuzzer that tests various data types and scenarios
  • Adds pipeline configuration to deploy the fuzzer to OneFuzz testing infrastructure
  • Includes testing for both primitive types and complex POCOs with dynamically generated types

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/libraries/Fuzzing/DotnetFuzzing/Fuzzers/Utf8JsonReaderFuzzer.cs Main fuzzer implementation with comprehensive JSON deserialization testing
src/libraries/Fuzzing/DotnetFuzzing/DotnetFuzzing.csproj Project file updated to include the new fuzzer
eng/pipelines/libraries/fuzzing/deploy-to-onefuzz.yml Pipeline configuration to deploy the fuzzer to OneFuzz

}
catch (InvalidOperationException ex) when (ex.Message.Contains("surrogate"))
{
// If the string is not a valid DateTime, we expect an exception
Copy link

Copilot AI Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is incorrect - the code is testing GetString() method, not DateTime parsing. The comment should reflect that this handles invalid surrogate characters in strings.

Suggested change
// If the string is not a valid DateTime, we expect an exception
// If the string contains invalid surrogate characters, we expect an exception

Copilot uses AI. Check for mistakes.
}
catch (InvalidOperationException ex) when (ex.Message.Contains("surrogate"))
{
// If the string is not a valid DateTime, we expect an exception
Copy link

Copilot AI Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is incorrect - the code is testing GetString() method, not DateTime parsing. The comment should reflect that this handles invalid surrogate characters in strings.

Suggested change
// If the string is not a valid DateTime, we expect an exception
// If the string contains invalid surrogate characters, we expect an exception

Copilot uses AI. Check for mistakes.
if (spanArgumentEx?.Message.Equals(seqArgumentEx?.Message) is not true)
{
throw new InvalidOperationException(
$"Span and Sequence readers diverged in exponent exception:{Environment.NewLine}" +
Copy link

Copilot AI Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message incorrectly refers to 'exponent exception' when handling 'same key' ArgumentException. Should be 'same key exception' or similar.

Suggested change
$"Span and Sequence readers diverged in exponent exception:{Environment.NewLine}" +
$"Span and Sequence readers diverged in same key exception:{Environment.NewLine}" +

Copilot uses AI. Check for mistakes.
Comment on lines +327 to +330
if (CompareExceptions(ex, ex))
{
return default!;
}
Copy link

Copilot AI Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing the same exception with itself (ex, ex) will always return true for matching exceptions. This should likely compare different exceptions or handle the case differently.

Suggested change
if (CompareExceptions(ex, ex))
{
return default!;
}
// Handle the exception or log it as needed
// For now, rethrowing the exception to ensure proper handling
throw new InvalidOperationException("An error occurred during token comparison.", ex);

Copilot uses AI. Check for mistakes.
Comment on lines +66 to +67
//Debugger.Launch();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//Debugger.Launch();

// Create a random seed from the last 4 bytes of the input
int len = bytes.Length;
int randomSeed = bytes[len - 1] + (bytes[len - 2] << 8) + (bytes[len - 3] << 16) + (bytes[len - 4] << 24);
s_random = new Random(randomSeed);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting technique. I wonder if passing inputs through our own pseudorandom generator makes it more difficult for the fuzzing engine to guide code coverage. @MihaZupan is this considered good practice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't pretend to understand how exactly it'd affect input generation, but I would expect it to struggle more, especially since the random generator itself isn't being instrumented.

AFAICT, you're only using a couple of random bytes per invocation, so it might be viable to interpret the first N bytes as the random data instead of as a seed.

static readonly Type[] groundTypes = [
typeof(int),
typeof(string),
typeof(decimal),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding the following types for better variety: double, long, Int128, char.

];

void TestRandomPoco(ReadOnlySpan<byte> bytes, ReadOnlySequence<byte> sequence, JsonSerializerOptions options)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider constructing random types recursively: each node can be randomly chosen to either be a ground type, a record, a poco with setters, a dictionary or list type. This should ensure maximal coverage for all built-in converters and JSON shapes. You can avoid extremely deep type graphs or stack overflows by imposing a maximum depth or randomly generated size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants