Properly handle parsing of large request documents #9133

Merged
tobias-tengler merged 2 commits into main from tte/handle-sequences-when-parsing-reqeusts
Feb 18, 2026

Conversation

@tobias-tengler
Member

No description provided.

@tobias-tengler tobias-tengler marked this pull request as ready for review February 18, 2026 12:10
Copilot AI review requested due to automatic review settings February 18, 2026 12:10
Contributor

Copilot AI left a comment

Pull request overview

This PR improves handling of large GraphQL request documents by supporting multi-segment UTF-8 input (ReadOnlySequence<byte>) through the request parser and document hash providers, with added tests for these scenarios.

Changes:

  • Extend Utf8GraphQLRequestParser to parse query values provided as Utf8JsonReader.ValueSequence (multi-segment).
  • Add ReadOnlySequence<byte> hashing support to IDocumentHashProvider and concrete hash providers (MD5/SHA1/SHA256).
  • Add/expand tests for multi-segment parsing and hashing (including a large query case).
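
As a rough illustration of what hashing a multi-segment sequence incrementally can look like, the sketch below feeds each segment to `IncrementalHash` without first copying the sequence into one contiguous buffer. The class and method names are hypothetical, the hex output format is chosen only for the example, and it assumes .NET 5 or later; it is not the code from this PR.

```csharp
using System;
using System.Buffers;
using System.Security.Cryptography;

// Hypothetical helper, not part of HotChocolate.
public static class SequenceHashing
{
    // Hashes every segment of the sequence in order and returns an
    // uppercase hex string of the SHA-256 digest.
    public static string Sha256Hex(ReadOnlySequence<byte> document)
    {
        using var hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

        // Enumerating a ReadOnlySequence<byte> yields one ReadOnlyMemory<byte> per segment.
        foreach (ReadOnlyMemory<byte> segment in document)
        {
            hash.AppendData(segment.Span);
        }

        return Convert.ToHexString(hash.GetHashAndReset());
    }
}
```

Because the digest depends only on the byte stream, a single-segment and a multi-segment sequence with the same content should produce the same hash, which is the property the added hashing tests appear to check.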

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/HotChocolate/Language/test/Language.Web.Tests/Utf8GraphQLRequestParserTests.cs | Adds coverage for parsing large query values delivered as a multi-segment sequence. |
| src/HotChocolate/Language/test/Language.Web.Tests/Sha256DocumentHashProviderTests.cs | Adds hashing tests for single- and multi-segment sequences (Base64/Hex). |
| src/HotChocolate/Language/test/Language.Web.Tests/Sha1DocumentHashProviderTests.cs | Adds hashing tests for single- and multi-segment sequences (Base64/Hex). |
| src/HotChocolate/Language/test/Language.Web.Tests/MD5DocumentHashProviderTests.cs | Adds hashing tests for single- and multi-segment sequences (Base64/Hex). |
| src/HotChocolate/Language/test/Language.Web.Tests/SequenceHelper.cs | Introduces a helper to build multi-segment ReadOnlySequence<byte> instances for tests. |
| src/HotChocolate/Language/src/Language.Web/Utf8GraphQLRequestParser.cs | Adds sequence-aware query extraction and a ParseDocument(ReadOnlySequence<byte>) overload. |
| src/HotChocolate/Language/src/Language.Web/Sha256DocumentHashProvider.cs | Implements sequence hashing via incremental hashing. |
| src/HotChocolate/Language/src/Language.Web/Sha1DocumentHashProvider.cs | Implements sequence hashing via incremental hashing. |
| src/HotChocolate/Language/src/Language.Web/MD5DocumentHashProvider.cs | Implements sequence hashing via incremental hashing. |
| src/HotChocolate/Language/src/Language.Web/IDocumentHashProvider.cs | Adds a new ComputeHash(ReadOnlySequence<byte>) API surface. |
| src/HotChocolate/Language/src/Language.Web/DocumentHashProviderBase.cs | Adds base implementation for sequence hashing with framework-conditional behavior. |

Comments suppressed due to low confidence (1)

src/HotChocolate/Language/src/Language.Web/Utf8GraphQLRequestParser.cs:329

  • Persisted-query detection only checks documentSpan.IsEmpty. When the query JSON string is delivered via Utf8JsonReader.ValueSequence (large / multi-segment), documentSpan stays empty even though a query was provided, so this block can incorrectly treat the request as a persisted-query request if extensions contains a hash. Update the condition to ensure both documentSpan and documentSequence are empty before attempting to extract a persisted-query hash.
        // Handle persisted queries via extensions
        if (documentSpan.IsEmpty
            && documentId.IsEmpty
            && _useCache
            && extensions is not null
            && TryExtractHash(extensions, out var hash))
        {
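
The suggested condition can be isolated as a small predicate. This is only a sketch: `documentSpan` and `documentSequence` mirror the names used in the comment above, while the helper class and method are hypothetical and do not exist in the parser.

```csharp
using System;
using System.Buffers;

// Hypothetical helper illustrating the suggested guard.
public static class PersistedQueryGuard
{
    // True only when the request carried no inline query at all,
    // neither as a contiguous span nor as a multi-segment sequence.
    public static bool HasNoInlineDocument(
        ReadOnlySpan<byte> documentSpan,
        ReadOnlySequence<byte> documentSequence)
        => documentSpan.IsEmpty && documentSequence.IsEmpty;
}
```

With a guard like this, a large query that arrived through `ValueSequence` would no longer fall through to the persisted-query branch just because `documentSpan` is empty.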

Comment on lines +53 to +59
    var r = Assert.Single(batch);
    Assert.Null(r.OperationName);
    Assert.Null(r.DocumentId);
    Assert.Null(r.Variables);
    Assert.Null(r.Extensions);
    r.Document.MatchSnapshot();
}
Copilot AI Feb 18, 2026

This test snapshot is extremely large (the generated Parse_Large_Query_Sequence.snap is ~23k lines). Very large snapshots slow down review/CI and are hard to maintain. Prefer asserting specific properties (e.g., selection count, a few representative field names, and/or that parsing succeeds without allocating a single segment) instead of snapshotting the entire parsed document.

Comment on lines 25 to +32
OperationDocumentHash ComputeHash(ReadOnlySpan<byte> document);

/// <summary>
/// Computes the hash of a GraphQL operation document.
/// </summary>
/// <param name="document">The GraphQL operation document.</param>
/// <returns>The hash of the GraphQL operation document.</returns>
OperationDocumentHash ComputeHash(ReadOnlySequence<byte> document);
Copilot AI Feb 18, 2026

Adding a new member to the public IDocumentHashProvider interface is a breaking change for any external consumers that implement this interface. If backwards compatibility is required, consider introducing a new interface (e.g., IDocumentHashProvider2) or providing sequence hashing via an extension/helper that falls back to the span overload for single-segment sequences.
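
A minimal sketch of the extension-method alternative mentioned above, assuming a simplified stand-in interface: `IHashProvider` and its `string` return type are hypothetical substitutes for `IDocumentHashProvider` and `OperationDocumentHash`, used only to keep the example self-contained.

```csharp
using System;
using System.Buffers;

// Hypothetical stand-in for IDocumentHashProvider; only the span overload is required.
public interface IHashProvider
{
    string ComputeHash(ReadOnlySpan<byte> document);
}

public static class HashProviderExtensions
{
    // Adds sequence support without changing the interface.
    public static string ComputeHash(this IHashProvider provider, ReadOnlySequence<byte> document)
    {
        // Single-segment sequences can be hashed in place.
        if (document.IsSingleSegment)
        {
            return provider.ComputeHash(document.First.Span);
        }

        // Multi-segment sequences are copied once into a pooled buffer.
        var length = checked((int)document.Length);
        var rented = ArrayPool<byte>.Shared.Rent(length);

        try
        {
            document.CopyTo(rented);
            return provider.ComputeHash(rented.AsSpan(0, length));
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

The trade-off is that existing implementers keep compiling, but they cannot override the sequence path with a copy-free incremental implementation the way an interface member allows.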

Comment on lines +53 to +65
#if NETSTANDARD2_0
        var length = checked((int)document.Length);
        var rented = ArrayPool<byte>.Shared.Rent(length);

        try
        {
            document.CopyTo(rented);
            return ComputeHash(rented.AsSpan(0, length));
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
Copilot AI Feb 18, 2026

On NETSTANDARD2_0, the multi-segment ComputeHash(ReadOnlySequence<byte>) path rents/copies the sequence into rented, but then calls ComputeHash(rented.AsSpan(...)), which in turn rents and copies again in the span overload (because the span overload always copies into an ArrayPool buffer on NETSTANDARD2_0). This causes a double rent+copy for multi-segment sequences. Consider computing the hash directly from the first rented buffer by calling the ComputeHash(byte[] document, int length) abstraction and formatting it, to avoid the extra allocation/copy.
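
The single rent-and-copy shape could look roughly like the following. It is a sketch under assumptions: it hashes the rented array directly with the array-based `HashAlgorithm.ComputeHash(byte[], int, int)` overload as a stand-in for the `ComputeHash(byte[] document, int length)` abstraction named above, and the helper class name is hypothetical.

```csharp
using System;
using System.Buffers;
using System.Security.Cryptography;

// Hypothetical helper; SHA-256 is used only as an example algorithm.
public static class SingleCopyHashing
{
    public static byte[] ComputeSha256(ReadOnlySequence<byte> document)
    {
        var length = checked((int)document.Length);
        var rented = ArrayPool<byte>.Shared.Rent(length);

        try
        {
            // One copy to make the data contiguous.
            document.CopyTo(rented);

            // Hash the rented array directly instead of passing a span to an
            // overload that would rent and copy a second time.
            using (var sha = SHA256.Create())
            {
                return sha.ComputeHash(rented, 0, length);
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```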

Comment on lines +437 to +446
        if (isEscaped)
        {
            var length = checked((int)documentBody.Length);
            var rented = s_bytePool.Rent(length);

            try
            {
                documentBody.CopyTo(rented);
                ParseDocument(rented.AsSpan(0, length), isEscaped, ref document, ref documentHash, ref documentId);
            }
Copilot AI Feb 18, 2026

In the ReadOnlySequence<byte> overload, the escaped path rents a buffer to make the sequence contiguous and then calls the span overload with isEscaped: true, which causes the span overload to rent another buffer for unescaping. This doubles peak memory and copies for large escaped documents. Consider unescaping directly from the contiguous buffer (e.g., call the span overload with isEscaped: false after unescaping yourself) or refactor so the unescape buffer can be reused instead of renting twice.
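
One way to get a contiguous, unescaped value with a single rented buffer is `Utf8JsonReader.CopyString`, which is available on .NET 7 and later and handles both `ValueSpan` and `ValueSequence`. This is a different technique from the one used in the PR and would not apply to the NETSTANDARD2_0 target; the `Segment` and `QueryReader` types below are hypothetical and exist only for this sketch.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

// Hypothetical segment type for building multi-segment input; not the PR's SequenceHelper.
public sealed class Segment : ReadOnlySequenceSegment<byte>
{
    public Segment(ReadOnlyMemory<byte> memory) => Memory = memory;

    public Segment Append(ReadOnlyMemory<byte> memory)
    {
        var next = new Segment(memory) { RunningIndex = RunningIndex + Memory.Length };
        Next = next;
        return next;
    }
}

public static class QueryReader
{
    // Returns the first JSON string value, unescaped, using one pooled buffer.
    public static string ReadFirstStringValue(ReadOnlySequence<byte> json)
    {
        var reader = new Utf8JsonReader(json);

        while (reader.Read())
        {
            if (reader.TokenType != JsonTokenType.String)
            {
                continue;
            }

            // The escaped length is an upper bound for the unescaped length.
            var maxLength = reader.HasValueSequence
                ? checked((int)reader.ValueSequence.Length)
                : reader.ValueSpan.Length;
            var rented = ArrayPool<byte>.Shared.Rent(maxLength);

            try
            {
                // CopyString unescapes while copying, so no second buffer is needed.
                var written = reader.CopyString(rented);
                return Encoding.UTF8.GetString(rented, 0, written);
            }
            finally
            {
                ArrayPool<byte>.Shared.Return(rented);
            }
        }

        throw new JsonException("No string value found.");
    }
}
```

To exercise the multi-segment path, split the UTF-8 bytes of a request inside the string value, for example `var first = new Segment(bytes.AsMemory(0, 12)); var last = first.Append(bytes.AsMemory(12));`, and pass `new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length)` to `ReadFirstStringValue`.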

@tobias-tengler tobias-tengler merged commit 1c9dd36 into main Feb 18, 2026
117 checks passed
@tobias-tengler tobias-tengler deleted the tte/handle-sequences-when-parsing-reqeusts branch February 18, 2026 12:16
@github-actions
Contributor

Fusion Gateway Performance Results

Simple Composite Query

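| Test | Req/s | Err% |
| --- | --- | --- |
| Constant (50 VUs) | 2937.54 | 0.00% |
| Ramping (0-500-0 VUs) | 3376.80 | 0.00% |

Response Times & Query

| Test | Min | Med | Avg | P90 | P95 | Max |
| --- | --- | --- | --- | --- | --- | --- |
| Constant | 0.79ms | 14.99ms | 16.80ms | 30.95ms | 36.16ms | 173.32ms |
| Ramping | 0.73ms | 63.63ms | 65.91ms | 122.47ms | 140.30ms | 264.40ms |
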
query TestQuery {
  topProducts(first: 5) {
    inStock
    name
    price
    shippingEstimate
    upc
    weight
    reviews {
      id
      body
      author {
        id
        username
        name
      }
    }
  }
}

Deep Recursion Query

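| Test | Req/s | Err% |
| --- | --- | --- |
| Constant (50 VUs) | 738.83 | 0.00% |
| Ramping (0-500-0 VUs) | 829.48 | 0.00% |

Response Times & Query

| Test | Min | Med | Avg | P90 | P95 | Max |
| --- | --- | --- | --- | --- | --- | --- |
| Constant | 9.09ms | 62.74ms | 66.11ms | 81.21ms | 90.45ms | 345.79ms |
| Ramping | 1.84ms | 251.75ms | 259.80ms | 514.76ms | 547.16ms | 683.83ms |
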
query TestQuery {
  users {
    id
    username
    name
    reviews {
      id
      body
      product {
        inStock
        name
        price
        shippingEstimate
        upc
        weight
        reviews {
          id
          body
          author {
            id
            username
            name
            reviews {
              id
              body
              product {
                inStock
                name
                price
                shippingEstimate
                upc
                weight
              }
            }
          }
        }
      }
    }
  }
  topProducts(first: 5) {
    inStock
    name
    price
    shippingEstimate
    upc
    weight
    reviews {
      id
      body
      author {
        id
        username
        name
        reviews {
          id
          body
          product {
            inStock
            name
            price
            shippingEstimate
            upc
            weight
          }
        }
      }
    }
  }
}

Variable Batching Throughput

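| Test | Req/s | Err% |
| --- | --- | --- |
| Constant (50 VUs) | 23037.32 | 0.00% |
| Ramping (0-500-0 VUs) | 18565.39 | 0.00% |

Response Times & Query

| Test | Min | Med | Avg | P90 | P95 | Max |
| --- | --- | --- | --- | --- | --- | --- |
| Constant | 0.09ms | 1.76ms | 2.12ms | 4.03ms | 4.91ms | 52.75ms |
| Ramping | 0.10ms | 9.32ms | 11.29ms | 23.28ms | 28.01ms | 102.07ms |
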
query TestQuery($upc: ID!, $price: Long!, $weight: Long!) {
  productByUpc(upc: $upc) {
    inStock
    shippingEstimate(weight: $weight, price: $price)
  }
}

Variables (5 sets batched per request)

[
  { "upc": "1", "price": 899, "weight": 100 },
  { "upc": "2", "price": 1299, "weight": 1000 },
  { "upc": "3", "price": 15, "weight": 20 },
  { "upc": "4", "price": 499, "weight": 100 },
  { "upc": "5", "price": 1299, "weight": 1000 }
]

Run 22139169896 • Commit e8f0cb2 • Wed, 18 Feb 2026 12:39:04 GMT
