Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughTreat client disconnects ( Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
Router-nonroot image scan passed✅ No security vulnerabilities found in image: |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2741 +/- ##
==========================================
+ Coverage 65.62% 65.99% +0.37%
==========================================
Files 254 254
Lines 26401 33221 +6820
==========================================
+ Hits 17325 21924 +4599
- Misses 7670 9908 +2238
+ Partials 1406 1389 -17
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
router/core/batch.go (1)
183-212:⚠️ Potential issue | 🟠 MajorReturn immediately for canceled batch requests.
This branch still falls through to
writeRequestErrors, so a batched disconnect will try to serialize a GraphQL error after we've already classified the failure as a client-side cancellation. That can reintroduce write-side noise and mutate the observed status in the batch path.💡 Proposed fix
func processBatchError(w http.ResponseWriter, r *http.Request, err error, requestLogger *zap.Logger) { - if errors.Is(err, context.Canceled) { - span := trace.SpanFromContext(r.Context()) - span.RecordError(err) - } else { - ctrace.AttachErrToSpanFromContext(r.Context(), err) - } + if errors.Is(err, context.Canceled) { + trace.SpanFromContext(r.Context()).RecordError(err) + return + } + + ctrace.AttachErrToSpanFromContext(r.Context(), err) requestError := graphqlerrors.RequestError{🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router/core/batch.go` around lines 183 - 212, In processBatchError, the context.Canceled branch records the span but then falls through to writeRequestErrors; modify processBatchError so that when errors.Is(err, context.Canceled) is true you RecordError on the span (as now) and then immediately return to avoid calling writeRequestErrors and serializing a GraphQL error for client cancellations; keep the existing else branch (ctrace.AttachErrToSpanFromContext) and the subsequent handling for non-canceled errors intact.
🧹 Nitpick comments (2)
router/core/engine_loader_hooks_test.go (2)
373-392: Assert the merged attrs at the metric-store boundary too.Right now this only checks the returned slice. The test would still pass if
recordFetchErrorreturned the merged attrs but calledMeasureRequestErrorwith the old slice.Suggested assertion
resultSlice, _ := hooks.recordFetchError(ctx, span, fetchErr, rc, nil, metricAddOpt, prePopulated) span.End() @@ require.True(t, hasExisting, "pre-populated attrs should be preserved") require.True(t, hasErrorCodes, "error codes should be appended") + + require.True(t, store.requestErrorCalled, "MeasureRequestError should be called") + + var metricHasExisting, metricHasErrorCodes bool + for _, attr := range store.requestErrorSliceAttr { + if string(attr.Key) == "existing.attr" { + metricHasExisting = true + } + if string(attr.Key) == "graphql.error.codes" { + metricHasErrorCodes = true + } + } + require.True(t, metricHasExisting, + "MeasureRequestError should receive the pre-populated attrs") + require.True(t, metricHasErrorCodes, + "MeasureRequestError should receive the appended error codes")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router/core/engine_loader_hooks_test.go` around lines 373 - 392, The test currently only asserts the returned slice from hooks.recordFetchError contains merged attributes; extend it to also verify that the metric-store call received the merged slice: capture the arguments passed to the mock MetricStore.MeasureRequestError (or the equivalent MeasureRequestError spy used by hooks), and assert that the captured attrs include both the prePopulated attribute ("existing.attr") and the "graphql.error.codes" attribute (same checks used for resultSlice). Update references to the mock/spy used by the hooks setup (e.g., the MetricStore mock instance) and reuse metricAddOpt/prePopulated/resultSlice names to ensure MeasureRequestError was invoked with the merged attrs.
107-130: Assert the exception event for wrapped cancellations too.This case currently only proves the status/metric behavior. If the wrapped
context.Canceledpath stops emitting the observability event, the regression would still pass.Suggested assertion
require.NotEqual(t, codes.Error, spans[0].Status().Code, "wrapped context.Canceled should not set span status to Error") + require.Len(t, spans[0].Events(), 1, + "wrapped context.Canceled should still be recorded as a span event") + require.Equal(t, "exception", spans[0].Events()[0].Name) require.False(t, store.requestErrorCalled, "MeasureRequestError should not be called for wrapped context.Canceled")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router/core/engine_loader_hooks_test.go` around lines 107 - 130, The test "wrapped context.Canceled does not set span ERROR status" currently omits asserting that an observability exception event is still emitted for wrapped cancellations; update the test after calling hooks.OnFinished(ctx, ds, &resolve.ResponseInfo{Err: wrappedErr}) to inspect the recorded span (spans[0]) events and assert that there is an "exception" event (or an event whose attributes indicate an exception/exception.type/exception.message matching wrappedErr) so the wrapped context.Canceled path still emits the expected exception event; use the existing exporter.GetSpans().Snapshots() and spans[0].Events() APIs to locate and assert the event.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@router-tests/telemetry/span_error_status_test.go`:
- Around line 188-191: Replace the fixed time.Sleep usage with
require.Eventually to avoid races when waiting for async span/log export:
instead of sleeping then calling exporter.GetSpans().Snapshots() and
require.NotEmpty(t, spans), call require.Eventually with a short polling
interval and timeout and inside the poll invoke exporter.GetSpans().Snapshots()
and assert len(spans) > 0 (or use require.NotEmpty within the closure) so the
test waits until all expected spans are exported; update the same pattern found
around the other occurrences that use time.Sleep before checking
exporter.GetSpans().Snapshots() (the instances at the other mentioned blocks
should be changed similarly).
In `@router/core/engine_loader_hooks.go`:
- Around line 274-331: The recordFetchError helper currently records the error
and metrics but never sets the span attribute that marks non-canceled request
failures; update recordFetchError to set the span attribute
rotel.WgRequestError.Bool(true) for real fetch failures (i.e., when fetchErr is
not context.Canceled and not context.DeadlineExceeded or equivalent cancellation
checks) before returning so traces and metrics stay in sync; reference the
recordFetchError function and rotel.WgRequestError to locate where to add
span.SetAttributes(...) alongside the existing metricAttrs append and
otelmetric.WithAttributeSet creation.
---
Outside diff comments:
In `@router/core/batch.go`:
- Around line 183-212: In processBatchError, the context.Canceled branch records
the span but then falls through to writeRequestErrors; modify processBatchError
so that when errors.Is(err, context.Canceled) is true you RecordError on the
span (as now) and then immediately return to avoid calling writeRequestErrors
and serializing a GraphQL error for client cancellations; keep the existing else
branch (ctrace.AttachErrToSpanFromContext) and the subsequent handling for
non-canceled errors intact.
---
Nitpick comments:
In `@router/core/engine_loader_hooks_test.go`:
- Around line 373-392: The test currently only asserts the returned slice from
hooks.recordFetchError contains merged attributes; extend it to also verify that
the metric-store call received the merged slice: capture the arguments passed to
the mock MetricStore.MeasureRequestError (or the equivalent MeasureRequestError
spy used by hooks), and assert that the captured attrs include both the
prePopulated attribute ("existing.attr") and the "graphql.error.codes" attribute
(same checks used for resultSlice). Update references to the mock/spy used by
the hooks setup (e.g., the MetricStore mock instance) and reuse
metricAddOpt/prePopulated/resultSlice names to ensure MeasureRequestError was
invoked with the merged attrs.
- Around line 107-130: The test "wrapped context.Canceled does not set span
ERROR status" currently omits asserting that an observability exception event is
still emitted for wrapped cancellations; update the test after calling
hooks.OnFinished(ctx, ds, &resolve.ResponseInfo{Err: wrappedErr}) to inspect the
recorded span (spans[0]) events and assert that there is an "exception" event
(or an event whose attributes indicate an
exception/exception.type/exception.message matching wrappedErr) so the wrapped
context.Canceled path still emits the expected exception event; use the existing
exporter.GetSpans().Snapshots() and spans[0].Events() APIs to locate and assert
the event.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 95ba86f9-0380-4da8-8e23-5351e8b8d117
📒 Files selected for processing (9)
router-tests/telemetry/span_error_status_test.gorouter/core/batch.gorouter/core/engine_loader_hooks.gorouter/core/engine_loader_hooks_test.gorouter/core/errors.gorouter/core/graphql_prehandler.gorouter/core/operation_metrics_test.gorouter/pkg/trace/transport.gorouter/pkg/trace/transport_test.go
855b64e to
1371973
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
router-tests/telemetry/span_error_status_test.go (1)
190-199:⚠️ Potential issue | 🟡 Minor
require.Eventuallystill waits on partial export conditions.These waits can pass before all relevant spans are exported, which can make the later “no span has
codes.Error” assertions flaky or incomplete under slower export cycles. Please wait for a complete/settled snapshot before asserting across all spans.As per coding guidelines, "For periodic exporters, wait for ALL expected items using require.Eventually, not just one sentinel value, to avoid race conditions with export cycles".
Also applies to: 294-303, 363-366
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router-tests/telemetry/span_error_status_test.go` around lines 190 - 199, The test uses require.Eventually with a sentinel that returns true once a single "Engine - Fetch" span appears, which can pass before all spans are exported and causes flakiness; update the require.Eventually predicate that calls exporter.GetSpans().Snapshots() so it waits for a complete/settled snapshot (e.g., check that the total number of exported spans equals the expected count or that no more new snapshots appear for a short interval) before returning true, and then perform the "no span has codes.Error" assertions against that settled snapshot; apply the same change to the other occurrences using require.Eventually + exporter.GetSpans().Snapshots() in this file (the blocks looking for "Engine - Fetch" and later assertions).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@router-tests/telemetry/span_error_status_test.go`:
- Around line 248-257: The handler currently slices the Authorization header
with authorization[len("Bearer "):], which can panic for short or non-"Bearer "
values; replace that unsafe slice in both occurrences with a safe prefix check
using strings.CutPrefix (or strings.HasPrefix + slicing after verifying length):
check that authorization is non-empty and that strings.CutPrefix(authorization,
"Bearer ") returns token and true, otherwise respond http.StatusUnauthorized;
then pass the safe token into jwtParser.ParseUnverified (the
jwtParser.ParseUnverified call and parsedClaims usage remain unchanged). Also
add an import for strings if missing.
---
Duplicate comments:
In `@router-tests/telemetry/span_error_status_test.go`:
- Around line 190-199: The test uses require.Eventually with a sentinel that
returns true once a single "Engine - Fetch" span appears, which can pass before
all spans are exported and causes flakiness; update the require.Eventually
predicate that calls exporter.GetSpans().Snapshots() so it waits for a
complete/settled snapshot (e.g., check that the total number of exported spans
equals the expected count or that no more new snapshots appear for a short
interval) before returning true, and then perform the "no span has codes.Error"
assertions against that settled snapshot; apply the same change to the other
occurrences using require.Eventually + exporter.GetSpans().Snapshots() in this
file (the blocks looking for "Engine - Fetch" and later assertions).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8780399f-b618-403a-acca-7d43441579a6
📒 Files selected for processing (3)
router-tests/telemetry/span_error_status_test.gorouter/core/engine_loader_hooks.gorouter/core/engine_loader_hooks_test.go
✅ Files skipped from review due to trivial changes (1)
- router/core/engine_loader_hooks_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- router/core/engine_loader_hooks.go
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
router-tests/telemetry/span_error_status_test.go (1)
248-257:⚠️ Potential issue | 🟠 MajorGuard Bearer token parsing to avoid panic on malformed Authorization headers.
Line 253 slices
authorization[len("Bearer "):]without validating the prefix, which can panic for short/non-Bearer values in this mock handler.Proposed fix
authorization := r.Header.Get("Authorization") - if authorization == "" { + token, ok := strings.CutPrefix(authorization, "Bearer ") + if !ok || token == "" { w.WriteHeader(http.StatusUnauthorized) return } - token := authorization[len("Bearer "):] parsedClaims := make(jwt.MapClaims) jwtParser := new(jwt.Parser) _, _, err := jwtParser.ParseUnverified(token, parsedClaims)#!/bin/bash # Verify unsafe bearer slicing is removed and safe prefix parsing is used. rg -nP --type go -C2 '\[\s*len\("Bearer "\)\s*:\s*\]' router-tests/telemetry/span_error_status_test.go router-tests/testenv/testenv.go rg -n --type go -C2 'strings\.CutPrefix\(\s*authorization,\s*"Bearer "\s*\)' router-tests/telemetry/span_error_status_test.go router-tests/testenv/testenv.go🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router-tests/telemetry/span_error_status_test.go` around lines 248 - 257, The handler currently slices authorization[len("Bearer "):] unsafely which can panic on short/malformed Authorization headers; update the authorization parsing in the test handler to safely verify the prefix (e.g., use strings.HasPrefix or strings.CutPrefix on the authorization variable) before extracting token, return http.StatusUnauthorized if the prefix is missing or token is empty, and then proceed to call jwtParser.ParseUnverified with the safe token and parsedClaims (refer to the authorization, token, jwtParser, and parsedClaims symbols to locate the code).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@router-tests/telemetry/span_error_status_test.go`:
- Around line 190-199: The test currently returns from require.Eventually as
soon as a single sentinel span ("Engine - Fetch") appears which can miss
late-exported spans; update the polling lambda used with require.Eventually (the
one calling exporter.GetSpans().Snapshots()) to wait until the full set of
expected spans is present (e.g., verify the snapshot contains all required span
names or a minimum expected length) before returning true, and apply the same
change to the other two polling sites around lines where fetchSpan is set so
subsequent assertions operate on a complete snapshot.
---
Duplicate comments:
In `@router-tests/telemetry/span_error_status_test.go`:
- Around line 248-257: The handler currently slices authorization[len("Bearer
"):] unsafely which can panic on short/malformed Authorization headers; update
the authorization parsing in the test handler to safely verify the prefix (e.g.,
use strings.HasPrefix or strings.CutPrefix on the authorization variable) before
extracting token, return http.StatusUnauthorized if the prefix is missing or
token is empty, and then proceed to call jwtParser.ParseUnverified with the safe
token and parsedClaims (refer to the authorization, token, jwtParser, and
parsedClaims symbols to locate the code).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 172711aa-acf2-4b8c-980e-96931bb2bba6
📒 Files selected for processing (3)
router-tests/telemetry/span_error_status_test.gorouter/core/engine_loader_hooks.gorouter/core/engine_loader_hooks_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- router/core/engine_loader_hooks_test.go
- router/core/engine_loader_hooks.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
router-tests/telemetry/span_error_status_test.go (1)
239-245: Consider clarifying the comment to match the assertion behavior.The comment states the attribute "should not be set" but the code only verifies it's not
truewhen present. Both achieve the test's goal, but the comment could be more precise.📝 Suggested comment clarification
- // The wg.request.error attribute should not be set on the fetch span + // The wg.request.error attribute should not be true on the fetch span for _, attr := range fetchSpan.Attributes() { if attr.Key == "wg.request.error" { require.False(t, attr.Value.AsBool(), "wg.request.error should not be true on fetch span for client disconnects") } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router-tests/telemetry/span_error_status_test.go` around lines 239 - 245, Update the comment above the loop over fetchSpan.Attributes() to accurately describe the test: instead of saying "The wg.request.error attribute should not be set on the fetch span", state that the test verifies the attribute is not true when present (e.g., "If wg.request.error is present on the fetch span, it should not be true"); refer to the loop checking fetchSpan.Attributes() and the assertion that attr.Value.AsBool() is false for attr.Key == "wg.request.error".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@router-tests/telemetry/span_error_status_test.go`:
- Around line 239-245: Update the comment above the loop over
fetchSpan.Attributes() to accurately describe the test: instead of saying "The
wg.request.error attribute should not be set on the fetch span", state that the
test verifies the attribute is not true when present (e.g., "If wg.request.error
is present on the fetch span, it should not be true"); refer to the loop
checking fetchSpan.Attributes() and the assertion that attr.Value.AsBool() is
false for attr.Key == "wg.request.error".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9d2f337d-5673-4e3d-b143-def26637e727
📒 Files selected for processing (3)
router-tests/telemetry/span_error_status_test.gorouter-tests/testenv/testenv.gorouter/core/errors.go
🚧 Files skipped from review as they are similar to previous changes (1)
- router/core/errors.go
…dnt-result-in-a-fetch-error
There was a problem hiding this comment.
🧹 Nitpick comments (1)
router-tests/telemetry/telemetry_test.go (1)
7740-7740: Assertwg.request.error=trueexplicitly (not only attribute count).Line 7740 validates only total attribute count. Please also assert
otel.WgRequestError.Bool(true)here so the test fails on semantic regressions, not just count changes.Suggested patch
- require.Lenf(t, sn[8].Attributes(), 15, "expected 15 attributes, got %d", len(sn[8].Attributes())) + require.Lenf(t, sn[8].Attributes(), 15, "expected 15 attributes, got %d", len(sn[8].Attributes())) + require.Contains(t, sn[8].Attributes(), otel.WgRequestError.Bool(true))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router-tests/telemetry/telemetry_test.go` at line 7740, The test currently only asserts the attribute count for span sn[8]; additionally assert that the span contains the "wg.request.error" attribute set to true by checking otel.WgRequestError.Bool(true) against sn[8].Attributes() (e.g., use require.True or require.Equal to assert the attribute exists and is true for sn[8]). Locate the assertion around require.Lenf(t, sn[8].Attributes(), 15, ...) and add a line that explicitly verifies otel.WgRequestError.Bool(true) is present/true on sn[8] to catch semantic regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@router-tests/telemetry/telemetry_test.go`:
- Line 7740: The test currently only asserts the attribute count for span sn[8];
additionally assert that the span contains the "wg.request.error" attribute set
to true by checking otel.WgRequestError.Bool(true) against sn[8].Attributes()
(e.g., use require.True or require.Equal to assert the attribute exists and is
true for sn[8]). Locate the assertion around require.Lenf(t, sn[8].Attributes(),
15, ...) and add a line that explicitly verifies otel.WgRequestError.Bool(true)
is present/true on sn[8] to catch semantic regressions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5a3fc53e-1e24-4dbb-a51e-c8500c540306
📒 Files selected for processing (1)
router-tests/telemetry/telemetry_test.go
…dnt-result-in-a-fetch-error
This PR fixes places where we mark spans as errors for client disconnects incorrectly.
Summary by CodeRabbit
Bug Fixes
Refactor
Tests
Checklist
Open Source AI Manifesto
This project follows the principles of the Open Source AI Manifesto. Please ensure your contribution aligns with its principles.