feat: ability to add operation info as trace or telemetry attributes#2252
feat: ability to add operation info as trace or telemetry attributes#2252
Conversation
WalkthroughGroup expression evaluation into prioritized buckets, centralize telemetry attribute resolution, extend request.operation with sha256/persisted ID and timing fields, add visitors to detect operation-related expression usage, update call sites to use bucketed expressions, and add extensive tests for logging, telemetry, visitors, and blocking by parsingTime. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
❌ Internal Query Planner CI checks failedThe Internal Query Planner CI checks failed in the celestial repository, and this is going to stop the merge of this PR. |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
router/internal/expr/expr.go (1)
81-91: Consider grouping related fields for better readability.The Operation struct now has 9 fields, but they're interleaved rather than grouped by category. Consider organizing them logically:
- Identifiers: Name, Type, Hash, Sha256Hash, PersistedId
- Timings: ParsingTime, NormalizationTime, ValidationTime, PlanningTime
This would improve maintainability as the struct grows.
type Operation struct { + // Identifiers + Name string `expr:"name"` + Type string `expr:"type"` + Hash string `expr:"hash"` Sha256Hash string `expr:"sha256Hash"` + PersistedId string `expr:"persistedId"` + // Timings ParsingTime time.Duration `expr:"parsingTime"` - Name string `expr:"name"` - Type string `expr:"type"` - PersistedId string `expr:"persistedId"` NormalizationTime time.Duration `expr:"normalizationTime"` - Hash string `expr:"hash"` ValidationTime time.Duration `expr:"validationTime"` PlanningTime time.Duration `expr:"planningTime"` }router/internal/expr/request_operation_bucket_visitor.go (4)
57-66: Deduplicate property-name extraction via a small helper.Cuts repetition and reduces chances of subtle bugs with additional property node types.
@@ +// propNameOf returns identifier/string property names; otherwise "". +func propNameOf(n ast.Node) string { + switch p := n.(type) { + case *ast.StringNode: + return p.Value + case *ast.IdentifierNode: + return p.Value + default: + return "" + } +} @@ - { - prop := "" - switch p := member.Property.(type) { - case *ast.StringNode: - prop = p.Value - case *ast.IdentifierNode: - prop = p.Value - } - if prop == "auth" { + { + if propNameOf(member.Property) == "auth" { if reqIdent, ok := member.Node.(*ast.IdentifierNode); ok && reqIdent.Value == ExprRequestKey { v.setBucketIfHigher(BucketAuth) // no return; higher-priority matches may exist in other nodes } } } @@ - propName := "" - switch p := member.Property.(type) { - case *ast.StringNode: - propName = p.Value - case *ast.IdentifierNode: - propName = p.Value - default: - propName = "" - } - if propName == "" { + prop := propNameOf(member.Property) + if prop == "" { return } @@ - opProp := "" - switch op := opMember.Property.(type) { - case *ast.StringNode: - opProp = op.Value - case *ast.IdentifierNode: - opProp = op.Value - } + opProp := propNameOf(opMember.Property) @@ - switch propName { + switch prop {Also applies to: 76-85, 95-100
108-126: Consider handling unknown request.operation properties explicitly.A default branch (e.g., leave as BucketDefault or emit trace‑level log) can aid debugging when new props appear.
1-129: Add focused tests for bucketing (table‑driven).Cover: subgraph.name, request.auth, and each request.operation property (sha256Hash, parsingTime, name, type, persistedId, normalizationTime, hash, validationTime, planningTime). Assert highest bucket wins when multiple occur.
I can scaffold a test that parses expressions with expr, walks the AST with this visitor, and asserts Bucket.
10-22: Follow Go initialism conventions for exported buckets• Rename in router/internal/expr/request_operation_bucket_visitor.go:
– consts: BucketSha256→BucketSHA256, BucketPersistedId→BucketPersistedID
– switch-case: v.setBucketIfHigher(BucketSha256/BucketPersistedId)→v.setBucketIfHigher(BucketSHA256/BucketPersistedID)• Update all usages, e.g., in router/core/graphql_prehandler.go at the setTelemetryAttributes calls (expr.BucketSha256/BucketPersistedId) to use the new names.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
router-tests/block_operations_test.go(1 hunks)router-tests/structured_logging_test.go(1 hunks)router/core/attribute_expressions.go(3 hunks)router/core/attribute_expressions_test.go(2 hunks)router/core/engine_loader_hooks.go(2 hunks)router/core/graphql_prehandler.go(22 hunks)router/core/transport.go(2 hunks)router/core/websocket.go(4 hunks)router/internal/expr/expr.go(1 hunks)router/internal/expr/request_operation_bucket_visitor.go(1 hunks)router/internal/expr/use_request_operation_sha256.go(1 hunks)router/internal/expr/uses_request_operation_sha256_test.go(1 hunks)router/internal/expr/visitor_group.go(3 hunks)router/internal/expr/visitor_group_test.go(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
PR: wundergraph/cosmo#2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.
Applied to files:
router/core/graphql_prehandler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
PR: wundergraph/cosmo#2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.
Applied to files:
router/core/graphql_prehandler.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: integration_test (./events)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
🔇 Additional comments (45)
router/core/attribute_expressions.go (6)
17-20: LGTM: Clean wrapper struct.The
ProgramWithKeystruct appropriately pairs a compiled expression program with its attribute key, enabling the bucket-based architecture.
76-96: LGTM: Bucket-based attribute resolution.The method correctly resolves attributes for a specific bucket by iterating over
ProgramWithKeyentries and resolving values via the wrappedProgram. The error handling is appropriate.
98-105: LGTM: Clean options struct.The
AddExprOptsstruct encapsulates all necessary context for attribute addition, supporting the unified telemetry path.
136-151: LGTM: Unified attribute addition with proper nil checks.The
addExpressionshelper correctly:
- Handles nil expressions early
- Resolves attributes for the specified bucket
- Applies attributes via the provided function
- Optionally updates the span
The error logging is appropriate.
107-134: No changes needed. Nil safety is ensured by theif opts.expressions == nil { return }guard, eachaddExpressionscall correctly handles its own errors without blocking the others, and using a singlecurrSpanfor both telemetry and tracing (while omitting it for metrics) is intentional.
52-74: Verified RequestOperationBucketVisitor behavior and error formatting
RequestOperationBucketVisitor correctly maps auth, sha256Hash, parsingTime, name/type, persistedId, normalizationTime, hash, validationTime, planningTime, and subgraph to their respective buckets in the intended priority order, and compilation failures produce clear messages including the attribute key, the expression, and the underlying error.router/core/attribute_expressions_test.go (1)
87-148: LGTM: Test correctly validates bucket-based expressions.The test appropriately:
- Creates attributes with and without auth access
- Verifies they're bucketed correctly (BucketDefault for "attr1", BucketAuth for "attr2")
- Resolves attributes from each bucket separately
- Validates the resolved values
The use of
assert.Conditionwith inline iteration is clear and effective.router-tests/block_operations_test.go (1)
126-147: Verified parsingTime population before condition evaluation parsingTime is set immediately after parsing in both HTTP (graphql_prehandler.go:588-589,601-602) and WebSocket handlers (websocket.go:887,891), ensuring the block condition sees a populatedparsingTime.router/internal/expr/visitor_group_test.go (1)
67-110: LGTM! Well-structured test following established patterns.The test correctly mirrors the existing
IsResponseBodyUsedInExpressionspattern and covers the key scenarios:
- Negative case (non-sha256 operation field access)
- Dot-chaining access
- Square-bracket access
- Mixed access
The parallel execution and assertions are appropriate.
router/internal/expr/uses_request_operation_sha256_test.go (1)
1-113: LGTM! Comprehensive test coverage for the new visitor.The test suite thoroughly covers:
- Nil handling
- All valid access patterns (dot, bracket, mixed)
- Negative cases (wrong property, wrong root, wrong middle segment)
- Short-circuit optimization
The test structure is clean and assertion choices are appropriate.
router/internal/expr/visitor_group.go (3)
12-12: LGTM! Constant addition follows established pattern.The new
usesRequestOperationSha256Keyconstant is correctly added to thevisitorKindiota sequence.
26-31: LGTM! Visitor correctly registered.The
UsesRequestOperationSha256visitor is properly initialized in theglobalVisitorsmap with consistent formatting.
67-73: LGTM! Public accessor follows established pattern.The new
IsRequestOperationSha256UsedInExpressions()method correctly mirrors the existing accessor pattern, including:
- Nil receiver handling (safe default: true)
- Type assertion on the visitor
- Field access to return the boolean flag
router/core/engine_loader_hooks.go (2)
7-7: LGTM! Import correctly added for bucket constant.The new import is necessary for the
expr.BucketSubgraphconstant used in the refactored expression handling.
176-205: No changes required for metric expressions
TheaddExpressionsfunction guards onopts.currSpan != nilbefore setting span attributes, so omittingcurrSpanin the metrics-only call is safe and intentional.router/internal/expr/use_request_operation_sha256.go (1)
7-63: LGTM! Well-structured visitor implementation.The visitor correctly detects
request.operation.sha256Hashreferences in the AST by:
- Early-exiting when the node is nil or the flag is already set (performance optimization)
- Validating the three-level structure: sha256Hash property → operation member → request identifier
- Handling both string and identifier forms of property access
- Using type switches defensively to ensure correct node types
The logic is clear, correct, and integrates well with the broader expression evaluation framework introduced in this PR.
router/core/websocket.go (5)
876-879: LGTM! Centralized context retrieval.The early retrieval of
reqCtxand validation ensures it's available for timing updates throughout the function, reducing redundant retrievals and improving clarity.
902-922: LGTM! Comprehensive normalization timing on all paths.Normalization timing is consistently updated across all four outcomes:
- After NormalizeOperation error (line 906)
- After NormalizeVariables error (line 914)
- After RemapVariables error (line 920)
- On success (line 932)
This ensures accurate timing metrics regardless of where normalization fails.
937-953: LGTM! Validation timing captured on both paths.Validation timing is correctly updated after
ValidateQueryComplexityerror (line 942) and afterValidateerror (line 948), as well as on success (line 953).
955-965: LGTM! Planning timing captured consistently.Planning timing is correctly updated on both error (line 960) and success (line 965) paths.
884-893: Verify ParsingTime initialization on skipParse path.No
skipParseoccurrences were found in the codebase—please manually confirm that when parsing is skipped,reqCtx.expressionContext.Request.Operation.ParsingTimeis explicitly initialized (e.g., to zero) to prevent downstream assumptions.router-tests/structured_logging_test.go (7)
1999-2035: LGTM! Test validates sha256Hash with persisted operation.The test correctly validates that
request.operation.sha256Hashexpression resolves to the SHA-256 hash of an empty string (e3b0c44...) when a persisted query is used (no body), which aligns with the expected behavior.
2037-2071: LGTM! Test validates sha256Hash without persisted operation.The test correctly validates that
request.operation.sha256Hashcomputes the hash of the query body when no persisted operation is provided. The expected hashc13e0fa...matches the SHA-256 ofquery employees { employees { id } }.
2073-2110: LGTM! Test validates persistedId expression.The test correctly validates that
request.operation.persistedIdresolves to the persisted query SHA-256 hash (dc67510...) from the extensions.
2112-2139: LGTM! Test validates parsingTime is non-zero.The test correctly validates that
request.operation.parsingTimeyields a non-zero duration, confirming that parsing timing is captured and exposed via expressions.
2141-2168: LGTM! Test validates normalizationTime is non-zero.The test correctly validates that
request.operation.normalizationTimeyields a non-zero duration, confirming that normalization timing is captured and exposed via expressions.
2170-2197: LGTM! Test validates validationTime is non-zero.The test correctly validates that
request.operation.validationTimeyields a non-zero duration, confirming that validation timing is captured and exposed via expressions.
2199-2226: LGTM! Test validates planningTime is non-zero.The test correctly validates that
request.operation.planningTimeyields a non-zero duration, confirming that planning timing is captured and exposed via expressions.router/core/transport.go (2)
138-138: LGTM! Bucket parameter added for expression resolution.The addition of
expr.BucketDefaulttoexpressionsAttributesaligns with the new bucketed expression system introduced in this PR.
490-508: LGTM! Centralized expression resolution improves maintainability.The refactor replaces inline conditional expression resolution with the unified
addExpressionsflow. ThespanAttrFuncclosure cleanly captures attributes, and usingBucketSubgraphfor both telemetry and tracing expressions in the GRPC context is appropriate.router/core/graphql_prehandler.go (12)
211-211: LGTM! Initial telemetry attributes set early in request flow.Setting telemetry attributes with
BucketDefaultearly in the handler ensures base attributes are captured before operation-specific details are available.
358-358: LGTM! Telemetry attributes refreshed after authentication.Using
BucketAuthafter authentication allows expressions to access auth-related context for telemetry/tracing.
420-422: LGTM! Conditional sha256 computation based on expression usage.The new check
IsRequestOperationSha256UsedInExpressions()ensures the expensive sha256 computation is performed when expressions referencerequest.operation.sha256Hash, improving performance when not needed.
505-507: LGTM! Sha256 hash propagated to expression context.The sha256 hash is correctly assigned to both
requestContext.operation.sha256HashandrequestContext.expressionContext.Request.Operation.Sha256Hash, and telemetry attributes are updated with the appropriateBucketSha256.
615-618: LGTM! Operation name and type propagated correctly.The operation name and type are assigned to the expression context and telemetry attributes are updated with
BucketNameOrType. This follows the pattern established for other operation fields.
672-673: LGTM! Persisted ID propagated to expression context.The persisted operation ID is correctly assigned and telemetry attributes are updated with
BucketPersistedId.
858-859: LGTM! Operation hash set after normalization.The operation hash is assigned to the expression context after successful normalization, and telemetry attributes are updated with
BucketHash. The placement after normalization is correct since the hash is computed during that phase.
966-1003: LGTM! Planning time set on both success and error paths.Planning time is correctly assigned to
expressionContext.Request.Operation.PlanningTimeon both the error path (lines 988-989) and success path (lines 1002-1003), with telemetry attributes updated usingBucketPlanningTime. TheplanCtxis properly created fromreq.Context()at line 966.
1169-1184: LGTM! Client context properly updated with individual field assignments.The
setExpressionContextClientfunction now assigns client name and version to individual fields rather than replacing the entire Client structure. This approach is cleaner and avoids potential issues with uninitialized fields.
576-603: parseCtx usage verified; no changes required.
890-947: Validation timing paths correctly implemented
validationCtxis initialized fromreq.Context()at line 890, and both query-depth and validation error branches recordvalidationTimebefore ending the span.
692-779: Normalization timing is recorded consistently on all pathsAll three error exits (lines 702–703, 734–735, 778–779) and the successful path (lines 854–856) assign
operation.normalizationTime, updateexpressionContext.Request.Operation.NormalizationTime, and invokesetTelemetryAttributes.router/internal/expr/request_operation_bucket_visitor.go (3)
37-56: LGTM: traversal/early‑exit logic.Nil guard and short‑circuit on BucketSubgraph look good; subgraph detection via Identifier/Member is clear.
74-107: LGTM: request.operation. recognition and guarding.*Parent checks ensure we only bucket true request.operation properties.
25-26: No action required: bucket ordering matches intended scheduleAttributeBucket enum confirms the intended evaluation sequence—auth < sha256 < parsingTime < name/type < persistedId < normalizationTime < hash < validationTime < planningTime < subgraph.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
router/core/graphql_prehandler.go (1)
867-869: Set span attributes before ending the span.You call engineNormalizeSpan.End() and then set attributes on it. Attributes set after End are dropped.
- engineNormalizeSpan.End() - - if operationKit.parsedOperation.IsPersistedOperation { - engineNormalizeSpan.SetAttributes(otel.WgEnginePersistedOperationCacheHit.Bool(operationKit.parsedOperation.PersistedOperationCacheHit)) - } + if operationKit.parsedOperation.IsPersistedOperation { + engineNormalizeSpan.SetAttributes(otel.WgEnginePersistedOperationCacheHit.Bool(operationKit.parsedOperation.PersistedOperationCacheHit)) + } + engineNormalizeSpan.End()Also applies to: 865-866
🧹 Nitpick comments (8)
router-tests/structured_logging_test.go (5)
1999-2001: Rename to reflect scenario (no body present).The test name says “with persisted hash and body” but the request includes only a persisted hash (no query body). Rename to avoid confusion or include a body if you intend to test the hash-vs-body validation path.
- t.Run("validate request.operation.sha256Hash expression with persisted hash and body", func(t *testing.T) { + t.Run("validate request.operation.sha256Hash expression with persisted hash (no body)", func(t *testing.T) {
2135-2137: Prefer direct duration comparisons.Use require.Greater(t, val, time.Duration(0)) instead of casting to int to avoid needless conversions and potential 32-bit overflow pitfalls.
- val, ok := requestContext["parsing_time_expression"].(time.Duration) + val, ok := requestContext["parsing_time_expression"].(time.Duration) require.True(t, ok) - require.Greater(t, int(val), 0) + require.Greater(t, val, time.Duration(0))Apply the same pattern for normalizationTime, validationTime, planningTime, fetchDuration, and connAcquireDuration assertions in these blocks.
Also applies to: 2164-2166, 2193-2195, 2222-2224, 3127-3130, 3162-3165, 3209-3211, 3240-3244, 3277-3279, 3322-3325
3157-3165: Avoid index-based assumptions for subgraph log order.Relying on requestLog.All()[0]/[1] can flake under concurrency. Select entries by stable fields (e.g., subgraph_name or subgraph_id) instead of array position.
- employeeSubgraphLogs := requestLogAll[0] + employeeSubgraphLogs := findBy(requestLogAll, "subgraph_name", "employees")If helpful, I can provide a small helper to search ObservedLogs by context key.
Also applies to: 3208-3212
2771-2773: Brittle span-count assertions.Asserting an exact span count (9, 0) is fragile across innocuous instrumentation changes. Prefer asserting sampled flag correctness and that “at least one span exists” (or use >= thresholds).
- require.Len(t, sn, 9) + require.GreaterOrEqual(t, len(sn), 1)And for the non-sampled case, assert zero or near-zero only if guaranteed by the exporter config.
Also applies to: 2812-2814
2843-2846: Compare JSON bodies semantically, not as raw strings.Raw string equality on JSON can break due to spacing/ordering changes. Parse and assert fields or use require.JSONEq where applicable.
- responseBody := requestContextMap["response_body"].(string) - require.Equal(t, `{"data":{"employees":[{"id":1},{"id":2},{"id":3},{"id":4},{"id":5},{"id":7},{"id":8},{"id":10},{"id":11},{"id":12}]}}`, responseBody) + responseBody := requestContextMap["response_body"].(string) + require.JSONEq(t, `{"data":{"employees":[{"id":1},{"id":2},{"id":3},{"id":4},{"id":5},{"id":7},{"id":8},{"id":10},{"id":11},{"id":12}]}}`, responseBody)Same idea for the other response_body checks in this block.
Also applies to: 2873-2876, 3395-3418
router/internal/expr/request_operation_bucket_visitor.go (1)
36-47: Use constants for keys to prevent drift.“subgraph”, “auth”, and “operation” are string literals. If constants exist (e.g., ExprSubgraphKey, ExprRequestAuthKey, ExprRequestOperationKey), prefer them to keep tags and code in sync.
- if ident, ok := (*baseNode).(*ast.IdentifierNode); ok { - if ident.Value == "subgraph" { + if ident, ok := (*baseNode).(*ast.IdentifierNode); ok { + if ident.Value == ExprSubgraphKey { ... - if ident, ok := member.Node.(*ast.IdentifierNode); ok && ident.Value == "subgraph" { + if ident, ok := member.Node.(*ast.IdentifierNode); ok && ident.Value == ExprSubgraphKey { ... - if prop == "auth" { + if prop == ExprRequestAuthKey { ... - if opProp != "operation" { + if opProp != ExprRequestOperationKey {If these constants don’t exist yet, consider adding them next to ExprRequestKey. As per coding guidelines.
Also applies to: 57-63, 70-73
router/core/graphql_prehandler.go (2)
505-508: Avoid unnecessary expression evaluation when bucket unused.setTelemetryAttributes is invoked many times. Guard with a fast “bucket used?” check to skip work when no expressions depend on that stage.
- setTelemetryAttributes(req.Context(), requestContext, expr.BucketSha256) + if h.exprManager != nil && h.exprManager.VisitorManager.IsBucketUsed(expr.BucketSha256) { + setTelemetryAttributes(req.Context(), requestContext, expr.BucketSha256) + }Apply similarly for other buckets (Default/Auth/Parsing/Normalization/Hash/Validation/Planning).
Also applies to: 591-604, 703-704, 735-736, 856-860, 909-910, 947-948, 989-990, 1003-1004, 211-212, 358-359
865-869: Optional: also attach normalization cache hit on the router span for correlation.You set WgNormalizationCacheHit on the span; consider also mirroring it on the root/router span attributes if that’s useful in logs/metrics correlation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
router-tests/block_operations_test.go(1 hunks)router-tests/structured_logging_test.go(1 hunks)router-tests/telemetry/telemetry_test.go(3 hunks)router/core/attribute_expressions.go(1 hunks)router/core/attribute_expressions_test.go(3 hunks)router/core/graphql_prehandler.go(22 hunks)router/internal/expr/expr.go(1 hunks)router/internal/expr/request_operation_bucket_visitor.go(1 hunks)router/internal/expr/request_operation_bucket_visitor_test.go(1 hunks)router/internal/expr/use_request_operation_sha256.go(1 hunks)router/internal/expr/uses_request_operation_sha256_test.go(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- router/internal/expr/uses_request_operation_sha256_test.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
PR: wundergraph/cosmo#2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.
Applied to files:
router/core/graphql_prehandler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
PR: wundergraph/cosmo#2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.
Applied to files:
router/core/graphql_prehandler.go
🧬 Code graph analysis (8)
router/internal/expr/request_operation_bucket_visitor_test.go (2)
router/internal/expr/request_operation_bucket_visitor.go (13)
AttributeBucket(8-8)BucketDefault(11-11)BucketAuth(12-12)BucketSha256(13-13)BucketParsingTime(14-14)BucketNameOrType(15-15)BucketPersistedID(16-16)BucketNormalizationTime(17-17)BucketHash(18-18)BucketValidationTime(19-19)BucketPlanningTime(20-20)BucketSubgraph(21-21)RequestOperationBucketVisitor(27-29)router/internal/expr/expr_manager.go (1)
CreateNewExprManager(19-23)
router/core/attribute_expressions_test.go (2)
router/internal/expr/request_operation_bucket_visitor.go (3)
RequestOperationBucketVisitor(27-29)BucketAuth(12-12)BucketDefault(11-11)router/internal/expr/expr_manager.go (1)
CreateNewExprManager(19-23)
router-tests/block_operations_test.go (2)
router-tests/testenv/testenv.go (4)
Run(107-124)Config(286-342)Environment(1729-1765)GraphQLRequest(1905-1913)router/pkg/config/config.go (3)
Config(996-1070)SecurityConfiguration(408-418)BlockOperationConfiguration(403-406)
router/core/attribute_expressions.go (4)
router/internal/expr/request_operation_bucket_visitor.go (2)
AttributeBucket(8-8)RequestOperationBucketVisitor(27-29)router/pkg/config/config.go (1)
CustomAttribute(43-47)router/internal/expr/expr.go (1)
Context(35-39)router/internal/expr/resolvers.go (1)
ResolveStringExpression(22-34)
router/internal/expr/request_operation_bucket_visitor.go (1)
router/internal/expr/expr.go (1)
ExprRequestKey(31-31)
router-tests/structured_logging_test.go (2)
router-tests/testenv/testenv.go (5)
Run(107-124)Config(286-342)LogObservationConfig(388-391)Environment(1729-1765)GraphQLRequest(1905-1913)router/pkg/config/config.go (3)
Config(996-1070)CustomAttribute(43-47)CustomDynamicAttribute(36-41)
router/core/graphql_prehandler.go (2)
router/internal/expr/expr.go (4)
Context(35-39)Request(66-75)Operation(81-91)Client(93-97)router/internal/expr/request_operation_bucket_visitor.go (10)
BucketDefault(11-11)BucketAuth(12-12)BucketSha256(13-13)BucketParsingTime(14-14)BucketNameOrType(15-15)BucketPersistedID(16-16)BucketNormalizationTime(17-17)BucketHash(18-18)BucketValidationTime(19-19)BucketPlanningTime(20-20)
router-tests/telemetry/telemetry_test.go (3)
router-tests/testenv/testenv.go (4)
Run(107-124)Config(286-342)Environment(1729-1765)GraphQLRequest(1905-1913)router/pkg/trace/tracetest/tracetest.go (1)
NewInMemoryExporter(11-17)router/pkg/config/config.go (3)
Config(996-1070)CustomAttribute(43-47)CustomDynamicAttribute(36-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: Analyze (go)
🔇 Additional comments (20)
router-tests/block_operations_test.go (1)
126-147: LGTM! Clean test for operation-timing expression support.The test correctly verifies that
parsingTimeis available in the expression context and can be used for blocking logic. The expressionparsingTime.Nanoseconds() > 0will always evaluate to true for successfully parsed operations, which is the intended behavior to verify the timing attribute is populated and accessible.Minor note: Good use of
require.JSONEqat line 145 for robust JSON comparison.router/internal/expr/request_operation_bucket_visitor_test.go (4)
16-317: Excellent test coverage for bucket classification.The test suite comprehensively validates bucket assignment across all priority levels (Default → Subgraph) and covers edge cases including bracket notation, conditionals, and complex expressions. The priority tests correctly verify that higher-priority attributes win when multiple are present.
319-347: LGTM!Clean helper function for readable test output. All bucket types are covered.
349-366: LGTM!Good defensive test to catch accidental changes to the bucket priority ordering defined by the iota constants.
368-421: LGTM!Properly validates the
setBucketIfHigherlogic across all key scenarios, including the special case where Subgraph (highest priority) always wins.router/internal/expr/use_request_operation_sha256.go (2)
7-10: LGTM!Good practice using constants for attribute names to avoid magic strings.
17-68: LGTM!The visitor correctly detects
request.operation.sha256Hashby walking the AST member access chain. Both dot and bracket notation are properly supported, and the early-exit optimization avoids redundant work.router/core/attribute_expressions.go (6)
4-7: LGTM!New imports appropriately support the telemetry integration paths introduced in this refactor.
16-19: LGTM!The
ProgramWithKeywrapper cleanly couples compiled expressions with their attribute keys, enabling the new bucket-based storage model.
21-24: LGTM!The refactored storage using a single bucket-indexed map simplifies the design and aligns with the new bucketed evaluation model.
26-48: LGTM!The refactored constructor correctly integrates the
RequestOperationBucketVisitorto classify expressions during compilation, storing them in the appropriate bucket.
50-70: LGTM!The updated API correctly accepts a bucket parameter and resolves only expressions associated with that bucket, maintaining proper error handling.
72-125: LGTM!The new telemetry integration functions provide a clean, unified path for resolving and attaching attributes. The
AddExprOptsstruct effectively bundles parameters, andaddExpressionseliminates duplication across telemetry types.router/internal/expr/expr.go (1)
81-91: LGTM!The expanded
Operationstruct correctly exposes all new attributes (sha256Hash, timing fields, persistedId) with appropriate types and expr tags, aligning with the bucket-based expression model and test coverage.router/core/attribute_expressions_test.go (2)
13-85: LGTM!Test correctly updated to use the new
RequestOperationBucketVisitorand validates bucket assignment instead of the oldHasAuthflag.
87-148: LGTM!Test correctly validates that expressions are bucketed appropriately (non-auth in
BucketDefault, auth inBucketAuth) and that resolution works correctly when called with explicit bucket parameters.router/internal/expr/request_operation_bucket_visitor.go (1)
31-42: Bucketization logic and priority look solid.Traversal handles subgraph/auth/operation.* correctly and honors priority via setBucketIfHigher; early exit on BucketSubgraph avoids extra work.
Also applies to: 44-50, 57-63, 70-77, 80-98
router/core/graphql_prehandler.go (3)
519-526: Persisted hash vs. body validation in the correct place.The sha256 check against the provided body happens early in handleOperation, before APQ logic, matching prior design notes. Good placement.
Based on learnings
305-307: Body capture gated correctly.request.body.raw is populated only when expressions require it, reducing overhead and PII exposure. Nice.
Also applies to: 274-276
860-866: No SetAttributes-after-End patterns detected—verification complete.
…ity-to-add-sha256-to-traces
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
router-tests/telemetry/telemetry_test.go (2)
9612-10138: Comprehensive dynamic request.operation attribute testsGreat coverage across sha256Hash, name/type, persistedId, and timing fields. The skip list approach with name+kind is precise, and asserting times > 0 matches the repo preference for these telemetry tests. Consider a small follow‑up to de‑duplicate the repeated pattern via a table‑driven test harness around validateDetectedSpans, but not required.
Based on learnings
11100-11120: Tighten helpers: mark as test helper, drop no‑op, avoid pointer returnSmall cleanups improve readability and failure reporting:
- Add t.Helper() to validateDetectedSpans so require/asserter point to the caller.
- Remove the no‑op snapshot.SpanKind() statement.
- Return attribute.Value by value from getAttributeFromKey and pass it directly to validateFunc, avoiding transient pointer usage.
Apply this diff:
-func validateDetectedSpans(t *testing.T, sn []sdktrace.ReadOnlySpan, key string, skipSpans []spanEntry, validateFunc func(value attribute.Value)) int { - var detectedSpanCount int - - for _, snapshot := range sn { - attributes := snapshot.Attributes() - snapshot.SpanKind() - - spanSearchEntry := spanEntry{name: snapshot.Name(), spanKind: snapshot.SpanKind()} - value, ok := getAttributeFromKey(attributes, key) +func validateDetectedSpans(t *testing.T, sn []sdktrace.ReadOnlySpan, key string, skipSpans []spanEntry, validateFunc func(value attribute.Value)) int { + t.Helper() + var detectedSpanCount int + + for _, snapshot := range sn { + attributes := snapshot.Attributes() + + spanSearchEntry := spanEntry{name: snapshot.Name(), spanKind: snapshot.SpanKind()} + value, ok := getAttributeFromKey(attributes, key) if slices.Contains(skipSpans, spanSearchEntry) { require.False(t, ok) continue } require.True(t, ok) - validateFunc(*value) + validateFunc(value) detectedSpanCount++ } return detectedSpanCount } -func getAttributeFromKey(attrs []attribute.KeyValue, key string) (*attribute.Value, bool) { +func getAttributeFromKey(attrs []attribute.KeyValue, key string) (attribute.Value, bool) { for _, attr := range attrs { if string(attr.Key) == key { - return &attr.Value, true + return attr.Value, true } } - return nil, false + return attribute.Value{}, false }Also applies to: 11122-11129
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
router-tests/telemetry/telemetry_test.go(3 hunks)router/core/graphql_prehandler.go(23 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
PR: wundergraph/cosmo#2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router-tests/telemetry/telemetry_test.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
PR: wundergraph/cosmo#2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.
Applied to files:
router/core/graphql_prehandler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
PR: wundergraph/cosmo#2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.
Applied to files:
router/core/graphql_prehandler.go
🔇 Additional comments (10)
router/core/graphql_prehandler.go (8)
217-217: LGTM! Telemetry attribute evaluation at key request stages.The centralized calls to
setTelemetryAttributesat the unauthenticated and authenticated stages align with the PR's objective of evaluating expressions at multiple points in the request lifecycle.Also applies to: 364-364
583-615: LGTM! Parsing stage telemetry is correctly instrumented.The parse stage properly:
- Creates a stage-specific context (
parseCtx) for the span- Sets
ParsingTimein the expression context in both error and success paths- Calls
setTelemetryAttributeswith the stage context andBucketParsingTimeThis pattern enables expressions that depend on parsing time to be evaluated as soon as the timing is available.
684-685: LGTM! Persisted operation ID is correctly propagated to expressions.The persisted operation ID is set in the expression context and
BucketPersistedIDexpressions are evaluated only when a persisted query hash exists, which is the correct behavior.
704-791: LGTM! Normalization stage telemetry is comprehensively instrumented.The normalization stage correctly:
- Creates a stage-specific context (
normalizeCtx)- Sets
NormalizationTimein all error paths (lines 714-715, 746-747, 790-791) and the success path (lines 867-868)- Sets the operation
Hashafter normalization completes (line 870)- Calls
setTelemetryAttributeswith the appropriate buckets (BucketNormalizationTimeandBucketHash)All code paths ensure that expressions depending on normalization timing can be evaluated as soon as the timing is available.
Also applies to: 867-871, 886-886
903-903: LGTM! Validation stage telemetry is correctly instrumented.The validation stage properly:
- Creates a stage-specific context (
validationCtx)- Sets
ValidationTimein both error paths (lines 921-922, 938-939) and the success path (lines 959-960)- Calls
setTelemetryAttributeswithvalidationCtxandBucketValidationTimeThis ensures expressions that depend on validation time can be evaluated consistently across all code paths.
Also applies to: 921-922, 938-939, 959-960
979-979: LGTM! Planning stage telemetry is correctly instrumented.The planning stage properly:
- Creates a stage-specific context (
planCtx)- Sets
PlanningTimein both the error path (lines 1001-1002) and success path (lines 1015-1016)- Calls
setTelemetryAttributeswithplanCtxandBucketPlanningTimeThis completes the comprehensive telemetry instrumentation across all operation processing stages.
Also applies to: 1001-1002, 1015-1016
426-428: Add nil-safety check for exprManager.The code calls
h.exprManager.VisitorManager.IsRequestOperationSha256UsedInExpressions()without verifying thath.exprManagerorh.exprManager.VisitorManageris non-nil. If these are guaranteed to be initialized, this is safe; otherwise, a nil pointer dereference could occur.Run the following script to check if there are nil checks elsewhere or if initialization is guaranteed:
627-632: Verify that client info doesn't need to be available for BucketNameOrType expressions.The operation name and type are set (lines 627-628) and
BucketNameOrTypeexpressions are evaluated (line 630) before client information is populated viasetExpressionContextClient(line 632). If any expressions inBucketNameOrTypeneed access to client name/version, they would see empty values.Run the following script to check if client fields are used in expressions that might be evaluated at the NameOrType stage:
router-tests/telemetry/telemetry_test.go (2)
45-48: Helper for span filtering looks goodMinimal, clear struct to match skip targets by name and kind. No issues.
10141-10142: Parallelization OKUsing t.Parallel at the suite level here is fine; no changes needed.
What happens now
What this PR does
Summary by CodeRabbit
New Features
Refactor
Tests
Checklist