fix: improve router.http.request.duration_milliseconds#2564
fix: improve router.http.request.duration_milliseconds#2564
Conversation
…outer_http_request_duration-and-other-request
…tion-and-other-request
…tion-and-other-request
…request_duration-and-other-request' into ale/eng-8915-router-router_http_request_duration-and-other-request
…tion-and-other-request
…outer_http_request_duration-and-other-request
…outer_http_request_duration-and-other-request
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: Organization UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
WalkthroughAdds an integration test for Prometheus metrics of parallel subgraph durations and refactors engine loader hooks to use fetch-timing from context instead of a stored start time for latency reporting. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
Comment |
|
❌ Internal Query Planner CI failed to run. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2564 +/- ##
===========================================
+ Coverage 45.74% 63.52% +17.78%
===========================================
Files 1035 251 -784
Lines 139075 26759 -112316
Branches 8631 0 -8631
===========================================
- Hits 63613 16998 -46615
+ Misses 73735 8400 -65335
+ Partials 1727 1361 -366
🚀 New features to boost your workflow:
|
…outer_http_request_duration-and-other-request
There was a problem hiding this comment.
🧹 Nitpick comments (1)
router/core/engine_loader_hooks.go (1)
55-57: Remove unusedstartTimefield andengineLoaderHooksRequestContextstruct—both are dead code from the refactoring to useFetchTimingKey.The
startTimefield is declared but never populated (line 117 creates an empty struct) or accessed. More significantly, the entireengineLoaderHooksRequestContextstruct is unused—it's stored in the context at line 117 but never retrieved anywhere in the codebase. The latency measurement now relies entirely onFetchTimingKey(lines 102–103 and 167–173), making this struct and its associated context keyEngineLoaderHooksContextKeyredundant.♻️ Proposed cleanup
Remove the struct and its context key registration:
-type engineLoaderHooksRequestContext struct { - startTime time.Time -}And update line 117:
- return context.WithValue(ctx, rcontext.EngineLoaderHooksContextKey, &engineLoaderHooksRequestContext{}) + return ctxAlso remove the
EngineLoaderHooksContextKeyfromrouter/internal/context/keys.goif it has no other usages.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router/core/engine_loader_hooks.go` around lines 55 - 57, Remove the dead engineLoaderHooksRequestContext struct and its associated context key EngineLoaderHooksContextKey: delete the type declaration for engineLoaderHooksRequestContext, remove the code that stores an empty instance into context (the place creating/storing engineLoaderHooksRequestContext in request context), and remove the EngineLoaderHooksContextKey from router/internal/context/keys.go; ensure any latency logic continues to use FetchTimingKey (do not change FetchTimingKey usages) and run a project-wide search to confirm no remaining references to engineLoaderHooksRequestContext or EngineLoaderHooksContextKey before committing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@router/core/engine_loader_hooks.go`:
- Around line 55-57: Remove the dead engineLoaderHooksRequestContext struct and
its associated context key EngineLoaderHooksContextKey: delete the type
declaration for engineLoaderHooksRequestContext, remove the code that stores an
empty instance into context (the place creating/storing
engineLoaderHooksRequestContext in request context), and remove the
EngineLoaderHooksContextKey from router/internal/context/keys.go; ensure any
latency logic continues to use FetchTimingKey (do not change FetchTimingKey
usages) and run a project-wide search to confirm no remaining references to
engineLoaderHooksRequestContext or EngineLoaderHooksContextKey before
committing.
…tion-and-other-request
…request_duration-and-other-request' into ale/eng-8915-router-router_http_request_duration-and-other-request
…tion-and-other-request
…tion-and-other-request
| if fetchTiming, ok := value.(*atomic.Int64); ok { | ||
| exprCtx.Subgraph.Request.ClientTrace.FetchDuration = time.Duration(fetchTiming.Load()) | ||
| fetchLatency = time.Duration(fetchTiming.Load()) | ||
| exprCtx.Subgraph.Request.ClientTrace.FetchDuration = fetchLatency |
There was a problem hiding this comment.
This measures the low level round trip, whereas the current measurement should measure this plus the engine hook logic that would add to the request duration.
However more importantly, would it work with GRPC data sources?
There was a problem hiding this comment.
Good catch! It would not work with gRPC data sources.
…tion-and-other-request
…tion-and-other-request
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
When a query plan makes parallel fetches, the metric router.http.request.duration_milliseconds is recorded incorrectly: it reported as latency the time of the slower fetch. With this change I set as duration the timing that I get from the engine for the fetch, that doesn't have this issue.
Summary by CodeRabbit
Tests
Bug Fixes
Checklist