stats/opentelemetry: record retry attempts from clientStream #8342
eshitachandwani merged 56 commits into grpc:master
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #8342      +/-   ##
==========================================
- Coverage   81.64%   81.50%   -0.15%
==========================================
  Files         413      413
  Lines       40621    40693      +72
==========================================
- Hits        33167    33166       -1
- Misses       5991     6000       +9
- Partials     1463     1527      +64
purnesh42H
left a comment
I remember we had separate tests for retries. This change should only affect those tests; it shouldn't affect tests that make only a single attempt.
purnesh42H
left a comment
@vinothkumarr227 have you tested this to ensure it's working correctly? I was under the impression that TestTraceSpan_WithRetriesAndNameResolutionDelay would need changes to its expected values. I think it's not working as expected because you are not setting the count back on the ctx after incrementing it.
dfawley
left a comment
Sorry for the delays in review here. LGTM after this one small change.
      target:              cc.CanonicalTarget(),
      method:              determineMethod(method, opts...),
    + previousRPCAttempts: new(atomic.Uint32),
Let's make this a non-pointer type; then we don't need the `new` here and there's no chance of nil panics.
      target:              cc.CanonicalTarget(),
      method:              determineMethod(method, opts...),
    - previousRPCAttempts: new(atomic.Uint32),
    + previousRPCAttempts: atomic.Uint32{},
Please delete this line. This is the zero value so it doesn't need explicit initialization.
…8342)" (#8571)

This introduced flakiness in a test: Test/TraceSpan_WithRetriesAndNameResolutionDelay
Failure: https://github.com/grpc/grpc-go/actions/runs/17614152882/job/50042942932?pr=8547
Related issue: #8299
RELEASE NOTES: None
…tionDelay

This commit fixes the flaky test TestTraceSpan_WithRetriesAndNameResolutionDelay, which was introduced in PR grpc#8342 and caused that PR to be reverted.

Root Cause: The test had race conditions related to timing:
1. The goroutine that updates resolver state could complete before or after the delayed resolution event was fully processed and recorded in spans
2. Span export timing was not synchronized with test validation, causing the test to sometimes check spans before they were fully exported

Fix:
1. Added 'stateUpdated' event to synchronize between the resolver state update completing and span validation beginning
2. Added explicit wait for the stateUpdated event before validating spans
3. Added a 50ms sleep after RPC completion to give the span exporter time to process and export all spans before validation

Testing:
- Test now passes consistently (10+ consecutive runs)
- Passes with race detector enabled (-race flag)
- No data races detected

Fixes grpc#8700
Fixes: grpc#8299

RELEASE NOTES:
- stats/opentelemetry: Retry attempts (`grpc.previous-rpc-attempts`) are now recorded as span attributes for non-transparent client retries.
…tionDelay

This commit fixes the flaky test TestTraceSpan_WithRetriesAndNameResolutionDelay, which was introduced in the previous commit and caused PR grpc#8342 to be reverted.

Root Cause: The test had race conditions related to timing:
1. The goroutine that updates resolver state could complete before or after the delayed resolution event was fully processed and recorded in spans
2. Span export timing was not synchronized with test validation, causing the test to sometimes check spans before they were fully exported

Fix:
1. Added 'stateUpdated' event to synchronize between the resolver state update completing and span validation beginning
2. Added explicit wait for the stateUpdated event before validating spans
3. Added a 50ms sleep after RPC completion to give the span exporter time to process and export all spans before validation

Testing:
- Test now passes consistently (10+ consecutive runs)
- Passes with race detector enabled (-race flag)
- No data races detected

Fixes grpc#8700
Fixes: #8299
RELEASE NOTES:
- stats/opentelemetry: Retry attempts (`grpc.previous-rpc-attempts`) are now recorded as span attributes for non-transparent client retries.