stats/opencensus: Fix flaky metrics test #6372
Thanks @zasweq
lgtm (modulo one minor change).
stats/opencensus/e2e_test.go
Outdated
// appear for server completed RPC's view (by checking for length of rows to be
// 2). Returns an error if both the Unary and Streaming metric not found within
// the passed context's timeout.
func waitForServerCompletedRPCs(ctx context.Context, fe *fakeExporter) error {
Unused function argument: you may want to remove fe *fakeExporter from the waitForServerCompletedRPCs function.
Ah, great catch. Deleted.
minor comments
stats/opencensus/e2e_test.go
Outdated
if err != nil {
	continue
}
if len(rows) == 2 {
Instead of a len check here, should we implicitly check for 1 Unary and 1 Streaming RPC metric?
Explicitly*. But sure.
stats/opencensus/e2e_test.go
Outdated
m := make(map[string]bool)
for _, row := range rows {
	for _, tag := range row.Tags {
		m[tag.Value] = true
	}
}
if m["grpc.testing.TestService/UnaryCall"] && m["grpc.testing.TestService/FullDuplexCall"] {
	return nil
}
Thanks @zasweq, but please note that this new implementation no longer enforces your initial intent of having the expected metrics in 2 different rows. I don't know if that makes a difference.
May I also suggest returning early once both tags are found, without looping through the entire slices (the looping is not expensive anyway, since the test data set here is small), with something similar to:
unaryMetricFound := false
streamingMetricFound := false
for _, row := range rows {
	for _, tag := range row.Tags {
		if tag.Value == "grpc.testing.TestService/UnaryCall" {
			unaryMetricFound = true
		} else if tag.Value == "grpc.testing.TestService/FullDuplexCall" {
			streamingMetricFound = true
		}
		if unaryMetricFound && streamingMetricFound {
			return nil
		}
	}
}
or (if the metrics are expected to be in 2 different rows)
unaryMetricFound := false
streamingMetricFound := false
for _, row := range rows {
	for _, tag := range row.Tags {
		if tag.Value == "grpc.testing.TestService/UnaryCall" {
			unaryMetricFound = true
			break
		} else if tag.Value == "grpc.testing.TestService/FullDuplexCall" {
			streamingMetricFound = true
			break
		}
	}
	if unaryMetricFound && streamingMetricFound {
		return nil
	}
}
This is verified right after: the declared want contains only the two rows, declared separately, and is checked with cmp.Diff. I went ahead and switched to your second solution though.
Thanks for the suggestion.
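For readers following this thread, the difference between the two suggested loops can be illustrated with a small standalone sketch. It is not the code from the PR: `row` and `tagKV` are hypothetical stand-ins for the OpenCensus row and tag types, and the function names `seenAnywhere` and `seenInSeparateRows` are made up for illustration.

```go
package main

import "fmt"

// tagKV and row are hypothetical stand-ins for the tag and row types
// used in the test; only the fields needed here are modeled.
type tagKV struct{ Key, Value string }
type row struct{ Tags []tagKV }

const (
	unaryMethod     = "grpc.testing.TestService/UnaryCall"
	streamingMethod = "grpc.testing.TestService/FullDuplexCall"
)

// seenAnywhere mirrors the first suggestion: it succeeds as soon as both
// method tags have been seen, even if they sit in the same row.
func seenAnywhere(rows []row) bool {
	unary, streaming := false, false
	for _, r := range rows {
		for _, t := range r.Tags {
			if t.Value == unaryMethod {
				unary = true
			} else if t.Value == streamingMethod {
				streaming = true
			}
			if unary && streaming {
				return true
			}
		}
	}
	return false
}

// seenInSeparateRows mirrors the second suggestion: the break means each
// row contributes at most one match, so two rows are needed.
func seenInSeparateRows(rows []row) bool {
	unary, streaming := false, false
	for _, r := range rows {
		for _, t := range r.Tags {
			if t.Value == unaryMethod {
				unary = true
				break
			} else if t.Value == streamingMethod {
				streaming = true
				break
			}
		}
		if unary && streaming {
			return true
		}
	}
	return false
}

func main() {
	// One row carrying both method tags (not expected in the real test).
	merged := []row{{Tags: []tagKV{
		{Key: "method", Value: unaryMethod},
		{Key: "method", Value: streamingMethod},
	}}}
	// Two rows, one per method, as the test expects.
	separate := []row{
		{Tags: []tagKV{{Key: "method", Value: unaryMethod}}},
		{Tags: []tagKV{{Key: "method", Value: streamingMethod}}},
	}
	fmt.Println(seenAnywhere(merged), seenInSeparateRows(merged))
	fmt.Println(seenAnywhere(separate), seenInSeparateRows(separate))
}
```

With these inputs the program prints `true false` and then `true true`: only the second loop rejects a single row that happens to carry both tags.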
stats/opencensus/e2e_test.go
Outdated
@@ -237,6 +237,36 @@ func distributionDataLatencyCount(vi *viewInformation, countWant int64, wantTags
	return nil
}

// waitForServerCompletedRPCs waits until both Unary and Streaming metric rows
// appear, in two seperate rows, for server completed RPC's view. Returns an
- // appear, in two seperate rows, for server completed RPC's view. Returns an
+ // appear, in two separate rows, for server completed RPC's view. Returns an
Minor typo. Should fix the test failure below. LGTM otherwise
Fixes #6231.
This adds a sync point between the test and the server-side recording of the completed RPCs metric for Unary and Streaming RPCs. The test simply waits for the RPC to finish on the client side, but for both the Unary and Streaming cases stats.End is recorded in a defer, after the status is written to the wire. Thus, previously there was no sync point between the test and this metric being recorded. The sync is done at the global view level, not at the exporter level: the views are synced with the exporter by unregistering them, and metrics stop being recorded after that, so the test has to wait for the two row emissions (Unary and Streaming) at the global view level.
Verified that this passes over 10k runs on Forge. Previously it was failing 18 out of 10k runs on Forge.
RELEASE NOTES: N/A