Don't rely on test code execution time span for RemoteSegmentTransferTrackerTests #15187
Please add |
Idea for possible future improvement: We could remove all direct calls to If there is an agreement that this would be beneficial then we can open a new ticket. |
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #15187 +/- ##
============================================
+ Coverage 71.90% 71.92% +0.01%
- Complexity 63033 63114 +81
============================================
Files 5197 5197
Lines 295313 295313
Branches 42677 42677
============================================
+ Hits 212354 212390 +36
- Misses 65552 65607 +55
+ Partials 17407 17316 -91
server/src/test/java/org/opensearch/index/remote/RemoteSegmentTransferTrackerTests.java
Thanks for raising a fix for this flaky test @lukas-vlcek, seems like a hot one.
As of now we have always relied on directly using the |
510c03a to 18dbf31
❌ Gradle check result for 510c03a: UNSTABLE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 18dbf31: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❕ Gradle check result for d1cc5c7: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
server/src/test/java/org/opensearch/index/remote/RemoteSegmentTransferTrackerTests.java
Mostly changes look good. One thing is we should run multiple iterations locally to make sure we have not regressed |
The current implementation of the [`RemoteSegmentTransferTrackerTests.testComputeTimeLagOnUpdate()`](https://github.com/opensearch-project/OpenSearch/blob/2b17902643738f0d2a75ade7c85cbca94d18ce49/server/src/test/java/org/opensearch/index/remote/RemoteSegmentTransferTrackerTests.java#L139) test relies on assumptions about how fast the test code will finish in the JVM. Moreover, it does not precisely control the boundaries of the time span, specifically the start of the span, because that is determined by the internal implementation of [`RemoteSegmentTransferTracker.getTimeMsLag()`](https://github.com/opensearch-project/OpenSearch/blob/2b17902643738f0d2a75ade7c85cbca94d18ce49/server/src/main/java/org/opensearch/index/remote/RemoteSegmentTransferTracker.java#L262), which indirectly calls `System.nanoTime()`.

This commit loosens the assumption that the test code execution will finish within +/-20ms. Instead, it only assumes that the execution time span won't be shorter than the predefined (and controlled) thread sleep interval; any larger value is considered a success.

The point of this test is not to verify execution speed with a defined precision. Instead, the point is that the [`getTimeMsLag()`](https://github.com/opensearch-project/OpenSearch/blob/2b17902643738f0d2a75ade7c85cbca94d18ce49/server/src/main/java/org/opensearch/index/remote/RemoteSegmentTransferTracker.java#L262) method returns either 0 (for specific conditions) or a positive number (assuming that `remoteRefreshStartTimeMs` is not greater than `System.nanoTime()`).

Closes: opensearch-project#14325

Signed-off-by: Lukáš Vlček <[email protected]>
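To make the lower-bound idea from the commit message concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `LagTracker`, `markStart()`, and `measureLagAfterSleep()` are hypothetical names standing in for the real tracker, and the "returns 0" condition (no start time recorded yet) is an assumption chosen only for illustration. The check accepts any lag that is at least the controlled sleep interval instead of requiring the lag to fall within a fixed tolerance.

```java
import java.util.concurrent.TimeUnit;

public class TimeLagLowerBoundSketch {

    /** Hypothetical, simplified stand-in for a tracker that derives lag from System.nanoTime(). */
    static final class LagTracker {
        private long startTimeMs = -1; // -1 means no start time has been recorded yet

        void markStart() {
            startTimeMs = nowMs();
        }

        /** Returns 0 when nothing was started (assumed condition), otherwise the elapsed milliseconds. */
        long getTimeMsLag() {
            if (startTimeMs < 0) {
                return 0;
            }
            return nowMs() - startTimeMs;
        }

        private static long nowMs() {
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
        }
    }

    /** Starts the tracker, sleeps for the controlled interval, and returns the observed lag. */
    static long measureLagAfterSleep(long sleepMs) throws InterruptedException {
        LagTracker tracker = new LagTracker();
        tracker.markStart();
        Thread.sleep(sleepMs);
        return tracker.getTimeMsLag();
    }

    public static void main(String[] args) throws InterruptedException {
        long sleepMs = 50;
        long lag = measureLagAfterSleep(sleepMs);
        // Lower bound only: a slow or busy JVM makes the lag larger, which still passes.
        if (lag < sleepMs) {
            throw new AssertionError("lag " + lag + "ms is shorter than the sleep interval " + sleepMs + "ms");
        }
        System.out.println("observed lag in ms: " + lag);
    }
}
```

The design choice is that an upper bound on elapsed time depends on scheduler and GC behavior the test cannot control, while the lower bound depends only on the sleep interval the test sets itself.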
@linuxpi As for |
❕ Gradle check result for 6e191a3: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
…TrackerTests (#15187) Signed-off-by: Lukáš Vlček <[email protected]> (cherry picked from commit ef1a79f) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…TrackerTests (#15187) (#15244) (cherry picked from commit ef1a79f) Signed-off-by: Lukáš Vlček <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…TrackerTests (opensearch-project#15187) Signed-off-by: Lukáš Vlček <[email protected]>
Description
The current implementation of the `RemoteSegmentTransferTrackerTests.testComputeTimeLagOnUpdate()` test relies on assumptions about how fast the test code will finish in the JVM. Moreover, it does not precisely control the boundaries of the time span, specifically the start of the span, because that is determined by the internal implementation of `RemoteSegmentTransferTracker.getTimeMsLag()`, which indirectly calls `System.nanoTime()`.

This commit loosens the assumption that the test code execution will finish within +/-20ms. Instead, it only assumes that the execution time span won't be shorter than the predefined (and controlled) thread sleep interval; any larger value is considered a success.

The point of this test is not to verify execution speed with a defined precision. Instead, the point is that the `getTimeMsLag()` method returns either 0 (for specific conditions) or a positive number (assuming that `remoteRefreshStartTimeMs` is not greater than `System.nanoTime()`).

Related Issues
Closes: #14325
Check List
[ ] API changes companion pull request created, if applicable.
[ ] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.