-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Fix AdaptiveAllocationsScalerServiceTests.test_scaleDownToZero_whenNo… #137731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix AdaptiveAllocationsScalerServiceTests.test_scaleDownToZero_whenNo… #137731
Conversation
|
Pinging @elastic/ml-core (Team:ML) |
| meterRegistry, | ||
| true, | ||
| ONE_SECOND, | ||
| 2 * ONE_SECOND, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two settings:
timeIntervalSecondsofONE_SECONDscaleToZeroAfterNoRequestsSecondsofATOMIC_SECOND
lead to a race condition. Adaptive allocations service checks after 1 second (the timeIntervalSeconds) whether there's been no request for 1 second (the scaleToZeroAfterNoRequestsSeconds).
On Unix-based systems this seems to always pass. Locally on my MacBook the timeWithoutRequestsSeconds is always something like 1.001 seconds, so the model is scaled to zero.
This test is probably failing on Windows builds on x86 hardware, because of a less precise system clock, see:
https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/high-resolution-timers#controlling-timer-accuracy. Not sure 100% sure, I'm no Windows expert by any means.
Increasing the timeIntervalSeconds to 2 seconds, and then checking whether there's been no request for 1 second, solves the issue. Unfortunately, this add 2 more seconds of unit test time, because the times are set in seconds instead of milliseconds.
💔 Backport failed
You can use sqren/backport to manually backport by running |
fixes #136907