Merged
@@ -275,7 +275,7 @@ public void test_scaleDownToZero_whenNoRequests() {
inferenceAuditor,
meterRegistry,
true,
- ONE_SECOND,
+ 2 * ONE_SECOND,
Contributor Author

@jan-elastic jan-elastic Nov 7, 2025

These two settings:

  • timeIntervalSeconds of ONE_SECOND
  • scaleToZeroAfterNoRequestsSeconds of ATOMIC_SECOND

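lead to a race condition: the adaptive allocations service checks after 1 second (the timeIntervalSeconds) whether there have been no requests for 1 second (the scaleToZeroAfterNoRequestsSeconds), so the outcome depends on whether the measured idle time lands just above or just below the threshold.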

On Unix-based systems this seems to always pass. Locally on my MacBook the timeWithoutRequestsSeconds is always something like 1.001 seconds, so the model is scaled to zero.

This test is probably failing on Windows builds on x86 hardware because the system clock is less precise there; see:
https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/high-resolution-timers#controlling-timer-accuracy. I'm not 100% sure, though; I'm no Windows expert by any means.

Increasing the timeIntervalSeconds to 2 seconds, and then checking whether there has been no request for 1 second, solves the issue. Unfortunately, this adds 2 more seconds of unit test time, because the times are set in seconds instead of milliseconds.
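
To make the timing argument concrete, here is a minimal sketch of the margin under both configurations. It is a hypothetical model, not the service's actual code: the class `ScaleToZeroRace`, both method names, the `>=` comparison, and the 985 ms reading from a coarser clock are all assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical model of the scale-to-zero check; not the actual service code. */
public class ScaleToZeroRace {

    /**
     * Assumed form of the check: scale down once the measured idle time
     * reaches the configured threshold.
     */
    public static boolean wouldScaleToZero(long measuredIdleMillis, long scaleToZeroAfterNoRequestsSeconds) {
        return measuredIdleMillis >= TimeUnit.SECONDS.toMillis(scaleToZeroAfterNoRequestsSeconds);
    }

    /** Nominal slack between the idle time at the first check and the threshold. */
    public static long marginMillis(long timeIntervalSeconds, long scaleToZeroAfterNoRequestsSeconds) {
        return TimeUnit.SECONDS.toMillis(timeIntervalSeconds - scaleToZeroAfterNoRequestsSeconds);
    }

    public static void main(String[] args) {
        // Old settings: check after 1 s, threshold 1 s, so there is no slack at all.
        System.out.println(marginMillis(1, 1));         // prints 0
        // A reading of 1.001 s (as observed locally) passes the check.
        System.out.println(wouldScaleToZero(1001, 1));  // prints true
        // A hypothetical reading slightly under 1 s from a coarser clock does not.
        System.out.println(wouldScaleToZero(985, 1));   // prints false

        // New settings: check after 2 s, threshold 1 s, so there is a full second of slack.
        System.out.println(marginMillis(2, 1));         // prints 1000
        // The same hypothetical 15 ms shortfall no longer matters.
        System.out.println(wouldScaleToZero(1985, 1));  // prints true
    }
}
```

With zero nominal slack, any measurement error decides the outcome; with one second of slack, a few milliseconds of clock imprecision cannot.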

ATOMIC_SECOND,
TWO_THOUSAND_MILLISECONDS
);
@@ -295,7 +295,7 @@ public void test_scaleDownToZero_whenNoRequests() {
return Void.TYPE;
}).when(client).execute(eq(GetDeploymentStatsAction.INSTANCE), eq(new GetDeploymentStatsAction.Request("test-deployment")), any());

- safeSleep(1500);
+ safeSleep(2500);

verify(client, times(1)).threadPool();
verify(client, times(1)).execute(eq(GetDeploymentStatsAction.INSTANCE), any(), any());
@@ -317,7 +317,7 @@ public void test_scaleDownToZero_whenNoRequests() {
return Void.TYPE;
}).when(client).execute(eq(UpdateTrainedModelDeploymentAction.INSTANCE), any(), any());

- safeSleep(1000);
+ safeSleep(2000);

verify(client, times(2)).threadPool();
verify(client, times(1)).execute(eq(GetDeploymentStatsAction.INSTANCE), any(), any());