[Tests] Replace flaky sleep with polling in test_background_cancel by sjhddh · Pull Request #32986 · vllm-project/vllm

sjhddh · 2026-01-23T23:20:22Z

Summary

Replace the fixed 0.5s sleep (marked with # FIXME: This test can be flaky) with a proper polling loop that waits for the response status to change before attempting to cancel.

Changes

Poll with 0.1s intervals until status changes from "queued" (max 5s)
Use proper assertions in the post-cancel verification loop
Remove FIXME comment as the flakiness is addressed

Motivation

Fixed sleeps are unreliable: too short on slow machines, wasteful on fast ones
Polling adapts to actual server response times
Reduces test flakiness in CI

Test Plan

Run pytest tests/v1/entrypoints/openai/serving_responses/test_stateful.py::test_background_cancel multiple times
Verify test passes consistently
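
The "run it multiple times" step can be scripted. The following is a minimal sketch, not part of the PR: `run_repeatedly` and `TEST_ID` are hypothetical names, and the repeat count of 20 in the commented usage is an arbitrary assumption.

```python
import subprocess
import sys

# Test node id taken from the test plan above.
TEST_ID = (
    "tests/v1/entrypoints/openai/serving_responses/"
    "test_stateful.py::test_background_cancel"
)


def run_repeatedly(cmd: list[str], times: int) -> int:
    """Run `cmd` `times` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(times):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failures += 1
    return failures


# Example usage (assumes pytest and the vLLM test environment are available):
#   failed = run_repeatedly([sys.executable, "-m", "pytest", TEST_ID], times=20)
#   print(f"{failed} of 20 runs failed")
```

A failure count of zero over many runs is only weak evidence that the flakiness is gone, but it is a cheap first check before relying on CI.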

Risk

Low risk: only changes test synchronization logic
No changes to production code

gemini-code-assist

Code Review

This pull request improves the reliability of test_background_cancel by replacing a fixed sleep with a polling mechanism. This is a great change to address test flakiness. I've suggested a further refinement to the polling loop to use a monotonic clock for more accurate timeout handling, which will make the test even more robust.

gemini-code-assist · 2026-01-23T23:21:56Z

tests/v1/entrypoints/openai/serving_responses/test_stateful.py

+    max_wait_seconds = 5.0
+    poll_interval = 0.1
+    elapsed = 0.0
+    while elapsed < max_wait_seconds:
+        await asyncio.sleep(poll_interval)
+        elapsed += poll_interval
+        response = await client.responses.retrieve(response.id)
+        if response.status != "queued":
+            # Started processing or completed - try to cancel
+            break


The current polling loop uses elapsed += poll_interval to track time, but this doesn't account for the time spent in the client.responses.retrieve(response.id) call. This can cause the effective timeout to be significantly longer than max_wait_seconds if the API call is slow, potentially slowing down tests in CI.

Using a monotonic clock like asyncio.get_running_loop().time() provides a more accurate and robust timeout mechanism. I've also reordered the loop to check the condition before sleeping, which is slightly more efficient.
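
To illustrate the drift described above, here is a small self-contained sketch that compares the two loop shapes against a deliberately slow call. `slow_retrieve` is a hypothetical stand-in for `client.responses.retrieve` that always reports "queued", and the timings are scaled down from the PR's values so the comparison runs quickly.

```python
import asyncio


async def slow_retrieve(delay: float) -> str:
    """Hypothetical stand-in for the retrieve call; always returns 'queued'."""
    await asyncio.sleep(delay)
    return "queued"


async def poll_with_counter(max_wait: float, interval: float, call_delay: float) -> float:
    """Counter-based loop; returns the real time spent before giving up."""
    loop = asyncio.get_running_loop()
    start = loop.time()
    elapsed = 0.0
    while elapsed < max_wait:
        await asyncio.sleep(interval)
        elapsed += interval  # time spent inside slow_retrieve is not counted
        if await slow_retrieve(call_delay) != "queued":
            break
    return loop.time() - start


async def poll_with_clock(max_wait: float, interval: float, call_delay: float) -> float:
    """Monotonic-clock loop; returns the real time spent before giving up."""
    loop = asyncio.get_running_loop()
    start = loop.time()
    while loop.time() - start < max_wait:
        if await slow_retrieve(call_delay) != "queued":
            break
        await asyncio.sleep(interval)
    return loop.time() - start


def compare(max_wait: float = 0.2, interval: float = 0.02, call_delay: float = 0.02):
    """Return (counter_seconds, clock_seconds) for the same nominal timeout."""
    counter = asyncio.run(poll_with_counter(max_wait, interval, call_delay))
    clock = asyncio.run(poll_with_clock(max_wait, interval, call_delay))
    return counter, clock
```

With these assumed delays the counter-based loop should overshoot its nominal timeout noticeably, because each iteration costs the sleep plus the call, while the clock-based loop stops within about one iteration of the limit.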

Suggested change

-    max_wait_seconds = 5.0
-    poll_interval = 0.1
-    elapsed = 0.0
-    while elapsed < max_wait_seconds:
-        await asyncio.sleep(poll_interval)
-        elapsed += poll_interval
-        response = await client.responses.retrieve(response.id)
-        if response.status != "queued":
-            # Started processing or completed - try to cancel
-            break
+    loop = asyncio.get_running_loop()
+    start_time = loop.time()
+    max_wait_seconds = 5.0
+    poll_interval = 0.1
+    while loop.time() - start_time < max_wait_seconds:
+        response = await client.responses.retrieve(response.id)
+        if response.status != "queued":
+            # Started processing or completed - try to cancel
+            break
+        await asyncio.sleep(poll_interval)
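
The suggested loop could also be factored into a reusable helper. The sketch below is an illustration only, not code from the PR: `wait_until_not_queued` and `make_fake_fetch` are hypothetical names, and unlike the loop in the diff, this version raises `TimeoutError` when the deadline passes instead of falling through, which is a design choice assumed here.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def wait_until_not_queued(
    fetch_status: Callable[[], Awaitable[str]],
    timeout: float = 5.0,
    poll_interval: float = 0.1,
) -> str:
    """Poll `fetch_status` until it returns something other than 'queued'.

    Uses the event loop's monotonic clock, so time spent inside
    `fetch_status` counts against the timeout.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await fetch_status()
        if status != "queued":
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"status still 'queued' after {timeout}s")
        await asyncio.sleep(poll_interval)


def make_fake_fetch(statuses: Sequence[str]) -> Callable[[], Awaitable[str]]:
    """Hypothetical fake: yields each status in turn, then repeats the last."""
    remaining = iter(statuses)
    last = statuses[-1]

    async def fetch() -> str:
        return next(remaining, last)

    return fetch
```

In a test this might be called as `await wait_until_not_queued(lambda: get_status(response.id))`, where `get_status` is whatever coroutine wraps the retrieve call and returns its status. Raising on timeout makes a stuck request fail at the wait rather than at a later assertion.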

Replace the fixed 0.5s sleep (which was marked as flaky) with a proper polling loop that waits for the response status to change before attempting to cancel. This makes the test more deterministic:
- Poll with 0.1s intervals until status changes from "queued"
- Use proper assertions in the post-cancel verification loop
- Remove FIXME comment as the flakiness is addressed

Signed-off-by: 7. Sun <jhao.sun@gmail.com>

robertgshaw2-redhat · 2026-01-24T15:04:32Z

Thanks for your contribution

mergify bot added the v1 label Jan 23, 2026

gemini-code-assist bot reviewed Jan 23, 2026

sjhddh force-pushed the tests/replace-sleep-with-polling branch from fcf361f to 20e45e4 January 23, 2026 23:24

Address review: use monotonic clock for accurate timeout

2ca94c6

Signed-off-by: 7. Sun <jhao.sun@gmail.com>

robertgshaw2-redhat enabled auto-merge (squash) January 24, 2026 15:04

robertgshaw2-redhat approved these changes Jan 24, 2026

github-actions bot added the ready label ("ONLY add when PR is ready to merge/full CI is needed") Jan 24, 2026

robertgshaw2-redhat merged commit cd775bd into vllm-project:main Jan 24, 2026
21 checks passed

ms1design pushed a commit to ms1design/vllm that referenced this pull request Jan 24, 2026

[Tests] Replace flaky sleep with polling in test_background_cancel (v…

f94a373

…llm-project#32986) Signed-off-by: 7. Sun <jhao.sun@gmail.com> Signed-off-by: Mieszko Syty <mieszko@ms1design.pl>

cwazai pushed a commit to cwazai/vllm that referenced this pull request Jan 25, 2026

[Tests] Replace flaky sleep with polling in test_background_cancel (v…

7203206

…llm-project#32986) Signed-off-by: 7. Sun <jhao.sun@gmail.com> Signed-off-by: 陈建华 <1647430658@qq.com>

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

[Tests] Replace flaky sleep with polling in test_background_cancel (v…

d2e2b4b

…llm-project#32986) Signed-off-by: 7. Sun <jhao.sun@gmail.com>

[Tests] Replace flaky sleep with polling in test_background_cancel #32986
robertgshaw2-redhat merged 2 commits into vllm-project:main from sjhddh:tests/replace-sleep-with-polling


2 participants
