Skip to content

[CI] Fix test_async_scheduling.py flakiness#42455

Merged
njhill merged 1 commit into
vllm-project:mainfrom
njhill:fix-async-lp-test-flake
May 12, 2026
Merged

[CI] Fix test_async_scheduling.py flakiness#42455
njhill merged 1 commit into
vllm-project:mainfrom
njhill:fix-async-lp-test-flake

Conversation

@njhill
Copy link
Copy Markdown
Member

@njhill njhill commented May 12, 2026

This has become super flakey recently, e.g. https://buildkite.com/vllm/ci/builds/65896/canvas?sid=019e1d8a-5417-4923-8772-6383caefd793&tab=output

I think this started after #41411.

It looks like because we are now comparing prompt logprobs, ranks can be much larger and the difference between their logprob values much smaller, in some cases meaning the order changes and so the ranks can be off by one or two.

Adding tolerance to the rank comparison will hopefully resolve this.

Signed-off-by: Nick Hill <nickhill123@gmail.com>
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@njhill njhill added the ready ONLY add when PR is ready to merge/full CI is needed label May 12, 2026
@mergify mergify Bot added the v1 label May 12, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request modifies the rank comparison in the asynchronous scheduling end-to-end tests by introducing a relative tolerance using pytest.approx. The review feedback points out that a relative tolerance is too restrictive for small integer ranks and fails to account for cases where the rank might be None, which would lead to a TypeError. A suggestion was provided to use an absolute tolerance and maintain safe handling for None values.

Comment thread tests/v1/e2e/general/test_async_scheduling.py
Copy link
Copy Markdown
Member

@yewentao256 yewentao256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the work!

@njhill njhill enabled auto-merge (squash) May 12, 2026 21:38
@njhill njhill merged commit fe8b42e into vllm-project:main May 12, 2026
21 checks passed
@njhill njhill deleted the fix-async-lp-test-flake branch May 12, 2026 21:38
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
Signed-off-by: Nick Hill <nickhill123@gmail.com>
rishitdholakia13 pushed a commit to rishitdholakia13/vllm that referenced this pull request May 19, 2026
Signed-off-by: Nick Hill <nickhill123@gmail.com>
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
Signed-off-by: Nick Hill <nickhill123@gmail.com>
h1t35h pushed a commit to h1t35h/vllm that referenced this pull request May 21, 2026
Signed-off-by: Nick Hill <nickhill123@gmail.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants