[ROCm][CI] Disable async scheduling on ROCm for test_structured_output[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config9] #32355
Merged
tjtanaa merged 3 commits into vllm-project:main on Jan 15, 2026
Conversation
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Contributor
Code Review
This pull request introduces a temporary fix that disables asynchronous scheduling for a failing test on ROCm. As implemented, it disables async scheduling for all parameterizations of test_structured_output on ROCm. I suggest making the fix more targeted so that it covers only the specific failing test case; that would preserve async-scheduling test coverage on ROCm for the other, potentially passing, cases.
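A narrower guard might look like the sketch below. This is illustrative only: the helper name and the substring match on the pytest test id are assumptions, not the actual test code.

```python
def should_disable_async_scheduling(test_id: str, is_rocm: bool) -> bool:
    """Illustrative helper: disable async scheduling only for the one
    parameterization known to fail on ROCm, rather than for every
    parameterization of test_structured_output."""
    failing_param = "speculative_config9"  # taken from the failing test id
    return is_rocm and failing_param in test_id


# Only the known-failing case on ROCm is affected.
tid = ("test_structured_output[meta-llama/Meta-Llama-3.1-8B-Instruct"
       "-xgrammar-auto-speculative_config9]")
print(should_disable_async_scheduling(tid, is_rocm=True))   # True
print(should_disable_async_scheduling(tid, is_rocm=False))  # False
```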
Contributor
Author
Actively working on a fix in #32303, but adding this workaround in the meantime.
gshtras approved these changes on Jan 14, 2026
sammysun0711 pushed a commit to sammysun0711/vllm that referenced this pull request on Jan 16, 2026
…t[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config9] (vllm-project#32355) Signed-off-by: Micah Williamson <micah.williamson@amd.com>
micah-wil added a commit to ROCm/vllm that referenced this pull request on Jan 16, 2026
…ed_output[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config9] (vllm-project#32355)" This reverts commit 773d707. Signed-off-by: Micah Williamson <micah.williamson@amd.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request on Jan 16, 2026
…t[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config9] (vllm-project#32355) Signed-off-by: Micah Williamson <micah.williamson@amd.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request on Jan 21, 2026
…t[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config9] (vllm-project#32355) Signed-off-by: Micah Williamson <micah.williamson@amd.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request on Feb 19, 2026
…t[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config9] (vllm-project#32355) Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Problem
The issue was exposed in the V1 Test entrypoints test group in AMD CI after #31998 enabled async scheduling by default with spec decoding. The test group has been failing ever since that PR was merged (e.g., in build #2803). The test passes if you set async_scheduling=False. The failure can be reproduced with this command:

pytest -v -s tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config9]

To unblock AMD CI, I propose that we disable async scheduling for this test case until we can track down and fix the underlying issue.
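A minimal sketch of the workaround described above, assuming the test assembles its LLM constructor arguments as a dict. A plain boolean stands in for the platform check to keep the example self-contained (in vLLM itself a ROCm check is available via vllm.platforms.current_platform.is_rocm()):

```python
def apply_rocm_workaround(llm_kwargs: dict, is_rocm: bool) -> dict:
    """Return a copy of the LLM constructor kwargs with async
    scheduling forced off on ROCm -- a temporary CI workaround
    until the underlying issue is fixed."""
    llm_kwargs = dict(llm_kwargs)  # avoid mutating the caller's dict
    if is_rocm:
        llm_kwargs["async_scheduling"] = False
    return llm_kwargs


# On ROCm the flag is forced off; on other platforms it is untouched.
base = {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "async_scheduling": True}
print(apply_rocm_workaround(base, is_rocm=True)["async_scheduling"])   # False
print(apply_rocm_workaround(base, is_rocm=False)["async_scheduling"])  # True
```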