[ROCm][CI] Disable Async Scheduling For Qwen3-Next-80B-A3B-Instruct MTP Async EPLB Accuracy Test by micah-wil · Pull Request #32275 · vllm-project/vllm

micah-wil · 2026-01-13T17:32:27Z

#31998 enabled async scheduling by default with spec decoding. This exposed a bug affecting the Qwen3-Next-80B-A3B-Instruct MTP Async EPLB Accuracy test, which currently only runs on AMD CI. The test hangs: https://buildkite.com/vllm/amd-ci/builds/2766/summary?sid=019bb772-7097-4d58-a3c6-a282068589ed

The test hangs after evaluating 140 prompts:

Evaluating:  11%|█         | 140/1319 [01:10<01:02, 18.73it/s](EngineCore_DP0 pid=606) INFO 01-13 13:52:55 [shm_broadcast.py:542] No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work (e.g. compilation, weight/kv cache quantization).

Here we disable async scheduling again to unblock CI while we investigate the issue. This should only impact AMD CI.

Signed-off-by: Micah Williamson <micah.williamson@amd.com>

gemini-code-assist

Code Review

This pull request disables asynchronous scheduling for the Qwen3-Next-80B-A3B-Instruct MTP Async EPLB Accuracy test on AMD CI. This is a temporary workaround to resolve a hang that occurs when async scheduling is used with speculative decoding for this model. The change is targeted and effectively unblocks the CI pipeline. My review identifies one high-severity issue. While the workaround is correct, it highlights a contradiction between the code's behavior and its documentation regarding the automatic disabling of async scheduling with speculative decoding. It's important to address this discrepancy to maintain code and documentation quality.

gemini-code-assist · 2026-01-13T17:33:28Z

.buildkite/scripts/scheduled_integration_test/qwen3_next_mtp_async_eplb.sh

    --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":1}' \
    --trust-remote-code \
    --max-model-len 2048 \
+    --no-async-scheduling \


While this flag is a good temporary fix for the CI, its necessity points to a deeper issue. The documentation for SchedulerConfig.async_scheduling in vllm/config/scheduler.py states that it should be automatically disabled when speculative decoding is used. This PR is required because async scheduling is not being automatically disabled, which causes a hang.

This discrepancy indicates that either the documentation is outdated or there is a bug in the logic that should automatically disable this feature. This should be addressed to prevent confusion and future issues. Please consider creating a follow-up issue to either update the documentation or fix the auto-disabling logic.

This discrepancy indicates that either the documentation is outdated or there is a bug in the logic that should automatically disable this feature.

documentation is outdated

simon-mo · 2026-01-13T18:38:02Z

Please also make the flag conditional on the ROCm platform. We might run this test on other platforms, and I don't want async scheduling to be silently disabled there.


micah-wil · 2026-01-13T19:53:15Z

@simon-mo Done
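For readers following along, a minimal sketch of how a ROCm-only flag could be wired into a CI script is shown here. It is not necessarily what commit 69282a5 does: the helper names `is_rocm` and `platform_args`, and the detection heuristic (looking for `rocm-smi` on `PATH` or a ROCm install directory), are assumptions made for illustration. Only the `--no-async-scheduling` flag itself comes from the diff above.

```shell
#!/bin/sh
# Hypothetical helper: guess whether this host is a ROCm machine.
# The heuristic (rocm-smi on PATH, or a ROCm install directory) is an
# assumption for this sketch, not the check used in the PR.
is_rocm() {
  command -v rocm-smi >/dev/null 2>&1 || [ -d "${ROCM_PATH:-/opt/rocm}" ]
}

# Hypothetical helper: print the extra server flags for this platform.
# Only ROCm gets the workaround, so other platforms keep the default
# (async scheduling enabled) and the test is not silently weakened there.
platform_args() {
  if is_rocm; then
    echo "--no-async-scheduling"
  fi
}

EXTRA_ARGS="$(platform_args)"
echo "extra server args: ${EXTRA_ARGS:-<none>}"
```

In a script like the one in the diff, `${EXTRA_ARGS}` would then be appended unquoted to the server command line, so that it expands to nothing on non-ROCm platforms.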

tjtanaa

LGTM


micah-wil added 2 commits January 13, 2026 17:08

disable async scheduling for qwen3 mtp async eplb test

e628153

Signed-off-by: Micah Williamson <micah.williamson@amd.com>

remove erroneous backslash

3b4ceab

Signed-off-by: Micah Williamson <micah.williamson@amd.com>

mergify bot added the ci/build, qwen (Related to Qwen models), and rocm (Related to AMD ROCm) labels Jan 13, 2026

gemini-code-assist bot reviewed Jan 13, 2026


block changes on if current platform is rocm

69282a5

Signed-off-by: Micah Williamson <micah.williamson@amd.com>

tjtanaa approved these changes Jan 14, 2026


tjtanaa added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jan 14, 2026

tjtanaa merged commit 6fa6e7e into vllm-project:main Jan 14, 2026
18 of 19 checks passed

sammysun0711 pushed a commit to sammysun0711/vllm that referenced this pull request Jan 16, 2026

[ROCm][CI] Disable Async Scheduling For Qwen3-Next-80B-A3B-Instruct M…

c1d50cd

…TP Async EPLB Accuracy Test (vllm-project#32275) Signed-off-by: Micah Williamson <micah.williamson@amd.com>

akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026

[ROCm][CI] Disable Async Scheduling For Qwen3-Next-80B-A3B-Instruct M…

5fa65a0

…TP Async EPLB Accuracy Test (vllm-project#32275) Signed-off-by: Micah Williamson <micah.williamson@amd.com>

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

[ROCm][CI] Disable Async Scheduling For Qwen3-Next-80B-A3B-Instruct M…

139e42c

…TP Async EPLB Accuracy Test (vllm-project#32275) Signed-off-by: Micah Williamson <micah.williamson@amd.com>

tjtanaa merged 3 commits into vllm-project:main from ROCm:micah/qwen-mtp-async-scheduling