[CI] [ROCm] Setup test-ready.yml and test-merge.yml #2017

hsliuustc0106 merged 27 commits into vllm-project:main

Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8801f26254
```yaml
mirror_hardwares: [amdproduction]
grade: Blocking
commands:
  - timeout 20m pytest -s -v tests/e2e/offline_inference/test_t2i_model.py -m "core_model and diffusion" --run-level "core_model"
```
Point the ready ROCm diffusion gate at a model that runs
This command never exercises a ROCm diffusion model in the ready suite. In tests/e2e/offline_inference/test_t2i_model.py (lines 29-33), the ROCm parametrization removes riverclouds/qwen_image_random, and test_diffusion_model() (lines 42-43) then skips every core_model case unless the model is riverclouds/qwen_image_random. On MI325, this blocking step therefore goes green with all tests skipped, so PRs can break ready-level ROCm diffusion coverage without AMD CI noticing.
```diff
 @pytest.mark.advanced_model
 @pytest.mark.diffusion
-@hardware_test(res={"cuda": "H100"})
+@hardware_test(res={"cuda": "H100", "rocm": "MI325"})
```
Include Bagel text2img in the ROCm online gate
The new AMD ready/merge jobs run test_bagel_online.py with -k "rocm", and pytest's -k matching includes marker keywords. After this change only test_bagel_img2img_online carries the ROCm hardware mark, while test_bagel_text2img_online is still H100-only (tests/e2e/online_serving/test_bagel_online.py:84). That means the ROCm CI never hits the Bagel text2img serving path, so a regression there will still pass the new AMD gate.
Release-note digest (one entry per PR, with the affected components):

- [PR #2059](vllm-project/vllm-omni#2059): [BugFix][Qwen3TTS] CodePredictor CudaGraph Pool (bug fix; vllm-omni-audio-tts, vllm-omni-perf)
- [PR #2058](vllm-project/vllm-omni#2058): [Bugfix] Fix Fish Speech and CosyVoice3 online serving - missing is_comprehension and broken model detection (bug fix; vllm-omni-api)
- [PR #2045](vllm-project/vllm-omni#2045): [Voxtral] Improve example (vllm-omni-contrib, vllm-omni-cicd)
- [PR #2042](vllm-project/vllm-omni#2042): [bugfix] /chat/completion doesn't read extra_body for diffusion model (bug fix; vllm-omni-api, vllm-omni-perf)
- [PR #2038](vllm-project/vllm-omni#2038): [Doc] Update docs and dockerfiles for rebase of vllm v0.18.0 (vllm-omni-contrib)
- [PR #2037](vllm-project/vllm-omni#2037): [Rebase] Rebase to vllm v0.18.0 (vllm-omni-serving, vllm-omni-contrib, vllm-omni-api, vllm-omni-cicd)
- [PR #2032](vllm-project/vllm-omni#2032): [CI] Change Bagel online test environment variable `VLLM_TEST_CLEAN_GPU_MEMORY` to `0` (vllm-omni-cicd)
- [PR #2031](vllm-project/vllm-omni#2031): [CI] Fix test (bug fix; vllm-omni-cicd)
- [PR #2017](vllm-project/vllm-omni#2017): [CI] [ROCm] Setup `test-ready.yml` and `test-merge.yml` (vllm-omni-cicd)
- [PR #2014](vllm-project/vllm-omni#2014): [Test] Implement mock HTTP request handling in benchmark CLI tests (vllm-omni-cicd, vllm-omni-perf)
- [PR #2012](vllm-project/vllm-omni#2012): [Fixbug][Perf] Qwen3-omni: code predictor with re-prefill + SDPA and eliminate decode hot-path CPU round-trips (bug fix; vllm-omni-serving, vllm-omni-image-gen, vllm-omni-perf)
- [PR #2009](vllm-project/vllm-omni#2009): [Bugfix] revert PR#1758 which introduced the accuracy problem of qwen3-omni (bug fix; vllm-omni-serving)
- [PR #2007](vllm-project/vllm-omni#2007): [Bugfix] Fix bug of online server can not return multi images (bug fix; adds Qwen-Image-Layered; vllm-omni-image-gen, vllm-omni-api)
- [PR #1998](vllm-project/vllm-omni#1998): [CI] Split BAGEL tests into dummy/real weight tiers (L2/L3) (vllm-omni-cicd)
- [PR #1985](vllm-project/vllm-omni#1985): [Perf] [Qwen3-TTS] Keep audio_codes and last_talker_hidden on GPU to eliminate per-step sync stalls (performance improvement; vllm-omni-serving, vllm-omni-audio-tts, vllm-omni-perf)
- [PR #1984](vllm-project/vllm-omni#1984): [CI] [ROCm] Bugfix device environment issue (bug fix; vllm-omni-serving, vllm-omni-api)
- [PR #1982](vllm-project/vllm-omni#1982): [Fix] Fix slow hasattr in CUDAGraphWrapper.__getattr__ (bug fix; vllm-omni-serving, vllm-omni-cicd)
- [PR #1979](vllm-project/vllm-omni#1979): [Bugfix] Fix config misalignment between offline and online diffusion inference (Wan2.2, Qwen-Image series) (bug fix; adds `/v1/chat/completions`; vllm-omni-api, vllm-omni-perf)
- [PR #1976](vllm-project/vllm-omni#1976): [skip ci][Docs] Update WeChat QR code (fix filename case) (bug fix; vllm-omni-contrib)
- [PR #1974](vllm-project/vllm-omni#1974): [Docs] Update WeChat QR code for community support (vllm-omni-contrib)
- [PR #1945](vllm-project/vllm-omni#1945): Fix Base voice clone streaming quality and stop-token crash (bug fix; vllm-omni-cicd)
- [PR #1938](vllm-project/vllm-omni#1938): [Test] L4 complete diffusion feature test for Bagel models (new feature; vllm-omni-cicd, vllm-omni-perf)
- [PR #1934](vllm-project/vllm-omni#1934): Fix OmniGen2 transformer config loading for HF models (bug fix; vllm-omni-perf)
- [PR #1930](vllm-project/vllm-omni#1930): [Bug][Qwen3TTS][Streaming] remove dynamic initial chunk and only compute on initial request (vllm-omni-audio-tts, vllm-omni-perf)
- [PR #1926](vllm-project/vllm-omni#1926): [Misc] removed qwen3_tts.py as it is out-dated (vllm-omni-audio-tts)
- [PR #1920](vllm-project/vllm-omni#1920): [Docs] Add Wan2.1-T2V as supported video generation models (new feature; vllm-omni-contrib)
- [PR #1915](vllm-project/vllm-omni#1915): [Bugfix] fix helios video generate use cpu device (bug fix; vllm-omni-video-gen, vllm-omni-perf)
- [PR #1913](vllm-project/vllm-omni#1913): [Optim][Qwen3TTS][CodePredictor] support torch.compile with reduce-overhead and dynamic False (vllm-omni-audio-tts, vllm-omni-perf)
- [PR #1908](vllm-project/vllm-omni#1908): [Entrypoint][Refactor] vLLM-Omni Entrypoint Refactoring (vllm-omni-api, vllm-omni-perf, vllm-omni-contrib, vllm-omni-serving, vllm-omni-cicd)
- [PR #1900](vllm-project/vllm-omni#1900): [Feat] support HSDP for Flux family (new feature; vllm-omni-image-gen, vllm-omni-contrib)
- [PR #1898](vllm-project/vllm-omni#1898): [Feature]: Remove some useless `hf_overrides` in yaml (new feature; vllm-omni-distributed, vllm-omni-quantization, vllm-omni-cicd, vllm-omni-perf)
- [PR #1890](vllm-project/vllm-omni#1890): [NPU] Upgrade to v0.17.0 (vllm-omni-contrib)
- [PR #1889](vllm-project/vllm-omni#1889): Add `Governance` section (new feature; vllm-omni-contrib)
- [PR #1881](vllm-project/vllm-omni#1881): [Feat] Support T5 Tensor Parallelism (new feature; vllm-omni-distributed, vllm-omni-cicd)
Purpose

Set up the ROCm `test-ready.yml` and `test-merge.yml` workflows so that AMD CI coverage aligns with the CUDA unit tests.
Test Plan

Triggered `test-ready.yml` and `test-merge.yml`.

Test Result

- test-merge (all passed): https://buildkite.com/vllm/vllm-omni-amd-ci/builds/3451/steps/canvas
- test-ready: https://buildkite.com/vllm/vllm-omni-amd-ci/builds/3474/steps/canvas
Plan

- The main vLLM repo has moved some of its tests to MI250 GPUs. I will monitor the state of that cluster first, since it has not been stable yet; once it is stable, I will move some of the tests over to cut down the wait time.
- Enable `test-nightly.yml` in a coming PR.