[CI][Perf] Add nightly PR labels, consolidate pipeline, and switch benchmark flag to --test-config-file by yenuo26 · Pull Request #2816 · vllm-project/vllm-omni

yenuo26 · 2026-04-15T08:01:44Z


Purpose

fix #2410

Summary

Refactored Nightly CI orchestration: Merged the deprecated test-nightly-diffusion.yml content into test-nightly.yml, and refined the trigger logic based on tags (omni-test / tts-test / diffusion-x2iat-test / diffusion-x2v-test).
Updated Perf test entry and configuration: Migrated Omni configuration from test_omni.json to test_qwen_omni.json, and unified the benchmark parameter name to --test-config-file to avoid conflicts with pytest’s built-in --config-file.
Stabilized perf parameter parsing and collection: Moved the registration of the --test-config-file option down to tests/dfx/conftest.py to ensure pytest recognizes it during the parameter parsing phase; simultaneously cleaned up duplicate registrations in scripts.
Synchronized documentation and toolchain: Updated CI documentation examples, L4 performance test instructions, and naming references in nightly report scripts to ensure consistency among commands, file names, and configuration names.
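
The option registration described in the summary can be sketched as a `conftest.py` hook. This is a minimal sketch, not the PR's actual code: the flag name `--test-config-file` comes from the PR, while the help text and the `load_test_config` helper are hypothetical and only illustrate how a script might consume the value.

```python
# conftest.py (sketch). Only the flag name is taken from the PR;
# the help text and the helper below are illustrative assumptions.
import json
from pathlib import Path


def pytest_addoption(parser):
    """Register the benchmark config flag so pytest accepts it at parse time."""
    parser.addoption(
        "--test-config-file",
        action="store",
        default=None,
        help="Path to a perf test JSON config (hypothetical help text).",
    )


def load_test_config(path):
    """Hypothetical helper: read the JSON config named on the command line."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

Keeping the registration in one conftest.py, rather than in each benchmark script, also avoids registering the same option twice.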

Key Changes

CI:

.buildkite/pipeline.yml
.buildkite/test-nightly.yml
Deleted .buildkite/test-nightly-diffusion.yml

Perf tests:

tests/dfx/perf/scripts/run_benchmark.py
tests/dfx/perf/scripts/run_diffusion_benchmark.py
tests/dfx/conftest.py
Deleted tests/dfx/perf/tests/test.json
Added/using tests/dfx/perf/tests/test_qwen_omni.json

Docs & tooling:

docs/contributing/ci/CI_5levels.md
docs/contributing/ci/test_guide.md
docs/contributing/ci/test_examples/l4_performance_tests.inc.md
tools/nightly/generate_nightly_perf_excel.py

Why

To prevent --config-file from being interpreted as pytest’s own configuration parameter under pytest 9, which could cause rootdir shifts and collection exceptions such as ModuleNotFoundError: No module named 'tests'.
To simplify Nightly pipeline maintenance costs and reduce drift and duplication caused by scattered YAML files.
To ensure consistency in L4 perf paths, configuration naming, and documentation descriptions for Omni/TTS/Diffusion, thereby lowering troubleshooting costs.

Test Plan

1. Run the perf test locally: pytest -s -v tests/dfx/perf/scripts/run_benchmark.py --test-config-file tests/dfx/perf/tests/test_tts.json
2. CI Nightly: Validate each of the tag paths (omni/tts/diffusion) once, triggering each one individually.
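
To make the per-label validation concrete, the selection logic can be pictured as a lookup from PR labels to test groups. The sketch below is illustrative only: the label names come from this PR, but the group names, the `select_groups` function, and the rule that `nightly-test` selects every group are assumptions, not the actual Buildkite configuration.

```python
# Label names are from the PR; group names and the nightly-test rule are assumptions.
LABEL_TO_GROUP = {
    "omni-test": "omni",
    "tts-test": "tts",
    "diffusion-x2iat-test": "diffusion-x2iat",
    "diffusion-x2v-test": "diffusion-x2v",
}


def select_groups(pr_labels):
    """Return the sorted test groups that a set of PR labels would trigger."""
    labels = set(pr_labels)
    if "nightly-test" in labels:
        # Assumed behavior: the umbrella label runs every group.
        return sorted(LABEL_TO_GROUP.values())
    return sorted(LABEL_TO_GROUP[name] for name in labels if name in LABEL_TO_GROUP)
```

Under these assumptions, applying only `tts-test` would select the TTS group alone, which matches the plan of triggering each path individually.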

Test Result

2. tts-test:

omni-test:

diffusion-x2v-test:

diffusion-x2iat-test:

nightly-test:


chatgpt-codex-connector · 2026-04-15T08:01:51Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

hsliuustc0106 · 2026-04-15T09:27:59Z

BLOCKER scan:

| Category | Result |
| --- | --- |
| Correctness | Minor: Removed explicit error checks in _resolve_baseline_value(). Old code provided clear ValueError/IndexError messages; new code relies on Python's default IndexError which is less informative. Consider keeping the explicit checks for better debugging. |
| Reliability/Safety | PASS |
| Breaking Changes | PASS |
| Test Coverage | PASS - Test results provided in PR description |
| Documentation | PASS - Documentation updated with new --config-file usage |
| Security | PASS |

OVERALL: NO BLOCKERS

VERDICT: COMMENT

This is a straightforward CI infrastructure update. The changes to support --config-file for performance tests are useful, and the documentation is updated accordingly.

Minor suggestion: Consider keeping the explicit error checks in _resolve_baseline_value() for sweep_index validation. The old code provided clearer error messages that would help users debug configuration issues more quickly.
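
As an illustration of what this suggestion asks for, explicit validation could look roughly like the sketch below. The real signature and data layout of `_resolve_baseline_value()` are not shown in this thread, so the function name `resolve_baseline_value`, its parameters, and the scalar-or-list baseline shape are all hypothetical.

```python
def resolve_baseline_value(baseline, sweep_index=None):
    """Hypothetical stand-in for _resolve_baseline_value() with explicit checks."""
    if not isinstance(baseline, (list, tuple)):
        # Assumed layout: a scalar baseline applies to every sweep point.
        return baseline
    if sweep_index is None:
        raise ValueError("baseline is a list, so sweep_index is required")
    if not 0 <= sweep_index < len(baseline):
        raise IndexError(
            f"sweep_index {sweep_index} is out of range for a baseline "
            f"of length {len(baseline)}"
        )
    return baseline[sweep_index]
```

The explicit messages name both the offending index and the baseline length, which Python's default "list index out of range" does not.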

Overall, the PR is ready to merge once the blocked checks pass.

…models
- Enhanced the nightly pipeline to include additional labels for triggering tests.
- Removed the obsolete `test-nightly-diffusion.yml` file.
- Updated `test-nightly.yml` to include new performance tests for Omni and TTS models.
- Introduced new performance test configurations in `test_qwen_omni.json` and `test_tts.json`.
- Added new benchmark scripts for Omni and diffusion models.
- Updated documentation to reflect changes in performance test configurations.
Signed-off-by: wangyu <410167048@qq.com>
Co-authored-by: inaniloquentee <inani_@stu.xjtu.edu.cn>

…rmance testing of Omni and TTS models.
- Updated the nightly pipeline configuration to reflect changes in test script names and parameters.
- Introduced `run_diffusion_benchmark.py` for benchmarking diffusion models.
- Adjusted documentation to align with new test script usage and configuration options.
Signed-off-by: wangyu <410167048@qq.com>
Co-authored-by: inaniloquentee <inani_@stu.xjtu.edu.cn>

…pr-label Signed-off-by: wangyu <410167048@qq.com>

Signed-off-by: wangyu <410167048@qq.com>

yenuo26 · 2026-04-15T10:48:25Z

I have modified some performance test scripts for the following reasons, @amy-why-3459 @fhfuih PTAL:

To separate the performance test scripts for omni and tts.
To prevent --config-file from being interpreted as pytest's own configuration parameter under pytest 9, which could cause rootdir shifts and collection exceptions such as ModuleNotFoundError: No module named 'tests'.

Signed-off-by: wangyu <410167048@qq.com>

fhfuih

For the diffusion part, generally looks good to me. Left some comments on documentation

fhfuih · 2026-04-16T01:27:11Z

        /tests/e2e/offline_inference/test_{model_name}_expansion.py<br>
        <strong>Performance:</strong><br>
-        /tests/dfx/perf/tests/test.json<br>
+        /tests/dfx/perf/tests/test_qwen_omni.json (Omni) and test_tts.json (TTS)<br>


There are also /tests/dfx/perf/tests/test_{some diffusion models}_vllm_omni.json files. Maybe you would like to mention them in the doc.

This also relates to your change in docs/contributing/ci/test_examples/l4_performance_tests.inc.md and to some changes further down in this file.

fhfuih · 2026-04-16T01:29:13Z

                                                   ├── test_cache_dit.py
                                                   ├── test_teacache.py
-                                                   ├── test_stable_audio_expansion.py
+                                                   ├── test_stable_audio_model.py


Is this unintentional? There is no test_stable_audio_model anymore, now that the L4 test for stable audio has been merged.

Signed-off-by: wangyu <410167048@qq.com>

…configurations Signed-off-by: wangyu <410167048@qq.com>

david6666666 · 2026-04-17T06:35:48Z

-                      path: /mnt/hf-cache
-                      type: DirectoryOrCreate
-
-      - label: ":full_moon: Diffusion · Qwen-Image · Accuracy Test"


You mistakenly deleted this nightly test. Please add it back.

…nchmark flag to --test-config-file (vllm-project#2816) Signed-off-by: wangyu <410167048@qq.com> Co-authored-by: Y. Fisher <yukexiong1@huawei.com> Co-authored-by: inaniloquentee <inani_@stu.xjtu.edu.cn>

yenuo26 requested a review from hsliuustc0106 as a code owner April 15, 2026 08:01

yenuo26 added the tts-test label (label to trigger buildkite tts models test in nightly CI) Apr 15, 2026

yenuo26 mentioned this pull request Apr 15, 2026

[CI] Add on-demand performance test trigger via PR comments #2506

Open

yenuo26 removed the tts-test label (label to trigger buildkite tts models test in nightly CI) Apr 15, 2026

yenuo26 force-pushed the pr-label branch from 3bc239e to af0b8b4 Compare April 15, 2026 09:56

yenuo26 force-pushed the pr-label branch from af0b8b4 to 91c7198 Compare April 15, 2026 10:00

Fishermanykx and others added 3 commits April 15, 2026 18:16

Merge branch 'pr-label' of https://github.com/yenuo26/vllm-omni into …

14c2f7f

…pr-label Signed-off-by: wangyu <410167048@qq.com>

Refactor benchmark configuration handling

8c47379

Signed-off-by: wangyu <410167048@qq.com>

yenuo26 added the tts-test label (label to trigger buildkite tts models test in nightly CI) Apr 15, 2026

yenuo26 changed the title ~~[CI] Update nightly pipeline configuration and remove deprecated testfiles~~ [CI][Perf] Add nightly PR labels, consolidate pipeline, and switch benchmark flag to --test-config-file Apr 15, 2026

yenuo26 added the omni-test label (label to trigger buildkite omni model test in nightly CI) and removed the tts-test label (label to trigger buildkite tts models test in nightly CI) Apr 15, 2026

Increase stage initialization timeout in benchmark script

263b7c5

Signed-off-by: wangyu <410167048@qq.com>

Update nightly pipeline configuration for Diffusion X2I performance test

4ed64f1

Signed-off-by: wangyu <410167048@qq.com>

Merge branch 'main' into pr-label

fd10142

fhfuih reviewed Apr 16, 2026


Update performance test configurations in test_qwen_omni.json

6aa696d

Signed-off-by: wangyu <410167048@qq.com>

Enhance performance testing documentation to include diffusion model …

c95af15

…configurations Signed-off-by: wangyu <410167048@qq.com>

hsliuustc0106 merged commit c83f664 into vllm-project:main Apr 16, 2026
8 checks passed

yenuo26 mentioned this pull request Apr 16, 2026

[RFC]: CI optimization and supplementary task tracking JiusiServe/vllm-omni#177

Open

19 tasks

david6666666 reviewed Apr 17, 2026


yenuo26 deleted the pr-label branch April 21, 2026 01:38

hsliuustc0106 merged 9 commits into vllm-project:main from yenuo26:pr-label


5 participants
