[CI] Update test markers and configurations to use 'full_model' for L4 nightly tests by yenuo26 · Pull Request #2641 · vllm-project/vllm-omni

yenuo26 · 2026-04-09T08:56:23Z

Purpose

Update test markers and configurations to use 'full_model' for L4 nightly tests

Changed test markers from 'advanced_model' to 'full_model' across various test files to align with the new testing structure.
Updated the 'pyproject.toml' to reflect the new marker definitions.
Adjusted Buildkite configurations to run full model tests in nightly pipelines.
Enhanced documentation to clarify the use of 'full_model' for nightly tests and 'advanced_model' for merge tests.

Test Plan

Run in CI.

Test Result

lishunyang12 · 2026-04-11T17:33:33Z

        commands:
          - export VLLM_WORKER_MULTIPROC_METHOD=spawn
-          - pytest -s -v tests/e2e/online_serving/test_*_expansion.py -k "not test_wan22_expansion and not test_wan_2_1_vace_expansion and not test_qwen_image" -m "advanced_model and diffusion and H100" --run-level "advanced_model"
+          - pytest -s -v tests/e2e/ -k "not test_wan22_expansion and not test_wan_2_1_vace_expansion and not test_qwen_image" -m "full_model and diffusion and H100" --run-level "full_model"


Broadening from tests/e2e/online_serving/test_*_expansion.py to tests/e2e/ now sweeps in tests/e2e/accuracy/test_gebench_h100_smoke.py, test_gedit_bench_h100_smoke.py, and tests/e2e/accuracy/wan22_i2v/* — they all match full_model and diffusion and H100 after this PR, and the -k filter only excludes test_wan22_expansion/test_wan_2_1_vace_expansion/test_qwen_image, not the accuracy files. Those have dedicated steps below and need --gebench-model/--gedit-model CLI args, so they'll either double-run or fail here. Please tighten the path back to tests/e2e/online_serving/ or extend the -k exclusion.
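
To illustrate why the -k expression does not protect the accuracy files, here is a rough sketch that approximates the exclusion as a case-insensitive substring check against each test's node id. This is a simplification of pytest's real keyword matching, not its implementation. The file paths for the accuracy tests come from the comment above; the online_serving file name and the function names after `::` are hypothetical placeholders.

```python
# Simplified model of: -k "not A and not B and not C".
# Assumption: a keyword "matches" when it is a substring of the node id.
EXCLUDED_KEYWORDS = (
    "test_wan22_expansion",
    "test_wan_2_1_vace_expansion",
    "test_qwen_image",
)


def survives_k_filter(node_id: str) -> bool:
    """Return True if none of the excluded keywords appear in the node id."""
    lowered = node_id.lower()
    return not any(keyword in lowered for keyword in EXCLUDED_KEYWORDS)


# Hypothetical node ids; only the accuracy file paths appear in the review.
NODE_IDS = [
    "tests/e2e/online_serving/test_wan22_expansion.py::test_case",
    "tests/e2e/accuracy/test_gebench_h100_smoke.py::test_case",
    "tests/e2e/accuracy/test_gedit_bench_h100_smoke.py::test_case",
]

# The accuracy files contain none of the excluded keywords, so they remain.
still_collected = [n for n in NODE_IDS if survives_k_filter(n)]
```

Under these assumptions, only the expansion test is dropped, which is why the comment asks for either a narrower path or a longer -k exclusion.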

lishunyang12

Review Summary

This PR cleanly separates the L3 (merge) and L4 (nightly) test tiers by introducing a new full_model marker for L4 nightly tests, keeping advanced_model exclusively for L3 merge tests. The change is well-structured and consistently applied.

What looks good

_is_deep_run_level() helper in conftest.py -- Good abstraction. Centralizing the run_level in ("advanced_model", "full_model") check avoids scattering the logic and makes future level additions trivial.
Marker registration in pyproject.toml -- Correctly adds full_model and updates the --run-level choices to include it.
Buildkite pipeline updates -- All nightly YAMLs consistently switch from advanced_model to full_model in both -m and --run-level flags.
Test path widening (e.g. tests/e2e/online_serving/test_*_expansion.py -> tests/e2e/) -- This is safe because the marker filter (-m "full_model and diffusion and H100") still constrains collection. It also picks up the accuracy tests that were migrated to full_model.
Documentation updates -- CI level docs, marker docs, and READMEs are all updated consistently.

Minor observations (non-blocking)

test_qwen3_tts_base_expansion.py has both @pytest.mark.full_model and @pytest.mark.core_model on the same tests. This appears intentional (tests that run at both PR and nightly levels), but it might be worth a brief comment in the file explaining why they carry dual markers, for future contributors.
Benchmark tests (run_benchmark.py, run_diffusion_benchmark.py) gained full_model + benchmark markers. Previously they had no level marker at all. This is a good addition that brings them into the marker system properly.
The GPU cleanup log message change ("GPU cleanup disabled" -> "\nPost-test GPU cleanup skipped...") is unrelated to the marker refactor. Not a problem, just noting it's bundled in.

LGTM. The separation between merge-level and nightly-level test markers is clear and consistently applied across all 36 changed files.

lishunyang12 · 2026-04-16T15:07:50Z

Solve conflict thanks.

yenuo26 · 2026-04-22T00:33:51Z

> Solve conflict thanks.

fixed

yenuo26 · 2026-04-22T06:14:28Z

@hsliuustc0106 @Gaohan123 @gcanlin This PR is ready. Could you please check if it can be merged?

gcanlin · 2026-04-22T06:28:21Z

From the semantics, does full_model mean nightly will run all models or run all tests of one model?

yenuo26 · 2026-04-22T06:33:45Z

> From the semantics, does full_model mean nightly will run all models or run all tests of one model?

By design, L4 runs the full set of test cases for all high-, medium-, and low-priority models, excluding the simple test cases already run by L2 and L3. So full_model refers to both running all models and running all tests.
core_model -> L1&L2
advanced_model -> L3
full_model -> L4
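
The mapping above can be written down as a small lookup table. This is only an illustrative sketch: MARKER_TO_LEVELS and marker_for_level are hypothetical names, not code from this PR.

```python
# Marker-to-CI-level mapping as stated in the comment above.
MARKER_TO_LEVELS = {
    "core_model": ("L1", "L2"),
    "advanced_model": ("L3",),
    "full_model": ("L4",),
}


def marker_for_level(level: str) -> str:
    """Return the pytest marker that corresponds to a CI level such as 'L4'."""
    for marker, levels in MARKER_TO_LEVELS.items():
        if level in levels:
            return marker
    raise ValueError(f"unknown CI level: {level!r}")
```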

yenuo26 requested a review from hsliuustc0106 as a code owner April 9, 2026 08:56

yenuo26 mentioned this pull request Apr 9, 2026

[RFC]: CI optimization and supplementary task tracking JiusiServe/vllm-omni#177

Open

12 tasks

yenuo26 added the nightly-test label (label to trigger buildkite nightly test CI) Apr 9, 2026

Merge branch 'main' into L4-mark

a03cc6b

lishunyang12 reviewed Apr 11, 2026

lishunyang12 approved these changes Apr 16, 2026

yenuo26 and others added 8 commits April 20, 2026 20:48

Merge remote-tracking branch 'upstream/main' into L4-mark

5be3035

Signed-off-by: wangyu <410167048@qq.com>

Merge branch 'L4-mark' of https://github.com/yenuo26/vllm-omni into L4-mark

8edb358

Refactor test files to remove unnecessary blank lines and ensure consistent pytestmark usage across various test modules. Signed-off-by: wangyu <410167048@qq.com>

69799ba

remove blank

f394f01

Signed-off-by: wangyu <410167048@qq.com>

Merge branch 'main' into L4-mark

1b31cdf

Update test-nightly.yml to use broader test directory and ignore accuracy tests; enhance run_args.py to include 'full_model' in run-level options. Signed-off-by: wangyu <410167048@qq.com>

1e9ea0f

Remove redundant pytest.mark.core_model decorators from test_qwen3_tts_base_expansion.py to streamline test definitions. Signed-off-by: wangyu <410167048@qq.com>

999e849

Remove Audio Generation Model Test step from test-ready.yml to streamline CI pipeline. Signed-off-by: wangyu <410167048@qq.com>

9a867a7

yenuo26 removed the nightly-test label (label to trigger buildkite nightly test CI) Apr 21, 2026

yenuo26 and others added 2 commits April 21, 2026 15:24

Refactor audio handling in Dynin-Omni tests: replace T2S with T2A prompts, update request configurations, and streamline audio transcription process. Adjust pytestmark for diffusion tests. Signed-off-by: wangyu <410167048@qq.com>

cf6d5af

Merge branch 'main' into L4-mark

57e2f3b

yenuo26 force-pushed the L4-mark branch from d1b346d to 1309e08 April 21, 2026 09:20

Refactor Dynin-Omni test prompts and request handling: update T2A to T2S prompts, adjust pytestmark for omni tests, and enhance audio validation logic in assertions. Signed-off-by: wangyu <410167048@qq.com>

58014ee

yenuo26 force-pushed the L4-mark branch from 1309e08 to 58014ee April 21, 2026 10:04

yenuo26 added the nightly-test label (label to trigger buildkite nightly test CI) Apr 21, 2026

Refactor audio response validation in assertions: replace similarity attribute with direct cosine similarity calculation, ensuring more accurate audio-text comparison. Clean up unused similarity variable in runtime handling. Signed-off-by: wangyu <410167048@qq.com>

d8075cb

yenuo26 added the ready label (label to trigger buildkite CI) and removed the nightly-test label (label to trigger buildkite nightly test CI) Apr 22, 2026

Merge branch 'main' into L4-mark

70c8617

yenuo26 mentioned this pull request Apr 22, 2026

[Tests]Add accuracy benchmark L4 test cases #2843

Merged

5 tasks

hsliuustc0106 merged commit e18cb89 into vllm-project:main Apr 22, 2026
8 checks passed

qinganrice pushed a commit to qinganrice/vllm-omni that referenced this pull request Apr 23, 2026

[CI] Update test markers and configurations to use 'full_model' for L4 nightly tests (vllm-project#2641) Signed-off-by: wangyu <410167048@qq.com>

f1af3af

lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026

[CI] Update test markers and configurations to use 'full_model' for L4 nightly tests (vllm-project#2641) Signed-off-by: wangyu <410167048@qq.com>

1127747

yenuo26 deleted the L4-mark branch May 9, 2026 01:07

clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026

[CI] Update test markers and configurations to use 'full_model' for L4 nightly tests (vllm-project#2641) Signed-off-by: wangyu <410167048@qq.com>

ae5eb4a

hsliuustc0106 merged 15 commits into vllm-project:main from yenuo26:L4-mark

4 participants
