[CI] Restructure vLLM-Omni Test Layout, Fixture Scope, and Support Modules #2620

Merged: hsliuustc0106 merged 29 commits into vllm-project:main from yenuo26:conftest on Apr 20, 2026
@yenuo26 (Collaborator) commented Apr 9, 2026

Purpose

Motivated by RFC #2299.
Background and Goals
The original root-level tests/conftest.py centralized a large number of fixtures, assertions, media helpers, and runtime logic, which could easily lead to:

  • Premature loading of heavy dependencies (vLLM / vllm_omni) during the pytest plugin-loading phase, which conflicts with session-level autouse fixtures (such as those that set environment variables).
  • Difficulty reusing test helper code and splitting responsibilities.
  • Capabilities such as hardware_test living in tests/utils.py, which blurs the boundary with test-only code.
  • Media being regenerated on every run, which costs a lot of time.
  • Buildkite logs in which the test run output and the final result summary are not separated by a foldable group, which makes them hard to read.

This PR aims to thin the entry point, move implementations outward, modularize fixtures as plugins, and clarify import path instructions in the documentation.

Overview of Main Changes

| Category | Description |
| --- | --- |
| Root conftest.py | 1) Only responsible for: pytest_plugins registration, backward-compatible re-exports from tests.helpers.*, and lazy loading of runtime symbols via __getattr__ to avoid immediately importing tests.helpers.runtime when conftest loads. 2) Add '--- Running Summary' before the pytest summary to create a foldable group in Buildkite. |
| tests/helpers/ | 1) Reusable helper implementations: assertions, env, media, mark, process, runtime, stage_config, etc.; __init__.py deliberately avoids star-import aggregation to prevent altering import order. 2) Refactor media helper functions to support caching of synthetic audio, video, and image generation. |
| tests/helpers/fixtures/ | Loaded via pytest_plugins: env, log, run_args, runtime; runtime fixtures then import OmniServer / OmniRunner internally, delaying heavy dependency initialization. |
| tests/utils.py | hardware_test / hardware_marks forwarded to tests.helpers.mark |
| Subpackage conftest | Local conftest files retained as needed in tests/e2e/accuracy, tests/examples, etc., separating responsibilities from the root directory. |
| Numerous tests and scripts | Unified import adjustments (approx. 106 files) to align calling paths with the above structure. |
| CI documentation | Examples and instructions under docs/contributing/ci/ updated to reference tests.helpers.mark, etc., consistent with the implementation. |
| assertions / media | Enhanced error handling and logging (corresponding commit: Enhance error handling and logging in assertion and media helper functions). |
| Pre-commit | tools/pre_commit/check_pickle_imports.py and others aligned with the new module layout. |

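A minimal sketch of the lazy re-export idea described for the root conftest.py, using a module-level __getattr__ (PEP 562). It is not the PR's actual conftest code: the export table, the helper name make_lazy_getattr, and the thin_conftest stand-in module are hypothetical, and the stdlib fractions module stands in for tests.helpers.runtime so the snippet runs outside the repository.

```python
import importlib
import types

# Hypothetical export table: public name -> module that defines it.
# In the PR the target would be tests.helpers.runtime; a stdlib module
# stands in here so the sketch runs anywhere.
_LAZY_EXPORTS = {"Fraction": "fractions"}


def make_lazy_getattr(owner: str, exports: dict):
    """Build a module-level __getattr__ (PEP 562) that imports on first use."""

    def __getattr__(name: str):
        target = exports.get(name)
        if target is None:
            raise AttributeError(f"module {owner!r} has no attribute {name!r}")
        # The import happens here, on first attribute access, not at load time.
        return getattr(importlib.import_module(target), name)

    return __getattr__


# Stand-in for a thin conftest module (assumption: a real conftest would
# simply define `def __getattr__(name): ...` at its top level).
thin_conftest = types.ModuleType("thin_conftest")
thin_conftest.__getattr__ = make_lazy_getattr("thin_conftest", _LAZY_EXPORTS)

# Accessing the attribute triggers the deferred import.
half = thin_conftest.Fraction(1, 2)
```

Because Python consults __getattr__ only after normal attribute lookup fails, names that are imported eagerly keep working unchanged, and only the names in the export table pay the import cost on first access.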
tests/
├── conftest.py                    # Thin entry: pytest_plugins + backward-compat re-exports + lazy runtime imports
├── helpers/                       # Shared importable helpers package (not tests/helpers.py at repo root)
│   ├── __init__.py
│   ├── assertions.py              # assert_* helpers (split from legacy monolithic conftest)
│   ├── env.py                     # env vars, GPU cleanup, device helpers
│   ├── mark.py                    # hardware_test / hardware_marks (replaces deleted tests/utils.py)
│   ├── media.py
│   ├── process.py
│   ├── runtime.py                 # OmniRunner / OmniServer / clients (heavy imports)
│   ├── stage_config.py
│   └── fixtures/
│       ├── __init__.py
│       ├── env.py                 # default_env, GPU cleanup autouse, session fixtures
│       ├── log.py
│       ├── run_args.py
│       └── runtime.py             # omni_server, etc.; imports runtime inside fixtures to preserve init order
├── e2e/
│   └── accuracy/
│       ├── conftest.py            # Local pytest hooks / fixtures for accuracy tests
│       └── helpers.py             # Helpers paired with this package’s conftest
├── examples/
│   ├── conftest.py
│   └── helpers.py                 # Helpers paired with example tests
├── dfx/
│   ├── helpers.py                 # Shared helpers for dfx scripts / stability & perf
│   ├── stability/
│   │   └── conftest.py            # Local conftest; uses helpers from ../helpers.py
│   └── perf/
│       └── scripts/               # Benchmark scripts (no conftest here)
├── comfyui/
│   └── conftest.py                # Local conftest only (no comfyui/helpers.py in tree)
├── diffusion/
│   ├── …
│   └── lora/
│       └── helpers.py             # LoRA test utilities (no lora/conftest.py)
├── engine/
├── model_executor/
└── …                              # Test modules; imports updated from legacy conftest/tests.utils to tests.helpers.*
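
The media caching mentioned in the overview could look roughly like the following. This is a sketch under stated assumptions, not the code in tests/helpers/media.py: the function name cached_sine_wav, its parameters, and the hash-based file naming are hypothetical. Only the ideas of a cache directory and a force-regeneration switch come from this PR.

```python
import hashlib
import math
import struct
import tempfile
import wave
from pathlib import Path


def cached_sine_wav(cache_dir, duration_s=1.0, sample_rate=16000, freq_hz=440.0, force=False):
    """Return the path of a synthetic WAV, generating it only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the file on every parameter that affects its content.
    key = hashlib.sha256(f"{duration_s}:{sample_rate}:{freq_hz}".encode()).hexdigest()[:16]
    path = cache_dir / f"sine_{key}.wav"
    if path.exists() and not force:
        return path  # cache hit: nothing is regenerated
    n_samples = int(duration_s * sample_rate)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_samples)
    )
    # Write under a temporary name first so an interrupted run never leaves
    # a partial file at the final path.
    tmp = path.with_suffix(".tmp")
    with wave.open(str(tmp), "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    tmp.replace(path)
    return path


cache = Path(tempfile.mkdtemp())
first = cached_sine_wav(cache, duration_s=0.5)
second = cached_sine_wav(cache, duration_s=0.5)  # same parameters: reuses the file
```

Keying the file name on the generation parameters means a test that asks for different media gets a different file, while repeated requests within or across runs reuse the existing one unless force=True is passed.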

Impact on Contributors (Migration Notes)
New code: Prefer from tests.helpers.mark import hardware_test, hardware_marks, and use tests.helpers.env for GPU and environment utilities. The re-exports in tests.conftest exist only for transitional compatibility.
OmniRunner / OmniServer, etc.: These can still be loaded lazily from tests.conftest, but the long-term recommendation is to switch to from tests.helpers.runtime import ....

Test Plan

  1. Run the ready-label CI jobs locally.
  2. Run the merge-test CI jobs locally.
  3. Run the suite in CI.

Test Result

1. ready jobs, local run

| Job name | Result |
| --- | --- |
| Simple Unit Test | Passed |
| Voxtral TTS CUDA Unit Test | Passed |
| Diffusion Model Test | Passed |
| Diffusion Batching Test | Passed |
| Custom Pipeline Test | Passed |
| Diffusion Model CPU offloading Test | Passed |
| Audio Generation Model Test | Passed |
| Diffusion Cache Backend Test | Passed |
| Diffusion Sequence Parallelism Test | Passed |
| Diffusion GPU Worker Test | Passed |
| Engine Test | Passed |
| Omni Model Test | Passed |
| Omni Model Test with H100 | Passed |
| MiMo-Audio E2E Test with H100 | Passed |
| Qwen3-TTS E2E Test | Passed |
| OmniVoice E2E Test | Passed |
| Voxtral-TTS E2E Test | Passed |
| Bagel Text2Img Model Test with H100 | Passed |
| Bagel Img2Img Model Test with H100 | Passed |
| Bagel Online Serving Test with H100 | Passed |
| CosyVoice3-TTS E2E Test | Passed |

2. merge jobs, local run

| Job name | Result |
| --- | --- |
| Simple Unit Test | Passed |
| Diffusion Model Test | Passed |
| Diffusion Images API LoRA E2E | Passed |
| Diffusion Model CPU offloading Test | Passed |
| Audio Generation Model Test | Passed |
| Diffusion Cache Backend Test | Passed |
| Diffusion Sequence Parallelism Test | Passed |
| Diffusion Tensor Parallelism Test | Passed |
| Diffusion GPU Worker Test | Passed |
| Engine Test | Passed |
| Omni Model Test | Passed |
| Qwen3-TTS CustomVoice E2E Test | Passed |
| Qwen3-TTS Base E2E Test | Passed |
| Omni Model Test with H100 | Passed |
| Diffusion Image Edit Test with H100 (1 GPU) | Passed |
| Bagel Model Test with H100 (Real Weights) | Passed |
| Voxtral-TTS E2E Test | Passed |

3. CI run

Success in merge (attachment: 488602a0-826d-4748-bec5-43012c48a9e2)

Summary log (attachment: b788fb01-b026-4441-b19d-712c5b025b4b)



yenuo26 added 2 commits April 9, 2026 10:31
Signed-off-by: wangyu <410167048@qq.com>
…tions

Signed-off-by: wangyu <410167048@qq.com>
@yenuo26 yenuo26 requested a review from hsliuustc0106 as a code owner April 9, 2026 02:57
@yenuo26 changed the title from "Conftest" to "[CI] Restructure vLLM-Omni Test Layout, Fixture Scope, and Support Modules" Apr 9, 2026
yenuo26 and others added 2 commits April 9, 2026 11:01
…, video, and image generation. Introduce parameters for cache directory and force regeneration, enhancing performance and usability. Remove deprecated save_to_file logic and improve error handling for media processing.

Signed-off-by: wangyu <410167048@qq.com>
@hsliuustc0106 (Collaborator) commented

cc @Gaohan123 @tzhouam @princepride @lishunyang12 @linyueqian @ZeldaHuang @wtomin @SamitHuang PTAL at the test files you own.

Comment thread tests/helpers/fixtures/env.py Outdated
import pytest
import torch

from tests.helpers.env import _run_post_test_cleanup, _run_pre_test_cleanup
Collaborator

tests.helpers.env imports vllm.platforms and vllm_omni.platforms at module level, so loading this plugin at conftest time still pulls them in before default_env runs. Tbh this partially defeats the RFC #2299 goal — consider moving these imports inside clean_gpu_memory_between_tests so helpers.env only loads after session fixtures.

Collaborator Author

fixed

Comment thread tests/helpers/env.py Outdated
print("=" * 80)


def _run_pre_test_cleanup(enable_force: bool = False) -> None:
Collaborator

These are in __all__ and imported from fixtures/env.py and helpers/runtime.py, so they're effectively public. Drop the leading underscore?

Collaborator Author

fixed

yenuo26 added 2 commits April 13, 2026 19:34
…orts of platform-specific modules until needed to ensure proper execution order of fixtures. Introduce a new function for forced GPU cleanup to streamline cleanup processes across different classes. Enhance memory monitoring logic for better clarity and performance during tests.

Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: wangyu <410167048@qq.com>
@yenuo26 added the merge-test label Apr 13, 2026
…ate test markers for better categorization in zimage_parallelism tests. Enhance GPU memory cleanup messages for improved debugging during test execution.

Signed-off-by: wangyu <410167048@qq.com>
@yenuo26 added the ready and nightly-test labels and removed the merge-test and ready labels Apr 14, 2026
yenuo26 added 3 commits April 15, 2026 20:39
…e count discrepancies in MP4 validation. Update docstring for clarity on expected behavior and adjust frame count assertion logic.

Signed-off-by: wangyu <410167048@qq.com>
… helpers from the runtime module. This change improves code organization and maintainability by consolidating cleanup functions under a common namespace.

Signed-off-by: wangyu <410167048@qq.com>
…hem to a new helpers module. This change improves code organization and reusability, while also adding functionality to compute and assert SSIM and PSNR metrics for model outputs. The previous utility functions have been removed from the utils module to streamline the codebase.

Signed-off-by: wangyu <410167048@qq.com>
if params.use_omni and params.stage_init_timeout is not None:
server_args = [*server_args, "--stage-init-timeout", str(params.stage_init_timeout)]
else:
server_args = [*server_args, "--stage-init-timeout", "600"]
Collaborator

This changes the non-omni path too: when use_omni=False, vllm_omni.entrypoints.cli.main forwards straight to upstream vllm_main() unless --omni is present, so --stage-init-timeout / --init-timeout are not recognized there. We still have use_omni=False coverage in tests/e2e/accuracy/conftest.py, so this will make those server launches fail. Can we keep these timeout args gated behind params.use_omni, like the old fixture did?
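
A sketch of the gating requested above, under stated assumptions: the helper name with_stage_init_timeout is hypothetical, and the 600-second fallback for the omni path is taken from the hunk above rather than from the legacy fixture, which may have behaved differently. The non-omni path is returned untouched so that no unrecognized flags are passed along.

```python
def with_stage_init_timeout(server_args, use_omni, stage_init_timeout=None, default_timeout=600):
    """Append --stage-init-timeout only when the omni entrypoint is in use."""
    args = list(server_args)  # copy so the caller's list is never mutated
    if not use_omni:
        # Non-omni launches do not receive the omni-only flag.
        return args
    timeout = default_timeout if stage_init_timeout is None else stage_init_timeout
    return [*args, "--stage-init-timeout", str(timeout)]


omni_args = with_stage_init_timeout(["--port", "8000"], use_omni=True, stage_init_timeout=120)
plain_args = with_stage_init_timeout(["--port", "8000"], use_omni=False, stage_init_timeout=120)
```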

Collaborator Author

fixed

…ompatibility with non-omni paths. Timeout flags are now gated behind the use_omni parameter, aligning with legacy behavior and improving code clarity.

Signed-off-by: wangyu <410167048@qq.com>
@Gaohan123 Gaohan123 added this to the v0.20.0 milestone Apr 16, 2026
yenuo26 added 2 commits April 16, 2026 16:17
…ding

Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: wangyu <410167048@qq.com>
@yenuo26 added the omni-test label and removed the nightly-test label Apr 16, 2026
…ameter mapping functions

Signed-off-by: wangyu <410167048@qq.com>
@hsliuustc0106 (Collaborator) left a comment

Solid refactor overall. The modular split and media caching are clear improvements. A few issues to address.

Comment thread tests/conftest.py
# Marker for Buildkite log folding before pytest summary lines.
terminalreporter.write_sep("-", "Result Summary")


Collaborator

This eager import of tests.helpers.assertions pulls in transformers.pipeline at conftest load time (see assertions.py:12). The PR description says the root conftest should avoid "premature loading of heavy dependencies" — this defeats that goal. Either defer these re-exports via __getattr__ (like the runtime exports below), or move the transformers.pipeline import inside _load_gender_pipeline() in assertions.py.

Collaborator Author

fixed


import numpy as np
import soundfile as sf
from PIL import Image
Collaborator

from transformers import pipeline at module level means any import of tests.helpers.assertions (including from tests.conftest) triggers a heavy transformers load. Move this into _load_gender_pipeline() — it's only used there and already guarded by a lazy singleton pattern.

Collaborator Author

fixed

Comment thread pyproject.toml
"H100: Tests that require H100 GPU",
"L4: Tests that require L4 GPU",
"B60: Tests that require B60",
"MI325: Tests that require MI325 GPU (AMD/ROCm)",
Collaborator

Duplicate B60 marker entry. Line 197 adds "B60: Tests that require B60" (no description) and line 199 already has "B60: Tests that require Intel Arc Pro B60 XPU". pytest may silently accept duplicates, but this is confusing. Remove the one added here.
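
A duplicate like this can be caught mechanically. The following is a hypothetical helper, not part of the PR or of pytest: it takes marker strings in the "NAME: description" form used in pyproject.toml and reports names that appear more than once. The sample list reuses the entries quoted in this thread; their order in the real file is an assumption.

```python
from collections import Counter


def duplicate_marker_names(markers):
    """Return the sorted marker names that are declared more than once."""
    # Everything before the first colon is the marker name.
    names = [entry.split(":", 1)[0].strip() for entry in markers]
    return sorted(name for name, count in Counter(names).items() if count > 1)


markers = [
    "H100: Tests that require H100 GPU",
    "L4: Tests that require L4 GPU",
    "B60: Tests that require B60",
    "MI325: Tests that require MI325 GPU (AMD/ROCm)",
    "B60: Tests that require Intel Arc Pro B60 XPU",
]
duplicates = duplicate_marker_names(markers)
```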

Collaborator Author

fixed

env_dict=params.env_dict,
use_omni=params.use_omni,
)
if port
Collaborator

The with OmniServer(...) if port else OmniServer(...) pattern duplicates the entire constructor call just to optionally pass port. Simplify to:

kwargs = dict(model=model, server_args=server_args, env_dict=params.env_dict, use_omni=params.use_omni)
if port:
    kwargs["port"] = port
with OmniServer(**kwargs) as server:

Collaborator Author

fixed

@yenuo26 added the ready, nightly-test, and merge-test labels and removed the omni-test label Apr 17, 2026
yenuo26 and others added 7 commits April 17, 2026 11:18
Signed-off-by: wangyu <410167048@qq.com>
… advanced model

Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: wangyu <410167048@qq.com>
- Removed unused export `dummy_messages_from_mix_data` from `_STAGE_CONFIG_EXPORT_NAMES`.
- Added `dummy_messages_from_mix_data` to the imports in multiple test files for consistency.
- Adjusted the `_REPO_ROOT` path comment in `stage_config.py` for clarity.

Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: wangyu <410167048@qq.com>
@hsliuustc0106 hsliuustc0106 merged commit 8a9add1 into vllm-project:main Apr 20, 2026
7 of 9 checks passed
@yenuo26 yenuo26 deleted the conftest branch April 21, 2026 01:38
qinganrice pushed a commit to qinganrice/vllm-omni that referenced this pull request Apr 23, 2026
Labels: merge-test, nightly-test, ready

5 participants