
[Docs][CI] doc update & L4 example test for text-to-image page #1910

Merged: wtomin merged 2 commits into vllm-project:main from fhfuih:test-example on Mar 23, 2026

Conversation

@fhfuih
Contributor

@fhfuih fhfuih commented Mar 16, 2026

Purpose

As per #1244, the example documentation needs to be updated.

Following the recently established multi-level testing system, this PR also adds L4 tests for all example code snippets that appear on the "Online Serving/Text to Image" and "Offline Inference/Text to Image" doc pages. It can also serve as a template for future contributions of other example tests.

Test Plan

Methods

Per an offline discussion with @yenuo26 and @wtomin, the offline version dynamically extracts all Python and Bash code blocks from the markdown file and runs them directly.

This approach is currently not used in the online version, because the test would additionally have to tell server-launching scripts apart from client request-sending scripts, which would make the code much messier. Hence, the online version copies the code from the code blocks. (This is more flexible for complex setups, but future doc changes must be synced manually with the test script.)
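As a rough illustration of the dynamic-extraction idea described above, the sketch below pulls fenced Python/Bash blocks out of a markdown string together with the nearest preceding heading. It is a simplified stand-in, not this PR's implementation: the PR parses the README with mistune, whereas this sketch uses a plain line scanner, and the name `extract_snippets` is hypothetical.

```python
FENCE = "`" * 3  # built programmatically so this example can itself live in a fence


def extract_snippets(markdown, languages=("python", "bash")):
    """Return (heading, language, code) for each fenced block in `languages`.

    Hypothetical helper for illustration only; it ignores nested or
    indented fences that a real markdown parser would handle.
    """
    snippets = []
    heading = ""   # nearest heading seen so far
    lang = None    # None while outside a fenced block
    body = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if lang is None:
            if stripped.startswith(FENCE):
                # Opening fence: remember the language tag (may be empty).
                lang = stripped[len(FENCE):].strip().lower()
                body = []
            elif stripped.startswith("#"):
                heading = stripped.lstrip("#").strip()
        elif stripped == FENCE:
            # Closing fence: keep the block only if its language is wanted.
            if lang in languages:
                snippets.append((heading, lang, "\n".join(body)))
            lang = None
        else:
            body.append(line)
    return snippets


sample = "\n".join([
    "# Basic Usage",
    FENCE + "python",
    "print('hi')",
    FENCE,
    FENCE + "bash",
    "# a shell comment, not a heading",
    "echo hi",
    FENCE,
    FENCE + "json",
    "{}",
    FENCE,
])
for found in extract_snippets(sample):
    print(found)
```

Each extracted snippet could then be written to a file and run in a subprocess. Telling server-launching blocks apart from client blocks is exactly what this simple scan cannot do, which matches the reason given above for hand-copying the online snippets.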

Runtime config

For now, we do not lower num_inference_steps in the examples, so a test script may run 50 inference steps. This reproduces the real-world cost of running these examples.

Test naming rule

NOTE (open to discussion): Because the online and offline tests take different approaches (described above), their test cases follow different naming conventions:

  • The offline tests extract test scripts dynamically, which only pays off in combination with pytest parametrization. The name therefore has the form test_{name of the single test function}[distinguisher in the parametrization ID].
  • The online tests are hand-written and copied from the corresponding code blocks, so each one is a separate test function. The name therefore has the form test_{distinguisher in the function name}[dummy parametrization ID, present only for omni_server dependency injection].
> pytest tests/examples/offline_inference/test_text_to_image.py tests/examples/online_serving/test_text_to_image.py --collect-only

<Dir vllm-omni>
  <Package tests>
    <Dir examples>
      <Package offline_inference>
        <Module test_text_to_image.py>
          <Function test_text_to_image[basic_usage_001]>
          <Function test_text_to_image[basic_usage_002]>
          <Function test_text_to_image[basic_usage_003]>
          <Function test_text_to_image[local_cli_usage_001]>
          <Function test_text_to_image[local_cli_usage_002]>
          <Function test_text_to_image[local_cli_usage_003]>
          <Function test_text_to_image[lora_001]>
          <Function test_text_to_image[web_ui_demo_001]>
      <Package online_serving>
        <Module test_text_to_image.py>
          <Function test_api_calls_001[omni_server0]>
          <Function test_api_calls_002[omni_server0]>
          <Function test_lora_001[omni_server0]>
          <Function test_api_calls_003>
          <Function test_lora_002>
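
The offline IDs in the listing above look like a section-heading slug plus a three-digit counter. The sketch below shows one way such IDs could be generated. This is an assumption inferred from the collected names, not the PR's actual code, and `slugify` and `make_case_ids` are hypothetical names.

```python
import re
from collections import Counter


def slugify(heading):
    """Lower-case a heading and collapse non-alphanumeric runs into '_'."""
    return re.sub(r"[^a-z0-9]+", "_", heading.lower()).strip("_")


def make_case_ids(headings):
    """Give each snippet an ID of the form <heading_slug>_<NNN>.

    The counter restarts for every distinct heading, so several snippets
    under the same section get consecutive numbers.
    """
    seen = Counter()
    ids = []
    for heading in headings:
        slug = slugify(heading)
        seen[slug] += 1
        ids.append(f"{slug}_{seen[slug]:03d}")
    return ids


# One entry per extracted snippet, in document order.
headings = ["Basic Usage", "Basic Usage", "Local CLI Usage", "LoRA", "Web UI Demo"]
print(make_case_ids(headings))
```

A list like this could be passed as the `ids` argument of `pytest.mark.parametrize`, which is what puts the distinguisher inside the square brackets of the collected test name.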

Exclusion Rule

  • Gradio scripts are excluded from this test
    • Example: test_api_calls_003
  • tests that largely overlap other existing tests may be excluded
    • Example: test_lora_002

Output folder structure

Three-layer folder structure:

  • root output dir: optionally set an OUTPUT_DIR env variable, otherwise pytest auto-creates one under /tmp
  • doc page dir: manually set a global variable in each test_XXX.py file, can include opinionated abbreviations. Example: example_offline_t2i
  • test case dir: should match the test case name, e.g., basic_usage_001. This happens automatically when code is dynamically extracted from markdown; when code is copied manually, the directory must be named by hand.
  • relevant files produced by the script, such as output.png. (The dynamically extracted python scripts are also saved here)

Example file output:

├── example_offline_t2i
│   ├── basic_usage_001
│   │   ├── coffee.png
│   │   └── snippet.py
│   ├── basic_usage_002
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   ├── 2.jpg
│   │   └── snippet.py
│   ├── basic_usage_003
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   └── snippet.py
│   └── local_cli_usage_001
│       └── outputs
│           └── coffee.png
└── example_online_t2i
    ├── api_calls_001
    │   └── api_calls_001.png
    ├── api_calls_002
    │   └── api_calls_002.png
    └── lora_001
        └── lora_001.png
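
A minimal sketch of how the layered layout above could be resolved in code. It assumes an `OUTPUT_DIR` environment variable as described, and falls back to `tempfile.mkdtemp` where the PR relies on pytest's own temporary directory. The function name `resolve_case_dir` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def resolve_case_dir(page_dir, case_name, env=os.environ):
    """Return <root>/<page_dir>/<case_name>, creating it if needed.

    root is OUTPUT_DIR when set; otherwise a fresh temporary directory
    (a stand-in for the directory pytest would create).
    """
    root = env.get("OUTPUT_DIR")
    if not root:
        root = tempfile.mkdtemp(prefix="example_tests_")
    case_dir = Path(root) / page_dir / case_name
    case_dir.mkdir(parents=True, exist_ok=True)
    return case_dir


# Example: the directory that would hold coffee.png and snippet.py.
base = tempfile.mkdtemp()
case_dir = resolve_case_dir(
    "example_offline_t2i", "basic_usage_001", env={"OUTPUT_DIR": base}
)
print(case_dir.relative_to(base))
```

Passing the environment as a parameter keeps the helper easy to test without mutating the real process environment.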

Test Result

  • Passed on my side
  • See the bottom comment for CI results

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

@fhfuih fhfuih requested a review from hsliuustc0106 as a code owner March 16, 2026 06:48
Copilot AI review requested due to automatic review settings March 16, 2026 06:48
Contributor

Copilot AI left a comment


Pull request overview

Adds L4 “examples” tests for the text-to-image documentation (online serving + offline inference), and introduces shared helpers/fixtures to extract and execute README code blocks.

Changes:

  • Add online serving text-to-image example tests and offline inference tests driven by README snippet extraction.
  • Introduce tests/examples/conftest.py with shared output-dir fixture, README snippet extraction, and subprocess helpers.
  • Refactor existing online serving Qwen Omni example tests to reuse shared helpers; update docs/snippets accordingly.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/examples/online_serving/test_text_to_image.py | New online-serving text-to-image example tests (curl + Python client + LoRA). |
| tests/examples/offline_inference/test_text_to_image.py | New offline text-to-image example tests that execute extracted README snippets and validate outputs. |
| tests/examples/conftest.py | New shared infra: output dir fixture, README snippet parsing, runner, subprocess + parsing helpers. |
| tests/examples/online_serving/test_qwen3_omni.py | Remove local helper duplication; import shared helpers; align module markers/docstring. |
| tests/examples/online_serving/test_qwen2_5_omni.py | Same refactor as qwen3_omni. |
| pyproject.toml | Add mistune dependency for README AST parsing in example tests. |
| examples/offline_inference/text_to_image/README.md | Fix example code (don’t assign .save() result) and replace non-ASCII comma characters. |
| docs/contributing/ci/tests_style.md | Document examples→tests mapping in the test style guide. |


Comment thread tests/examples/online_serving/test_text_to_image.py
Comment thread tests/examples/online_serving/test_text_to_image.py Outdated
Comment thread tests/examples/conftest.py Outdated
Comment thread tests/examples/conftest.py
Comment thread tests/examples/conftest.py Outdated
Comment thread tests/examples/conftest.py Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc23c952e7


Comment thread tests/examples/online_serving/test_qwen2_5_omni.py
@wtomin
Collaborator

wtomin commented Mar 16, 2026

How can we trigger this L4 text-to-image example test temporarily on the CI machine? I think an H100 machine can accommodate the VRAM requirements of the majority of models.

@fhfuih fhfuih changed the title from "L4 test for text-to-image doc example code (online+offline)" to "[Docs][CI] doc update & L4 example test for text-to-image page" on Mar 16, 2026
Comment thread examples/offline_inference/text_to_image/README.md
@fhfuih
Contributor Author

fhfuih commented Mar 16, 2026

> How can we trigger this L4 text-to-image example test temporarily on the CI machine? I think an H100 machine can accommodate the VRAM requirements of the majority of models.

I just edited the pipeline yamls to temporarily do so. Could you help add a ready tag to trigger CI?

@wtomin wtomin added the "ready" label (label to trigger buildkite CI) on Mar 16, 2026
@wtomin
Collaborator

wtomin commented Mar 16, 2026

When it is ready to merge (which I suppose should be after 3/17), please let me know. @fhfuih

Use one of the following patterns depending on page type:

- **Dynamic code-block extraction (preferred for offline docs)**
  - Extract Python/Bash code blocks from the markdown with an AST parser, then execute them directly in tests. See [https://github.com/vllm-project/vllm-omni/blob/main/tests/examples/offline_inference/test_text_to_image.py](https://github.com/vllm-project/vllm-omni/blob/main/tests/examples/offline_inference/test_text_to_image.py) for a reference implementation.
Collaborator


Perhaps we can also briefly explain here how to write such test cases and which parameters ExampleRunner supports.

Contributor Author


Done. Please check again

@hsliuustc0106
Collaborator

conflicts

Contributor Author


This file explicitly excludes the newly added .inc.md file.

Previously, without this file, all *.md files in this folder were implicitly added to the docs. Now that the L4 tests come with additional guides, I put those guides in separate sub-files to avoid cluttering CI_5levels.md. This is the same design pattern as docs/getting_started/installation/...

Contributor Author


Apart from adding doc test guides to the L4 test documentation, I also changed the previous <details><summary>... fold block to the MkDocs-native ???+ example ... fold block syntax. Everything inside the former block is expected to be HTML, not markdown, so all markdown formatting was lost. The original content of this block is otherwise unchanged; it is only indented.

@fhfuih
Contributor Author

fhfuih commented Mar 19, 2026

All tests passed, with one exception: FLUX.2-dev fails on the CI machine, reporting a missing model_index.json. This may be because FLUX requires an additional user agreement and Hugging Face blocks the resource loading, although no timeout- or auth-related error is reported.

After skipping this case and adding a TODO note in the test script, all other tests work fine.

Please see the updated comment below

@hsliuustc0106 @wtomin @yenuo26 @congw729 PTAL. I also added self-review comments (annotations) on some of my file changes above, for clarification.

Comment thread .buildkite/test-nightly.yml
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment


Review Summary

This PR establishes a solid foundation for L4 documentation example testing with a well-designed approach:

  • Dynamic extraction from markdown for offline tests keeps docs and tests synchronized
  • Copied scripts for online tests handle server/client complexity appropriately
  • Comprehensive documentation for future contributors in doc_example_tests.inc.md

✅ Validated

  • Gate checks passing (DCO, pre-commit, mergeable, build, CI)
  • Test infrastructure design follows good patterns
  • Documentation updates are appropriate and complete
  • PR description includes clear test plan and naming conventions
  • Skip conditions for known issues (FLUX.2-dev, Web UI Demo) are reasonable

📝 Minor (non-blocking)

  • PR description checklist has docs unchecked but docs are updated - minor inconsistency
  • write_zimage_lora duplication noted as TODO in code

Good work on establishing this testing pattern for the project!

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
@fhfuih
Contributor Author

fhfuih commented Mar 19, 2026

After the recent rebase, the HF token issue is fixed and the FLUX model now runs as well. All tests pass in this CI run: https://buildkite.com/vllm/vllm-omni/builds/4472/steps/canvas?sid=019d056d-120a-4acb-9924-3457e40176e5

The temporary modifications to CI pipelines have been reverted

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
@fhfuih
Contributor Author

fhfuih commented Mar 23, 2026

Fixed conflict in doc. Not affecting previous test results

@wtomin wtomin merged commit 8a58394 into vllm-project:main Mar 23, 2026
7 of 8 checks passed
hongyi-zhang pushed a commit to hongyi-zhang/vllm-omni that referenced this pull request Mar 23, 2026
…project#1910)

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: hongyi.zhang <hongyi.zhang@bytedance.com>
hongyi-zhang pushed a commit to hongyi-zhang/vllm-omni that referenced this pull request Mar 23, 2026
…project#1910)

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
@fhfuih fhfuih deleted the test-example branch April 28, 2026 02:37
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026
…project#1910)

Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>

Labels

ready label to trigger buildkite CI


5 participants