[Config Refactor 3a/N] Image diffusion pipeline configs#2987

Closed
lishunyang12 wants to merge 6 commits into vllm-project:main from lishunyang12:config-refactor-3a-image-diffusion

Conversation


@lishunyang12 lishunyang12 commented Apr 21, 2026

Summary

Continuation of RFC #2072 and follow-up to #2383 (2/N) and #2915 (2.5/N).

Migrates 16 single-stage image diffusion pipelines into the new pipeline.py (topology) + vllm_omni/deploy/<model>.yaml (deployment) split. Populates the _DIFFUSION_PIPELINES slot in vllm_omni/config/pipeline_registry.py that was left empty in #2915 explicitly for this PR.

Pipelines added

flux, flux_kontext, flux2, flux2_klein, qwen_image (+ edit / + edit_plus / + layered), z_image, ovis_image, longcat_image (+ edit), sd3, helios, omnigen2, nextstep_1_1.

HeliosPipeline and HeliosPyramidPipeline map to the same class in _DIFFUSION_MODELS — registered once as helios.

Per-model layout

For each entry:

Autodetect via diffusers_class_name

StagePipelineConfig.diffusers_class_name was added by #2977 (GLM-Image), which is now on main. The helper sets diffusers_class_name=model_arch (the diffusers Pipeline class name doubles as the model_arch for these entries), so _auto_detect_model_type resolves pure-diffusers checkpoints via model_index.json without users passing --pipeline <key>.

vllm serve black-forest-labs/FLUX.1-dev --omni now Just Works.

Non-goals

  • No changes to _DIFFUSION_MODELS registry or diffusion engine wiring.
  • No legacy stage_configs/ deletion (single-stage diffusion never had legacy yamls — these models relied on the auto-fallback path).
  • No new schema fields.

Topology follow-up (informational)

#3038 tracks the plan for when DiT/VAE splits into separate stages: add a second _dit_vae_diffusion helper next to the current _single_stage_diffusion, mechanical sweep on the 16 entries. No per-model pipeline.py files needed unless a model's topology truly diverges (refiner stage, custom processor, etc.). Documented so future contributors don't preemptively bloat the registry.

Test plan

  • pre-commit run --files <changed-files> passes
  • pytest tests/config/test_pipeline_registry.py -v
  • CI green (function test, perf test, omni stage tests)
  • Manual e2e on flux: vllm serve black-forest-labs/FLUX.1-dev --omni + /v1/images/generations

cc @alex-jw-brooks @hsliuustc0106 @JaredforReal

@lishunyang12 (Collaborator Author)

No GPU to test. Awaiting

@lishunyang12 lishunyang12 marked this pull request as ready for review April 21, 2026 13:54
@chatgpt-codex-connector
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@lishunyang12 (Collaborator Author)

@alex-jw-brooks @xiaohajiayou PTAL


@alex-jw-brooks alex-jw-brooks left a comment


Looks great, thanks! Just a couple questions.

I need to resume the work on it, but we should coordinate on this PR as well once it's further along to make sure default sampling params resolve correctly 🙂

stages=(
StagePipelineConfig(
stage_id=0,
model_stage="dit",
Contributor

Just to make sure I understand, the model_stage="dit" here doesn't actually matter, right? I.e., it's just a placeholder and dit isn't a special value

Collaborator Author

Right, it's a placeholder for single-stage diffusion — only consumed by tools/configure_stage_memory.py as a display column. Convention matches hunyuan_image3_moe_dit_2gpu_fp8.yaml (model_stage: dit) and the bagel docs. Multi-stage models like voxcpm/cosyvoice3 use it for dispatch in their unified model classes; single-stage diffusion has nothing to dispatch.

Comment thread vllm_omni/config/pipeline_registry.py Outdated
_VLLM_OMNI_PIPELINES: dict[str, tuple[str, str]] = {
**_OMNI_PIPELINES,
**_DIFFUSION_PIPELINES,
def _single_stage_diffusion(model_type: str, model_arch: str, output: str = "image") -> PipelineConfig:
Contributor

Nit - it may be easy for people to copy _single_stage_diffusion calls from below and miss the output string parameter.

Maybe something like this would be more clear

def _single_stage_diffusion(model_type: str, model_arch: str, output: str) -> PipelineConfig:
    """Uniform single-stage DIFFUSION topology — every entry in
    ``_DIFFUSION_PIPELINES`` is built from this one helper.
    """
    return PipelineConfig(
        model_type=model_type,
        model_arch=model_arch,
        stages=(
            StagePipelineConfig(
                stage_id=0,
                model_stage="dit", 
                execution_type=StageExecutionType.DIFFUSION,
                final_output=True,
                final_output_type=output,
                model_arch=model_arch,
            ),
        ),
    )

def _single_stage_image_diffusion(model_type: str, model_arch: str) -> PipelineConfig:
    return _single_stage_diffusion(model_type, model_arch, output="image")

# Could also have _single_stage_audio_diffusion etc as needed later

_DIFFUSION_PIPELINES: dict[str, PipelineConfig] = {
    ### Image Diffusion Models
    "flux": _single_stage_image_diffusion("flux", "FluxPipeline"),
    "flux_kontext": _single_stage_image_diffusion("flux_kontext", "FluxKontextPipeline"),
    ### And could also add other sections if it makes sense

Collaborator Author

Done in 8b886af — dropped the default and made every call site pass "image" explicitly. Cheap safety for when 3b/N adds video/audio diffusion.

Comment thread vllm_omni/deploy/flux.yaml Outdated
- stage_id: 0
gpu_memory_utilization: 0.9
devices: "0"
enforce_eager: true
Contributor

Is there a reason enforce_eager is True by default for the model configs?

Collaborator Author

Conservative carryover that I hadn't actually validated against CUDA graph capture. Removed the line from all 16 yamls in 8b886af — falls through to the dataclass default (vllm_omni/diffusion/data.py: enforce_eager: bool = False), so cudagraph runs by default. Users hitting capture issues can flip per-model.

Signed-off-by: lishunyang <lishunyang12@163.com>
…om a uniform table

Refactors PR 3a per review feedback: single-stage diffusion shares one
topology, so the per-model pipeline.py files were boilerplate. Replaces
them with a single _single_stage_diffusion(...) helper called inline
from _DIFFUSION_PIPELINES. Drops 16 pipeline.py + 16 __init__.py files
and the heterogeneous _VLLM_OMNI_PIPELINES union; _LazyPipelineRegistry
checks the two source tables directly.

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12 (Collaborator Author)

Filed #3038 to document the topology-evolution plan (keep _single_stage_diffusion helper, add a second _dit_vae_diffusion helper when VAE splits into its own stage, graduate individual models to per-model pipeline.py only when their topology actually diverges). No changes to this PR — purely forward-looking. cc @alex-jw-brooks @TaffyOfficial

Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12 (Collaborator Author)

Pushed 163ca96: now that #2977 has landed, set diffusers_class_name=model_arch in the helper so all 16 entries get autodetect via model_index.json. vllm serve black-forest-labs/FLUX.1-dev --omni works without --pipeline flux. Updated PR description and test.

…example

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>

@hsliuustc0106 hsliuustc0106 left a comment


Blocker scan clean — no issues found across correctness, reliability, breaking changes, tests, docs, or security.

A few observations (non-blocking):

  • The _single_stage_diffusion() helper is a clean abstraction for the uniform topology. The explicit output parameter (per alex's feedback) is a good safety net for when video/audio diffusion entries arrive in 3b/N.
  • _VLLM_OMNI_PIPELINES removal is fully contained — all references are updated in this diff, no external consumers in the codebase.
  • diffusers_class_name wiring enables vllm serve <hf-repo> --omni without --pipeline for all 16 models. Good DX improvement.
  • The example script changes (deploy/stage overrides forwarding) are correctly guarded with is not None checks so unset flags don't clobber YAML defaults.
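The guard pattern mentioned in the last bullet can be sketched as follows (`apply_overrides` and its parameters are illustrative names, not the script's actual API):

```python
def apply_overrides(stage_cfg: dict, gpu_memory_utilization=None, devices=None) -> dict:
    # Only forward flags the user actually set, so unset CLI flags
    # don't clobber values loaded from the deploy YAML.
    if gpu_memory_utilization is not None:
        stage_cfg["gpu_memory_utilization"] = gpu_memory_utilization
    if devices is not None:
        stage_cfg["devices"] = devices
    return stage_cfg
```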

Remaining items from the PR description checklist (acknowledged):

  • CI green
  • Manual e2e on flux

LGTM once CI passes and manual e2e confirms.

@lishunyang12 (Collaborator Author)

Closing — these 16 single-stage diffusion models already work without --pipeline <key> via the _create_default_diffusion_stage_cfg fallback (cf. examples/online_serving/image_to_image docs, which use bare vllm serve <repo> --omni). The user-facing delta this PR adds is small (per-model default rescue for ~2-3 of 16 models like Z-Image's guidance_scale=1.0, plus --deploy-config profile switching that mostly duplicates existing CLI flags).

The pipeline + deploy split system genuinely earns its complexity for multistage models (#2989 hunyuan_image3, existing qwen3_omni_moe) — per-stage TP / KV transfer / custom processors. For single-stage, it's mostly bookkeeping over an already-working fallback.

Deferring single-stage migration until DiT/VAE actually splits per #3038. At that point both the registry entries and per-stage yamls land together, with real per-stage config to justify the structure.

cc @alex-jw-brooks @hsliuustc0106
