[Config Refactor 3a/N] Image diffusion pipeline configs#2987

Closed
lishunyang12 wants to merge 6 commits into vllm-project:main from lishunyang12:config-refactor-3a-image-diffusion

Conversation


@lishunyang12 lishunyang12 commented Apr 21, 2026

Summary

Continuation of RFC #2072 and follow-up to #2383 (2/N) and #2915 (2.5/N).

Migrates 16 single-stage image diffusion pipelines into the new pipeline.py (topology) + vllm_omni/deploy/<model>.yaml (deployment) split. Populates the _DIFFUSION_PIPELINES slot in vllm_omni/config/pipeline_registry.py that was left empty in #2915 explicitly for this PR.

Pipelines added

flux, flux_kontext, flux2, flux2_klein, qwen_image (+ edit / + edit_plus / + layered), z_image, ovis_image, longcat_image (+ edit), sd3, helios, omnigen2, nextstep_1_1.

HeliosPipeline and HeliosPyramidPipeline map to the same class in _DIFFUSION_MODELS — registered once as helios.

Per-model layout

For each entry:

Autodetect via diffusers_class_name

StagePipelineConfig.diffusers_class_name was added by #2977 (GLM-Image), which is now on main. The helper sets diffusers_class_name=model_arch (the diffusers Pipeline class name doubles as the model_arch for these entries), so _auto_detect_model_type resolves pure-diffusers checkpoints via model_index.json without users passing --pipeline <key>.

vllm serve black-forest-labs/FLUX.1-dev --omni now Just Works.

Non-goals

  • No changes to _DIFFUSION_MODELS registry or diffusion engine wiring.
  • No legacy stage_configs/ deletion (single-stage diffusion never had legacy yamls — these models relied on the auto-fallback path).
  • No new schema fields.

Topology follow-up (informational)

#3038 tracks the plan for when DiT/VAE splits into separate stages: add a second _dit_vae_diffusion helper next to the current _single_stage_diffusion, mechanical sweep on the 16 entries. No per-model pipeline.py files needed unless a model's topology truly diverges (refiner stage, custom processor, etc.). Documented so future contributors don't preemptively bloat the registry.

Test plan

  • pre-commit run --files <changed-files> passes
  • pytest tests/config/test_pipeline_registry.py -v
  • CI green (function test, perf test, omni stage tests)
  • Manual e2e on flux: vllm serve black-forest-labs/FLUX.1-dev --omni + /v1/images/generations

cc @alex-jw-brooks @hsliuustc0106 @JaredforReal

@lishunyang12 (Collaborator Author)

No GPU to test. Awaiting

@lishunyang12 lishunyang12 marked this pull request as ready for review April 21, 2026 13:54
@chatgpt-codex-connector
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@lishunyang12 (Collaborator Author)

@alex-jw-brooks @xiaohajiayou PTAL


@alex-jw-brooks alex-jw-brooks left a comment


Looks great, thanks! Just a couple questions.

I need to resume the work on it, but we should coordinate on this PR as well once it's further along to make sure default sampling params resolve correctly 🙂

stages=(
StagePipelineConfig(
stage_id=0,
model_stage="dit",
Contributor

Just to make sure I understand, the model_stage="dit" here doesn't actually matter, right? I.e., it's just a placeholder and dit isn't a special value

Collaborator Author

Right, it's a placeholder for single-stage diffusion — only consumed by tools/configure_stage_memory.py as a display column. Convention matches hunyuan_image3_moe_dit_2gpu_fp8.yaml (model_stage: dit) and the bagel docs. Multi-stage models like voxcpm/cosyvoice3 use it for dispatch in their unified model classes; single-stage diffusion has nothing to dispatch.

Comment thread vllm_omni/config/pipeline_registry.py Outdated
_VLLM_OMNI_PIPELINES: dict[str, tuple[str, str]] = {
**_OMNI_PIPELINES,
**_DIFFUSION_PIPELINES,
def _single_stage_diffusion(model_type: str, model_arch: str, output: str = "image") -> PipelineConfig:
Contributor

Nit - it may be easy for people to copy _single_stage_diffusion calls from below and miss the output string parameter.

Maybe something like this would be more clear

def _single_stage_diffusion(model_type: str, model_arch: str, output: str) -> PipelineConfig:
    """Uniform single-stage DIFFUSION topology — every entry in
    ``_DIFFUSION_PIPELINES`` is built from this one helper.
    """
    return PipelineConfig(
        model_type=model_type,
        model_arch=model_arch,
        stages=(
            StagePipelineConfig(
                stage_id=0,
                model_stage="dit", 
                execution_type=StageExecutionType.DIFFUSION,
                final_output=True,
                final_output_type=output,
                model_arch=model_arch,
            ),
        ),
    )

def _single_stage_image_diffusion(model_type: str, model_arch: str) -> PipelineConfig:
    return _single_stage_diffusion(model_type, model_arch, output="image")

# Could also have _single_stage_audio_diffusion etc as needed later

_DIFFUSION_PIPELINES: dict[str, PipelineConfig] = {
    ### Image Diffusion Models
    "flux": _single_stage_image_diffusion("flux", "FluxPipeline"),
    "flux_kontext": _single_stage_image_diffusion("flux_kontext", "FluxKontextPipeline"),
    ### And could also add other sections if it makes sense

Collaborator Author

Done in 8b886af — dropped the default and made every call site pass "image" explicitly. Cheap safety for when 3b/N adds video/audio diffusion.

Comment thread vllm_omni/deploy/flux.yaml Outdated
- stage_id: 0
gpu_memory_utilization: 0.9
devices: "0"
enforce_eager: true
Contributor

Is there a reason enforce_eager is True by default for the model configs?

Collaborator Author

Conservative carryover that I hadn't actually validated against CUDA graph capture. Removed the line from all 16 yamls in 8b886af — falls through to the dataclass default (vllm_omni/diffusion/data.py: enforce_eager: bool = False), so cudagraph runs by default. Users hitting capture issues can flip per-model.

Signed-off-by: lishunyang <lishunyang12@163.com>
…om a uniform table

Refactors PR 3a per review feedback: single-stage diffusion shares one
topology, so the per-model pipeline.py files were boilerplate. Replaces
them with a single _single_stage_diffusion(...) helper called inline
from _DIFFUSION_PIPELINES. Drops 16 pipeline.py + 16 __init__.py files
and the heterogeneous _VLLM_OMNI_PIPELINES union; _LazyPipelineRegistry
checks the two source tables directly.

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12 (Collaborator Author)

Filed #3038 to document the topology-evolution plan (keep _single_stage_diffusion helper, add a second _dit_vae_diffusion helper when VAE splits into its own stage, graduate individual models to per-model pipeline.py only when their topology actually diverges). No changes to this PR — purely forward-looking. cc @alex-jw-brooks @TaffyOfficial

Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12 (Collaborator Author)

Pushed 163ca96: now that #2977 has landed, set diffusers_class_name=model_arch in the helper so all 16 entries get autodetect via model_index.json. vllm serve black-forest-labs/FLUX.1-dev --omni works without --pipeline flux. Updated PR description and test.

…example

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>

@hsliuustc0106 hsliuustc0106 left a comment


Blocker scan clean — no issues found across correctness, reliability, breaking changes, tests, docs, or security.

A few observations (non-blocking):

  • The _single_stage_diffusion() helper is a clean abstraction for the uniform topology. The explicit output parameter (per alex's feedback) is a good safety net for when video/audio diffusion entries arrive in 3b/N.
  • _VLLM_OMNI_PIPELINES removal is fully contained — all references are updated in this diff, no external consumers in the codebase.
  • diffusers_class_name wiring enables vllm serve <hf-repo> --omni without --pipeline for all 16 models. Good DX improvement.
  • The example script changes (deploy/stage overrides forwarding) are correctly guarded with is not None checks so unset flags don't clobber YAML defaults.
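The guard pattern mentioned in the last bullet can be sketched as follows (`apply_overrides` and its parameters are illustrative names, not the script's actual API):

```python
def apply_overrides(stage_cfg: dict, gpu_memory_utilization=None, devices=None) -> dict:
    # Only forward flags the user actually set, so unset CLI flags
    # don't clobber values loaded from the deploy YAML.
    if gpu_memory_utilization is not None:
        stage_cfg["gpu_memory_utilization"] = gpu_memory_utilization
    if devices is not None:
        stage_cfg["devices"] = devices
    return stage_cfg
```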

Remaining items from the PR description checklist (acknowledged):

  • CI green
  • Manual e2e on flux

LGTM once CI passes and manual e2e confirms.

@lishunyang12 (Collaborator Author)

Closing — these 16 single-stage diffusion models already work without --pipeline <key> via the _create_default_diffusion_stage_cfg fallback (cf. examples/online_serving/image_to_image docs, which use bare vllm serve <repo> --omni). The user-facing delta this PR adds is small (per-model default rescue for ~2-3 of 16 models like Z-Image's guidance_scale=1.0, plus --deploy-config profile switching that mostly duplicates existing CLI flags).

The pipeline + deploy split system genuinely earns its complexity for multistage models (#2989 hunyuan_image3, existing qwen3_omni_moe) — per-stage TP / KV transfer / custom processors. For single-stage, it's mostly bookkeeping over an already-working fallback.

Deferring single-stage migration until DiT/VAE actually splits per #3038. At that point both the registry entries and per-stage yamls land together, with real per-stage config to justify the structure.

cc @alex-jw-brooks @hsliuustc0106
