[Config Refactor 3a/N] Image diffusion pipeline configs#2987
lishunyang12 wants to merge 6 commits into vllm-project:main
Conversation
No GPU to test. Awaiting …
alex-jw-brooks
left a comment
Looks great, thanks! Just a couple questions.
I need to resume the work on it, but we should coordinate on this PR as well once it's further along to make sure default sampling params resolve correctly 🙂
```python
stages=(
    StagePipelineConfig(
        stage_id=0,
        model_stage="dit",
```
Just to make sure I understand, the model_stage="dit" here doesn't actually matter, right? I.e., it's just a placeholder and dit isn't a special value
Right, it's a placeholder for single-stage diffusion — only consumed by tools/configure_stage_memory.py as a display column. Convention matches hunyuan_image3_moe_dit_2gpu_fp8.yaml (model_stage: dit) and the bagel docs. Multi-stage models like voxcpm/cosyvoice3 use it for dispatch in their unified model classes; single-stage diffusion has nothing to dispatch.
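To illustrate the distinction, here is a minimal sketch (hypothetical class and field names, not the actual vllm-omni code) of how a multi-stage unified model class can dispatch on `model_stage`, whereas a single-stage diffusion pipeline only ever has one branch, so the field reduces to a display label:

```python
# Hypothetical sketch: how a voxcpm/cosyvoice3-style unified model class
# might dispatch on model_stage. Names are illustrative only.
from dataclasses import dataclass


@dataclass
class StageCfg:
    stage_id: int
    model_stage: str  # e.g. "llm", "dit" — display/dispatch key


class UnifiedModel:
    """Stand-in for a multi-stage unified model class."""

    def forward_stage(self, cfg: StageCfg, x: str) -> str:
        # Dispatch table keyed on model_stage. Single-stage diffusion
        # would only ever hit one branch, so there the field is
        # effectively just a display column.
        handlers = {
            "llm": lambda v: f"llm({v})",
            "dit": lambda v: f"dit({v})",
        }
        return handlers[cfg.model_stage](x)


model = UnifiedModel()
print(model.forward_stage(StageCfg(0, "dit"), "latents"))  # dit(latents)
print(model.forward_stage(StageCfg(1, "llm"), "tokens"))   # llm(tokens)
```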
```python
_VLLM_OMNI_PIPELINES: dict[str, tuple[str, str]] = {
    **_OMNI_PIPELINES,
    **_DIFFUSION_PIPELINES,
```

```python
def _single_stage_diffusion(model_type: str, model_arch: str, output: str = "image") -> PipelineConfig:
```
Nit - it may be easy for people to copy `_single_stage_diffusion` calls from below and miss the `output` string parameter. Maybe something like this would be clearer:
```python
def _single_stage_diffusion(model_type: str, model_arch: str, output: str) -> PipelineConfig:
    """Uniform single-stage DIFFUSION topology — every entry in
    ``_DIFFUSION_PIPELINES`` is built from this one helper.
    """
    return PipelineConfig(
        model_type=model_type,
        model_arch=model_arch,
        stages=(
            StagePipelineConfig(
                stage_id=0,
                model_stage="dit",
                execution_type=StageExecutionType.DIFFUSION,
                final_output=True,
                final_output_type=output,
                model_arch=model_arch,
            ),
        ),
    )


def _single_stage_image_diffusion(model_type: str, model_arch: str) -> PipelineConfig:
    return _single_stage_diffusion(model_type, model_arch, output="image")


# Could also have _single_stage_audio_diffusion etc as needed later

_DIFFUSION_PIPELINES: dict[str, PipelineConfig] = {
    ### Image Diffusion Models
    "flux": _single_stage_image_diffusion("flux", "FluxPipeline"),
    "flux_kontext": _single_stage_image_diffusion("flux_kontext", "FluxKontextPipeline"),
    ### And could also add other sections if it makes sense
}
```
Done in 8b886af — dropped the default and made every call site pass "image" explicitly. Cheap safety for when 3b/N adds video/audio diffusion.
```yaml
- stage_id: 0
  gpu_memory_utilization: 0.9
  devices: "0"
  enforce_eager: true
```
Is there a reason enforce_eager is True by default for the model configs?
Conservative carryover that I hadn't actually validated against CUDA graph capture. Removed the line from all 16 yamls in 8b886af — falls through to the dataclass default (vllm_omni/diffusion/data.py: enforce_eager: bool = False), so cudagraph runs by default. Users hitting capture issues can flip per-model.
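For reference, an illustrative shape of a single-stage deploy yaml after that change (field names taken from this PR's description and the quoted diff; the values and the sampling-params key are placeholder examples):

```yaml
# Illustrative stage block for a vllm_omni/deploy/<model>.yaml entry.
# Values are examples, not taken from any shipped config.
- stage_id: 0
  gpu_memory_utilization: 0.9
  devices: "0"
  # enforce_eager: true   # now opt-in only; when unset, the dataclass
  #                       # default (False) applies and cudagraph runs
  default_sampling_params: {}   # hypothetical placeholder
```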
Signed-off-by: lishunyang <lishunyang12@163.com>
…om a uniform table

Refactors PR 3a per review feedback: single-stage diffusion shares one topology, so the per-model `pipeline.py` files were boilerplate. Replaces them with a single `_single_stage_diffusion(...)` helper called inline from `_DIFFUSION_PIPELINES`. Drops 16 `pipeline.py` + 16 `__init__.py` files and the heterogeneous `_VLLM_OMNI_PIPELINES` union; `_LazyPipelineRegistry` checks the two source tables directly.

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
Force-pushed from 8b886af to 09cf372
Filed #3038 to document the topology-evolution plan (keep …).
Signed-off-by: lishunyang <lishunyang12@163.com>
…example Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
hsliuustc0106
left a comment
Blocker scan clean — no issues found across correctness, reliability, breaking changes, tests, docs, or security.
A few observations (non-blocking):

- The `_single_stage_diffusion()` helper is a clean abstraction for the uniform topology. The explicit `output` parameter (per alex's feedback) is a good safety net for when video/audio diffusion entries arrive in 3b/N.
- `_VLLM_OMNI_PIPELINES` removal is fully contained — all references are updated in this diff, no external consumers in the codebase.
- `diffusers_class_name` wiring enables `vllm serve <hf-repo> --omni` without `--pipeline` for all 16 models. Good DX improvement.
- The example script changes (deploy/stage overrides forwarding) are correctly guarded with `is not None` checks so unset flags don't clobber YAML defaults.
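The guard pattern called out in the last bullet can be sketched as follows (illustrative function, not the actual example-script code):

```python
# Minimal sketch of the override-forwarding guard: CLI values are applied
# only when explicitly set, so None (unset) never clobbers values that
# came from the deploy YAML. Names are illustrative.
def apply_overrides(yaml_cfg: dict, **cli_flags) -> dict:
    merged = dict(yaml_cfg)
    for key, value in cli_flags.items():
        if value is not None:  # unset flags fall through to YAML defaults
            merged[key] = value
    return merged


cfg = {"gpu_memory_utilization": 0.9, "devices": "0"}
print(apply_overrides(cfg, gpu_memory_utilization=None, devices="1"))
# gpu_memory_utilization stays 0.9; devices is overridden to "1"
```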
Remaining items from the PR description checklist (acknowledged):
- CI green
- Manual e2e on flux
LGTM once CI passes and manual e2e confirms.
Closing — these 16 single-stage diffusion models already work without …

The pipeline + deploy split system genuinely earns its complexity for multistage models (#2989 hunyuan_image3, existing qwen3_omni_moe) — per-stage TP / KV transfer / custom processors. For single-stage, it's mostly bookkeeping over an already-working fallback. Deferring single-stage migration until DiT/VAE actually splits per #3038. At that point both the registry entries and per-stage yamls land together, with real per-stage config to justify the structure.
Summary
Continuation of RFC #2072 and follow-up to #2383 (2/N) and #2915 (2.5/N).
Migrates 16 single-stage image diffusion pipelines into the new `pipeline.py` (topology) + `vllm_omni/deploy/<model>.yaml` (deployment) split. Populates the `_DIFFUSION_PIPELINES` slot in `vllm_omni/config/pipeline_registry.py` that was left empty in #2915 explicitly for this PR.

Pipelines added
flux, flux_kontext, flux2, flux2_klein, qwen_image (+ edit / + edit_plus / + layered), z_image, ovis_image, longcat_image (+ edit), sd3, helios, omnigen2, nextstep_1_1.

`HeliosPipeline` and `HeliosPyramidPipeline` map to the same class in `_DIFFUSION_MODELS` — registered once as `helios`.

Per-model layout
For each entry:

- `_DIFFUSION_PIPELINES` table entry built via `_single_stage_diffusion(model_type, model_arch, output)` — uniform single-stage `DIFFUSION` topology, `final_output=True`, `final_output_type="image"`. No per-model `pipeline.py` file (see "[Followup] Diffusion pipeline shape: when to split DiT/VAE, when to graduate to per-model pipeline.py" #3038 for the topology-evolution rationale).
- `vllm_omni/deploy/<model>.yaml` — single stage block with `gpu_memory_utilization`, `devices`, `default_sampling_params`, `async_chunk: false`.

Autodetect via `diffusers_class_name`

`StagePipelineConfig.diffusers_class_name` was added by #2977 (GLM-Image), which is now on main. The helper sets `diffusers_class_name=model_arch` (the diffusers Pipeline class name doubles as the model_arch for these entries), so `_auto_detect_model_type` resolves pure-diffusers checkpoints via `model_index.json` without users passing `--pipeline <key>`. `vllm serve black-forest-labs/FLUX.1-dev --omni` now Just Works.

Non-goals
- No changes to the `_DIFFUSION_MODELS` registry or diffusion engine wiring.
- No `stage_configs/` deletion (single-stage diffusion never had legacy yamls — these models relied on the auto-fallback path).

Topology follow-up (informational)
#3038 tracks the plan for when DiT/VAE splits into separate stages: add a second `_dit_vae_diffusion` helper next to the current `_single_stage_diffusion`, then a mechanical sweep on the 16 entries. No per-model `pipeline.py` files are needed unless a model's topology truly diverges (refiner stage, custom processor, etc.). Documented so future contributors don't preemptively bloat the registry.

Test plan
- `pre-commit run --files <changed-files>` passes
- `pytest tests/config/test_pipeline_registry.py -v`
- `vllm serve black-forest-labs/FLUX.1-dev --omni` + `/v1/images/generations`

cc @alex-jw-brooks @hsliuustc0106 @JaredforReal