12 changes: 8 additions & 4 deletions docs/user_guide/diffusion/cpu_offload_diffusion.md
@@ -139,11 +139,15 @@ Factory function `get_offload_backend()` selects the appropriate backend based o

## Supported Models

-| Architecture | Example Models | DiT Class | Model-Level Offload | Layerwise Offload | Blocks Attr (Layerwise specific) |
-|--------------|----------------|-----------|---------------------|-------------------|-------------|
-| Wan22Pipeline | `Wan-AI/Wan2.2-T2V-A14B-Diffusers` | `WanTransformer3DModel` | ✓ | ✓ | `"blocks"` |
-| Wan22I2VPipeline | `Wan-AI/Wan2.2-I2V-A14B-Diffusers` | `WanTransformer3DModel` | ✓ | ✓ | `"blocks"` |
+| Architecture | Example Models | DiT Class | Model-Level Offload | Layerwise Offload | Blocks Attrs (Layerwise specific) |
+|--------------|----------------|-----------|---------------------|-------------------|-----------------------------------|
+| LongCatImagePipeline | `meituan-longcat/LongCat-Image` | `LongCatImageTransformer2DModel` | - | ✓ | `"transformer_blocks"`, `"single_transformer_blocks"` |
+| NextStep11Pipeline | `stepfun-ai/NextStep-1.1` | `NextStepModel` | - | ✓ | `"layers"` |
+| OvisImagePipeline | `AIDC-AI/Ovis-Image-7B` | `OvisImageTransformer2DModel` | - | ✓ | `"transformer_blocks"`, `"single_transformer_blocks"` |
+| QwenImagePipeline | `Qwen/Qwen-Image` | `QwenImageTransformer2DModel` | ✓ | ✓ | `"transformer_blocks"` |
+| StableDiffusion3Pipeline | `stabilityai/stable-diffusion-3.5-medium` | `SD3Transformer2DModel` | - | ✓ | `"transformer_blocks"` |
+| Wan22I2VPipeline | `Wan-AI/Wan2.2-I2V-A14B-Diffusers` | `WanTransformer3DModel` | ✓ | ✓ | `"blocks"` |
+| Wan22Pipeline | `Wan-AI/Wan2.2-T2V-A14B-Diffusers` | `WanTransformer3DModel` | ✓ | ✓ | `"blocks"` |

**Notes:**
- Model-level offloading is expected to work out of the box for all common diffusion models (DiT and encoders)
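The `Blocks Attrs` column corresponds to the `_layerwise_offload_blocks_attrs` class attribute added to each DiT class in this PR. As a minimal sketch of how an offloader could resolve those attribute names into concrete block lists (the `ToyDiT` class and `collect_offload_blocks` helper below are illustrative stand-ins, not real vLLM-Omni classes; real DiT classes are `torch.nn.Module`s with `nn.ModuleList` blocks):

```python
# Illustrative sketch (plain-Python stand-ins, no torch dependency):
# resolve the names declared in `_layerwise_offload_blocks_attrs` into
# the actual lists of transformer blocks to offload layer by layer.

class ToyDiT:
    # Mirrors e.g. LongCatImageTransformer2DModel, which declares two lists.
    _layerwise_offload_blocks_attrs = ["transformer_blocks", "single_transformer_blocks"]

    def __init__(self):
        self.transformer_blocks = ["double_block_0", "double_block_1"]
        self.single_transformer_blocks = ["single_block_0"]


def collect_offload_blocks(model):
    """Gather every block referenced by the model's declared block attrs."""
    blocks = []
    for attr_name in getattr(model, "_layerwise_offload_blocks_attrs", []):
        blocks.extend(getattr(model, attr_name))
    return blocks


print(collect_offload_blocks(ToyDiT()))
# ['double_block_0', 'double_block_1', 'single_block_0']
```

A model that declares no `_layerwise_offload_blocks_attrs` simply yields an empty list, which is why model-level offload can remain the default path.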
10 changes: 5 additions & 5 deletions docs/user_guide/diffusion_features.md
@@ -107,19 +107,19 @@ The following tables show which models support each feature:
| **FLUX.2-dev** | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| **GLM-Image** | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| **HunyuanImage3** | ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ |
| **LongCat-Image** | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | | ❌ | ❌ | ❌ |
| **LongCat-Image-Edit** | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | | ❌ | ❌ | ❌ |
| **MagiHuman** | ❌ | ❌ | ❌ | ❓ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| **MammothModa2(T2I)** | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| **Nextstep_1(T2I)** | ❓ | ❓ | ❌ | ✅ | ✅ | ❌ | | ❌ | ❌ | ❌ |
| **OmniGen2** | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| **Ovis-Image** | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | | ❌ | ❌ | ❌ |
| **Qwen-Image** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (decode) | ✅ | ✅ |
| **Qwen-Image-2512** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (decode) | ✅ | ✅ |
| **Qwen-Image-Edit** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (decode) | ❌ | ❌ |
| **Qwen-Image-Edit-2509** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (decode) | ❌ | ❌ |
| **Qwen-Image-Layered** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (decode) | ❌ | ❌ |
| **Stable-Diffusion3.5** | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | | ✅ (decode) | ❌ | ❌ |
| **Z-Image** | ✅ | ✅ | ✅ | ❓ | ✅ (TP=2 only) | ✅ | ❌ | ✅ (decode) | ✅ | ❌ |

> Notes:
@@ -376,7 +376,7 @@ def main():
f"vae_patch_parallel_size={args.vae_patch_parallel_size}, "
f"enable_expert_parallel={args.enable_expert_parallel}."
)
-    print(f" CPU offload: {args.enable_cpu_offload}")
+    print(f" CPU offload: {args.enable_cpu_offload}; CPU Layerwise Offload: {args.enable_layerwise_offload}")
print(f" Image size: {args.width}x{args.height}")
if args.lora_path:
print(f" LoRA: scale={args.lora_scale}")
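The updated printout reports two independent switches. A hedged sketch of how they might be exposed on the example script's CLI (the flag spellings are inferred from the attribute names `enable_cpu_offload` and `enable_layerwise_offload`; the real script's argument definitions are not shown in this diff and may differ):

```python
import argparse

# Hypothetical flag wiring; spellings inferred from the args attributes,
# not copied from the real example script.
parser = argparse.ArgumentParser()
parser.add_argument("--enable-cpu-offload", action="store_true",
                    help="Model-level offload: move whole models to CPU while idle.")
parser.add_argument("--enable-layerwise-offload", action="store_true",
                    help="Layerwise offload: stream transformer blocks to GPU on demand.")

args = parser.parse_args(["--enable-layerwise-offload"])
print(f" CPU offload: {args.enable_cpu_offload}; CPU Layerwise Offload: {args.enable_layerwise_offload}")
# prints " CPU offload: False; CPU Layerwise Offload: True"
```

Since `store_true` flags default to `False`, both switches stay off unless explicitly passed, and they can be combined or used separately.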
@@ -582,6 +582,7 @@ class LongCatImageTransformer2DModel(nn.Module):
"""

_repeated_blocks = ["LongCatImageTransformerBlock", "LongCatImageSingleTransformerBlock"]
_layerwise_offload_blocks_attrs = ["transformer_blocks", "single_transformer_blocks"]

# Sequence Parallelism for LongCat (following diffusers' _cp_plan pattern)
_sp_plan = {
@@ -114,6 +114,8 @@ def from_json(cls, path: str) -> NextStepConfig:


class NextStepModel(nn.Module):
_layerwise_offload_blocks_attrs = ["layers"]

def __init__(self, config: NextStepConfig):
super().__init__()
self.config = config
@@ -366,6 +366,7 @@ class OvisImageTransformer2DModel(nn.Module):
"""

_repeated_blocks = ["OvisImageTransformerBlock", "OvisImageSingleTransformerBlock"]
_layerwise_offload_blocks_attrs = ["transformer_blocks", "single_transformer_blocks"]

def __init__(
self,
1 change: 1 addition & 0 deletions vllm_omni/diffusion/models/sd3/sd3_transformer.py
@@ -387,6 +387,7 @@ class SD3Transformer2DModel(nn.Module):
"""

_repeated_blocks = ["SD3TransformerBlock"]
_layerwise_offload_blocks_attrs = ["transformer_blocks"]

def __init__(
self,
2 changes: 1 addition & 1 deletion vllm_omni/diffusion/offloader/module_collector.py
@@ -21,7 +21,7 @@ class PipelineModules:
class ModuleDiscovery:
"""Discovers pipeline components for offloading"""

-    DIT_ATTRS = ["transformer", "transformer_2", "dit", "sr_dit", "language_model", "transformer_blocks"]
+    DIT_ATTRS = ["transformer", "transformer_2", "dit", "sr_dit", "language_model", "transformer_blocks", "model"]
ENCODER_ATTRS = ["text_encoder", "text_encoder_2", "text_encoder_3", "image_encoder"]
VAE_ATTRS = ["vae", "audio_vae"]
