[diffusion] Generalize layerwise offload residency mixin to all components by mickqian · Pull Request #24593 · sgl-project/sglang

mickqian · 2026-05-07T08:03:40Z

Summary

Rename the DiT-specific layerwise offload mixin to LayerwiseOffloadableModuleMixin.
Resolve layerwise residency by module capability before falling back to existing component CPU-offload flags.
Add --layerwise-offload-components to select layerwise offload by pipeline component name, with --layerwise-offload-modules accepted as an alias.
Keep --dit-layerwise-offload legacy behavior: when no component is named, only default DiT components are configured; encoder / VAE / bridge / upsampler / vocoder must be selected explicitly.
Disable conflicting component CPU/FSDP offload flags when the same component is explicitly selected for layerwise offload, and keep layer buffers resident to avoid releasing shared buffers such as RoPE caches.
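The selection rules above can be sketched with plain argparse. This is a minimal illustration and not the PR's actual implementation: only the three flag names come from the summary, while PIPELINE_COMPONENTS, DEFAULT_DIT_COMPONENTS, build_parser, and resolve_layerwise_components are hypothetical names chosen for the example.

```python
import argparse
import warnings

# Hypothetical component names, in pipeline order; real pipelines define their own.
PIPELINE_COMPONENTS = ["text_encoder", "vae", "transformer"]
# Assumption: the legacy DiT-only flag targets just the transformer.
DEFAULT_DIT_COMPONENTS = ["transformer"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Two option strings sharing one dest make the second an alias of the first.
    parser.add_argument(
        "--layerwise-offload-components",
        "--layerwise-offload-modules",
        dest="layerwise_offload_components",
        nargs="+",
        default=None,
    )
    parser.add_argument(
        "--dit-layerwise-offload",
        type=lambda s: s.lower() == "true",
        default=False,
    )
    return parser


def resolve_layerwise_components(args: argparse.Namespace) -> list[str]:
    requested = args.layerwise_offload_components
    if not requested:
        # Legacy behavior: no component named, so only default DiT components apply.
        return list(DEFAULT_DIT_COMPONENTS) if args.dit_layerwise_offload else []
    if "all" in requested:
        return list(PIPELINE_COMPONENTS)
    for name in requested:
        if name not in PIPELINE_COMPONENTS:
            warnings.warn(f"unknown layerwise offload component: {name}")
    # Keep pipeline order regardless of the order given on the command line.
    return [name for name in PIPELINE_COMPONENTS if name in requested]
```

In this sketch an unknown name produces a warning and is otherwise ignored, which mirrors the `missing_component` case in the validation list.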

Validation

Remote H200 container /sgl-workspace/sglang at fdf022713606dc2e6262975145d94e7f7d504a0d:

PYTHONPATH=/sgl-workspace/sglang/python python -m pytest python/sglang/multimodal_gen/test/unit/test_layerwise_offload.py python/sglang/multimodal_gen/test/unit/test_server_args.py -> 41 passed, 2 warnings
Z-Image 512x512, 1 step, seed 0, torch_sdpa backend:
- --dit-layerwise-offload true --dit-offload-prefetch-size 0 -> PASS, enabled ['transformer']
- --layerwise-offload-components transformer --dit-offload-prefetch-size 0 -> PASS, enabled ['transformer']
- --layerwise-offload-components transformer text_encoder --dit-offload-prefetch-size 0 -> PASS, enabled ['text_encoder', 'transformer']
- --layerwise-offload-components all --dit-offload-prefetch-size 0 -> PASS, enabled ['text_encoder', 'vae', 'transformer']
- --layerwise-offload-modules transformer --dit-offload-prefetch-size 0 -> PASS, enabled ['transformer']
- --layerwise-offload-components missing_component --dit-offload-prefetch-size 0 -> PASS with warning and no layerwise component
All six generated PNG files have identical sha256: 0f0dba3e7d97aa3be19ef7d6d1cd3ea0e727c322153a8a4f9904089b8e9ee4c1
No local tests were run.
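The sha256 comparison across generated images can be reproduced with a short script. This is a sketch of one way to do it; the helper names sha256_of_file and all_identical are hypothetical and the output paths are whatever the generation runs produced.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large outputs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_identical(paths) -> bool:
    """Return True when every file has the same sha256 digest."""
    return len({sha256_of_file(Path(p)) for p in paths}) <= 1
```

Identical digests indicate that the offload configuration did not change the generated pixels for the fixed seed.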

CI States

Latest PR Test: Run #25928755979
Latest PR Test (Extra): ⚠️ Not enabled — add run-ci-extra label to opt in.

gemini-code-assist

Code Review

This pull request refactors the layerwise offload mechanism by renaming OffloadableDiTMixin to LayerwiseOffloadableModuleMixin and centralizing residency strategy logic in component_manager.py. It introduces helper functions like is_layerwise_offloaded_module and should_cpu_offload_component to simplify offload decisions across various model components. Feedback includes suggestions to remove a redundant bool() call, simplify multi-line tuple unpacking for better readability, and remove an unnecessary trailing comma in a tuple assignment.
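The capability-first decision described in the review summary might look roughly like the sketch below. The mixin and helper names appear in this PR, but the bodies and the signature of should_cpu_offload_component are assumptions made for illustration, and plain Python classes stand in for torch modules.

```python
class LayerwiseOffloadableModuleMixin:
    """Hypothetical stand-in for the mixin named in this PR."""

    def __init__(self):
        # An empty list means layerwise offload is not active for this module.
        self.layerwise_offload_managers = []


def is_layerwise_offloaded_module(module) -> bool:
    return isinstance(module, LayerwiseOffloadableModuleMixin) and bool(
        module.layerwise_offload_managers
    )


def should_cpu_offload_component(module, component_cpu_offload_flag: bool) -> bool:
    # Assumed signature: the module's capability is checked first, and the
    # component-level CPU-offload flag is only a fallback.
    if is_layerwise_offloaded_module(module):
        return False
    return component_cpu_offload_flag


class TextEncoderStub(LayerwiseOffloadableModuleMixin):
    """Hypothetical component used only to exercise the helpers."""


encoder = TextEncoderStub()
fallback_decision = should_cpu_offload_component(encoder, True)
encoder.layerwise_offload_managers.append("manager-0")
layerwise_decision = should_cpu_offload_component(encoder, True)
```

Under these assumptions a module with active layerwise managers never takes the component-level CPU offload path, which matches the summary's point about disabling conflicting flags.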

gemini-code-assist · 2026-05-07T08:06:28Z

+def is_layerwise_offloaded_module(module: torch.nn.Module) -> bool:
+    return (
+        isinstance(module, LayerwiseOffloadableModuleMixin)
+        and bool(module.layerwise_offload_managers)


The bool() call here is redundant. In Python, an empty list is evaluated as False in a boolean context, so you can check for non-emptiness directly. The pythonic way is to use the list itself in the condition.

Suggested change

-        and bool(module.layerwise_offload_managers)
+        and module.layerwise_offload_managers
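One detail is worth checking before applying this suggestion: `and` returns its last evaluated operand, not a bool. The sketch below uses a hypothetical FakeModule class to show how the two forms differ in what they return when the function is annotated `-> bool`.

```python
class FakeModule:
    """Hypothetical stand-in for a module with a layerwise_offload_managers list."""

    def __init__(self, managers):
        self.layerwise_offload_managers = managers


def with_bool(module) -> bool:
    return isinstance(module, FakeModule) and bool(module.layerwise_offload_managers)


def without_bool(module):
    # Truthiness is the same, but the returned object is the list itself.
    return isinstance(module, FakeModule) and module.layerwise_offload_managers


active = FakeModule(["manager-0"])
inactive = FakeModule([])
```

Both forms behave the same inside an `if`, so the choice only matters when callers rely on the return value being exactly `True` or `False`.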

gemini-code-assist · 2026-05-07T08:06:29Z

+        (shift_msa, scale_msa, gate_msa), (
+            shift_mlp,
+            scale_mlp,
+            gate_mlp,
+        ) = temb_mod_params_img


This multi-line tuple unpacking seems unnecessary as the line is not excessively long. It could be simplified to a single line for better readability and to reduce vertical space.

(shift_msa, scale_msa, gate_msa), (shift_mlp, scale_mlp, gate_mlp) = temb_mod_params_img
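The two layouts are equivalent at runtime, as the following sketch with placeholder string values shows. The contents of temb_mod_params_img here are made up for the example; only the variable names come from the diff.

```python
# Placeholder values standing in for the real modulation tensors.
temb_mod_params_img = (
    ("shift_a", "scale_a", "gate_a"),
    ("shift_b", "scale_b", "gate_b"),
)

# Single-line nested unpacking, as suggested in the review.
(shift_msa, scale_msa, gate_msa), (shift_mlp, scale_mlp, gate_mlp) = temb_mod_params_img

# Multi-line form from the diff; it binds the same values.
(shift_msa2, scale_msa2, gate_msa2), (
    shift_mlp2,
    scale_mlp2,
    gate_mlp2,
) = temb_mod_params_img
```

Which layout ends up in the file is usually decided by the project's formatter and its line-length limit.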

gemini-code-assist · 2026-05-07T08:06:29Z

-            x_valid_lens,
-            cap_valid_lens,
-        ) = self.patchify_and_embed(
+        (x, cap_feats, x_size, x_valid_lens, cap_valid_lens,) = self.patchify_and_embed(


The trailing comma in this tuple unpacking is unnecessary. While valid syntax, it's typically used to define a single-element tuple. For multi-element tuples, it's unconventional and can be removed for clarity.

Suggested change

-        (x, cap_feats, x_size, x_valid_lens, cap_valid_lens,) = self.patchify_and_embed(
+        (x, cap_feats, x_size, x_valid_lens, cap_valid_lens) = self.patchify_and_embed(
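A short sketch of where a trailing comma matters and where it does not. The patchify_stub function is a hypothetical placeholder for the real method.

```python
def patchify_stub():
    """Hypothetical stand-in returning three values."""
    return 1, 2, 3


# In a multi-element unpacking target the trailing comma changes nothing.
(a, b, c,) = patchify_stub()
(x, y, z) = patchify_stub()

# For a single element the comma is what makes a tuple.
single = (1,)
not_a_tuple = (1)
```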

…ency-strategy-compat
# Conflicts:
# python/sglang/multimodal_gen/configs/pipeline_configs/base.py
# python/sglang/multimodal_gen/configs/pipeline_configs/model_deployment_config.py
# python/sglang/multimodal_gen/configs/pipeline_configs/mova.py
# python/sglang/multimodal_gen/configs/pipeline_configs/wan.py
# python/sglang/multimodal_gen/runtime/models/dits/qwen_image.py
# python/sglang/multimodal_gen/runtime/server_args.py
# python/sglang/multimodal_gen/test/unit/test_server_args.py


Generalize layerwise offload residency mixin

1aba730

mickqian requested review from BBuf, ping1jing2, yhyang201 and yingluosanqian as code owners May 7, 2026 08:03

github-actions Bot added lora diffusion SGLang Diffusion labels May 7, 2026

gemini-code-assist Bot reviewed May 7, 2026


mickqian added 4 commits May 7, 2026 16:10

Apply residency lint formatting

c7bb890

Enable component-wide layerwise offload setup

955f019

Fix layerwise offload formatting

f480946

Add layerwise offload module group selector

93a121e

mickqian requested a review from wisclmy0611 as a code owner May 7, 2026 09:26

github-actions Bot added the documentation Improvements or additions to documentation label May 7, 2026

mickqian added 8 commits May 7, 2026 17:31

Fix layerwise module selector formatting

04a4b88

Clarify layerwise offload module groups

c17c9d8

Select layerwise offload by component name

d7b90be

Handle DTensor weights in layerwise offload

ba9aeff

Skip module to() for DTensor layerwise offload

36fe728

Disable conflicting offload for layerwise components

71a171e

Keep layer buffers resident under layerwise offload

e66ed84

Preserve layerwise offload alias CLI parsing

fdf0227

mickqian changed the title ~~[diffusion] Generalize layerwise offload residency mixin~~ [diffusion] Generalize layerwise offload residency mixin to all components May 8, 2026

mickqian added 3 commits May 8, 2026 20:31

upd

b64a766

upd

4b8050f

mickqian added the run-ci label May 14, 2026

mickqian added 3 commits May 14, 2026 08:42

style: format flux2 modulation unpack

c97a6fc

upd

075ec0b

Merge remote-tracking branch 'origin/main' into codex/component-resid…

b3d6eb5

…ency-strategy-compat

mickqian added 23 commits May 14, 2026 15:06

upd

19f4125

Fix diffusion consistency CLIP device in CI

adf6a05

lint

e093024

Use tiled RealESRGAN under low GPU memory

efd53db

lint

15e522f

Constrain component layerwise replacement on multi-GPU

936efe2

Preserve component CPU offload with multi-GPU layerwise DiT

b65bb87

Keep encoder CPU offload when multi-GPU DiT is resident

12d6c26

Stage encoder loading for layerwise offload

bd7ce96

Format image encoder startup staging

653e441

upd

d1a2be0

Clarify layerwise offload configuration flow

e463679

Name lazy component layerwise condition

5b3913a

Clarify layerwise offload server arg

0300715

Keep layerwise offload fields internal

dc3d505

Use component selection for layerwise offload

2200582

lint

edcbc1e

Move DiT layerwise selection check to server args

461bc08

upd

f7698fe

upd

9cf1b88

Fix layerwise offload unit regressions

1d78f8a

upd

9476f04

Fix encoder layerwise load kwargs hook

5afe306

mickqian merged commit 416fdbb into sgl-project:main May 16, 2026
124 of 132 checks passed

This was referenced May 25, 2026

fix(ci): enforce legacy docs/ gate in Lint workflow zijiexia/sglang#4

Closed

fix(ci): enforce legacy docs/ gate in Lint workflow #26322

Merged

Shunkangz pushed a commit to Shunkangz/sglang that referenced this pull request May 27, 2026

[diffusion] feat: generalize layerwise offload residency mixin to all…

6567c42

… components (sgl-project#24593)

alphabetc1 pushed a commit to alphabetc1/sglang that referenced this pull request Jun 4, 2026

[diffusion] feat: generalize layerwise offload residency mixin to all…

0e55294

… components (sgl-project#24593)

zijiexia mentioned this pull request Jun 4, 2026

docs: sync legacy docs/-only updates into docs_new (Mintlify) #27308

Merged

zijiexia added a commit to zijiexia/sglang that referenced this pull request Jun 4, 2026

docs_new: port sgl-project#24593 — diffusion layerwise offload compon…

899bc2d

…ents Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mickqian merged 43 commits into sgl-project:main from mickqian:codex/component-residency-strategy-compat
