
[Feat] cache-dit for GLM-Image #1399

Merged
hsliuustc0106 merged 7 commits into vllm-project:main from RuixiangMa:glmforcachedit
Apr 18, 2026

Conversation

@RuixiangMa
Contributor

@RuixiangMa RuixiangMa commented Feb 18, 2026

Purpose

support cache-dit for GLM-Image

Test Plan

Test Result

```shell
curl -X POST http://localhost:8004/v1/images/generations   -H "Content-Type: application/json"   -d '{
    "prompt": "A beautifully designed modern food magazine style dessert recipe illustration, themed around a raspberry mousse cake. The overall layout is clean and bright, divided into four main areas: the top left features a bold black title '\''Raspberry Mousse Cake Recipe Guide'\'', with a soft-lit close-up photo of the finished cake on the right, showcasing a light pink cake adorned with fresh raspberries and mint leaves; the bottom left contains an ingredient list section, titled '\''Ingredients'\'' in a simple font, listing '\''Flour 150g'\'', '\''Eggs 3'\'', '\''Sugar 120g'\'', '\''Raspberry puree 200g'\'', '\''Gelatin sheets 10g'\'', '\''Whipping cream 300ml'\'', and '\''Fresh raspberries'\'', each accompanied by minimalist line icons (like a flour bag, eggs, sugar jar, etc.); the bottom right displays four equally sized step boxes, each containing high-definition macro photos and corresponding instructions, arranged from top to bottom as follows: Step 1 shows a whisk whipping white foam (with the instruction '\''Whip egg whites to stiff peaks'\''), Step 2 shows a red-and-white mixture being folded with a spatula (with the instruction '\''Gently fold in the puree and batter'\''), Step 3 shows pink liquid being poured into a round mold (with the instruction '\''Pour into mold and chill for 4 hours'\''), Step 4 shows the finished cake decorated with raspberries and mint leaves (with the instruction '\''Decorate with raspberries and mint'\''); a light brown information bar runs along the bottom edge, with icons on the left representing '\''Preparation time: 30 minutes'\'', '\''Cooking time: 20 minutes'\'', and '\''Servings: 8'\''. The overall color scheme is dominated by creamy white and light pink, with a subtle paper texture in the background, featuring compact and orderly text and image layout with clear information hierarchy.",
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 50,
    "guidance_scale": 1.5,
    "seed": 42
  }' | jq -r '.data[0].b64_json' | base64 -d > recipe.png
```
| Metric | No cache-dit | cache-dit |
| --- | --- | --- |
| Image | recipe | recipecachedit |
| Time | 90 s/img | 54 s/img |

@RuixiangMa RuixiangMa marked this pull request as draft February 18, 2026 12:26
Collaborator

@lishunyang12 lishunyang12 left a comment

Nice work — the ~40% speedup is solid and output quality looks well preserved.

@RuixiangMa RuixiangMa marked this pull request as ready for review February 24, 2026 07:53
@hsliuustc0106
Collaborator

@vllm-omni-reviewer

@github-actions

🤖 VLLM-Omni PR Review

Code Review

1. Overview

This PR introduces support for cache-dit (Caching Diffusion Transformers) for the GlmImagePipeline. It adds a new function enable_cache_for_glm_image which configures the caching mechanism specifically for the GLM-Image model architecture and registers it in the custom enablers dictionary.

The changes are focused and follow the established patterns in the codebase. The test results demonstrate a significant performance improvement (approximately 40% speedup: 90s/img -> 54s/img) with visually consistent output quality.
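The quoted numbers can be sanity-checked directly; the timings are taken from the test-result table above:

```python
# Verify the reported ~40% speedup from the PR's test results.
baseline = 90.0  # seconds per image, no cache-dit
cached = 54.0    # seconds per image, with cache-dit

speedup_fraction = (baseline - cached) / baseline
assert abs(speedup_fraction - 0.40) < 1e-12  # a 40% reduction in latency
```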

Assessment: Positive. The implementation is clean, well-documented, and consistent with existing code patterns.

2. Code Quality

  • Code Style: The code adheres to the existing style guide (docstrings, logging, variable naming conventions).
  • Consistency: The implementation mirrors the structure of other enabler functions (e.g., enable_cache_for_sd3, enable_cache_for_bagel), ensuring consistency across the codebase.
  • Comments: The inline comment regarding patch_functor and ForwardPattern is excellent. It explains why a specific configuration was chosen relative to the standard diffusers implementation, which is crucial for future maintenance.
  • Potential Bugs: No obvious bugs detected. The logic handles configuration building, TaylorSeer calibration (optional), and context refreshing correctly.

3. Architecture & Design

  • Integration: The PR correctly utilizes the plugin-style architecture by updating the CUSTOM_DIT_ENABLERS dictionary. This decouples the specific model logic from the core pipeline execution logic.
  • Design Patterns: Uses the Factory pattern (via the dictionary registration) and closures (returning refresh_cache_context).
  • Adapter Configuration:
    • ForwardPattern.Pattern_0: Correctly identified as necessary because the vLLM-Omni implementation returns (hidden_states, encoder_hidden_states).
    • has_separate_cfg=True: This suggests the pipeline handles Classifier-Free Guidance in a specific manner. This flag is critical for cache correctness; assuming this matches the GlmImagePipeline implementation in vLLM-Omni, this is correct.
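The enabler-plus-closure shape described above can be sketched as follows. This is a hedged illustration only: the real function calls into cache-dit with `ForwardPattern.Pattern_0`, `has_separate_cfg=True`, and a `DBCacheConfig`, all of which are stubbed out here so the registration/closure structure is visible in isolation.

```python
# Hypothetical sketch of the enabler pattern the review describes.
# The cache setup is a stand-in, not the actual cache-dit API.

def enable_cache_for_glm_image(pipeline, cache_config):
    """Configure block caching for GLM-Image and return a refresh closure."""
    # Stand-in for building the cache adapter (real code would pass
    # ForwardPattern.Pattern_0 and has_separate_cfg=True to cache-dit).
    context = {"step": 0}

    def refresh_cache_context():
        # Reset per-request cache state between generations, so one
        # request's cached residuals never leak into the next.
        context["step"] = 0

    return refresh_cache_context
```

Returning a closure (rather than mutating the pipeline) keeps the refresh logic decoupled from the pipeline class, which matches the Factory/closure design the review points out.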

4. Security & Safety

  • Resource Management: Caching mechanisms inherently trade memory for computation speed. This implementation relies on DBCacheConfig to manage these resources. There are no obvious memory leaks introduced in the Python logic.
  • Input Validation: The function relies on cache_config being a valid DiffusionCacheConfig object, which is consistent with other enablers.

5. Testing & Documentation

  • Test Plan: The PR provides a clear curl command for reproducibility.
  • Results: The comparison table effectively demonstrates the value of the feature (speed increase) and validates that image quality is preserved.
  • Documentation: The docstrings are clear and explain the arguments and the returned closure.

6. Specific Suggestions

vllm_omni/diffusion/cache/cache_dit_backend.py:899 (ForwardPattern)
The comment states: "We use ForwardPattern.Pattern_0 because our block returns (hidden_states, encoder_hidden_states)".

  • Suggestion: Ensure this pattern strictly matches the signature of pipeline.transformer.transformer_blocks[i].forward. If the vLLM-Omni implementation changes in the future to match standard diffusers (returning only hidden_states), this would need updating. The current implementation seems correct based on your comment.

vllm_omni/diffusion/cache/cache_dit_backend.py:901 (has_separate_cfg)

  • Suggestion: Double-check that GlmImagePipeline in vLLM-Omni actually runs Conditional and Unconditional passes separately (or if this flag handles the specific CFG caching logic required). If the pipeline fuses CFG (batching cond/uncond), this flag might need to be False. Given the performance gains, it appears configured correctly, but this is a high-risk parameter if wrong.

vllm_omni/diffusion/cache/cache_dit_backend.py:936 (Registration)

  • Nitpick: The key "GlmImagePipeline" must exactly match the class name of the pipeline instance passed to the backend. This looks correct.
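To make the nitpick concrete: if dispatch is keyed on the pipeline's class name, a typo in the registry key silently disables caching rather than raising an error. The registry and lookup below are simplified stand-ins, not the actual vLLM-Omni code:

```python
# Simplified stand-in for the CUSTOM_DIT_ENABLERS registry in
# cache_dit_backend.py; the enabler here is a dummy closure factory.
CUSTOM_DIT_ENABLERS = {
    "GlmImagePipeline": lambda pipeline, cfg: (lambda: None),
}

class GlmImagePipeline:
    """Stand-in for the real pipeline class."""

def get_enabler(pipeline):
    # Lookup keyed on the exact class name: a mismatched key makes
    # .get() return None and caching is silently skipped.
    return CUSTOM_DIT_ENABLERS.get(type(pipeline).__name__)
```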

7. Approval Status

LGTM with suggestions

The implementation is solid, follows project conventions, and provides significant performance benefits. The suggestions above are primarily verification points regarding the specific ForwardPattern and CFG handling, which rely on the internal details of the GlmImage implementation in vLLM-Omni.

Action: If the internal pipeline signature matches the assumptions in the code (Pattern_0 and separate CFG), this is ready to merge.


This review was generated automatically by the VLLM-Omni PR Reviewer Bot using glm-5.

@hsliuustc0106
Collaborator

Hi @RuixiangMa 👋

This PR hasn't been updated for 16 days. We're tracking stale PRs for the next release. Could you share the current status? Is there anything blocking progress?

Thanks!

@RuixiangMa
Contributor Author

> Hi @RuixiangMa 👋
>
> This PR hasn't been updated for 16 days. We're tracking stale PRs for the next release. Could you share the current status? Is there anything blocking progress?

No, it works well for me; awaiting review and merge.

@SamitHuang
Collaborator

@RuixiangMa can you fix the conflicts at first? we can merge it and then upgrade to cache-dit 1.3.0 in #1858

Signed-off-by: Lancer <maruixiang6688@gmail.com>
@RuixiangMa
Contributor Author

> @RuixiangMa can you fix the conflicts at first? we can merge it and then upgrade to cache-dit 1.3.0 in #1858

fixed

Collaborator

@SamitHuang SamitHuang left a comment

LGTM

@SamitHuang SamitHuang enabled auto-merge (squash) March 13, 2026 10:22
@Gaohan123 Gaohan123 added this to the v0.18.0 milestone Mar 17, 2026
@Gaohan123 Gaohan123 modified the milestones: v0.18.0, v0.20.0 Apr 14, 2026
@hsliuustc0106
Collaborator

fix precommits please, we expect this to be merged asap

Signed-off-by: Lancer <maruixiang6688@gmail.com>
auto-merge was automatically disabled April 17, 2026 04:34

Head branch was pushed to by a user without write access

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Apr 17, 2026
@hsliuustc0106 hsliuustc0106 merged commit 9cf1fe7 into vllm-project:main Apr 18, 2026
8 checks passed
lvliang-intel pushed a commit to lvliang-intel/vllm-omni that referenced this pull request Apr 20, 2026
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Co-authored-by: Samit <285365963@qq.com>
qinganrice pushed a commit to qinganrice/vllm-omni that referenced this pull request Apr 23, 2026
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Co-authored-by: Samit <285365963@qq.com>

Labels

ready label to trigger buildkite CI

5 participants