
[Feat] add GLM-Image SP support#1983

Merged
hsliuustc0106 merged 11 commits into vllm-project:main from RuixiangMa:glmspkv
Apr 16, 2026

Conversation

@RuixiangMa
Contributor

Purpose

Test Plan

Test Result

| Configuration | Ulysses Degree | Ring Degree | Generation Time (s) | Speedup |
|---|---|---|---|---|
| Baseline | 1 | 1 | 47.285 | 1.00x |
| Ulysses | 4 | 1 | 23.512 | 2.01x |
| Ring | 1 | 4 | 23.512 | 2.01x |
| Hybrid (Ulysses + Ring) | 2 | 2 | 21.106 | 2.24x |
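For reference, the Ulysses and Ring degrees in the table compose multiplicatively into the sequence-parallel world size. A minimal sketch of that relationship (function names here are hypothetical, not the PR's actual config API):

```python
# Illustrative sketch (names are hypothetical, not the PR's API): hybrid
# sequence parallelism composes a Ulysses group with a Ring group, so the
# total SP world size is the product of the two degrees.

def sp_world_size(ulysses_degree: int, ring_degree: int) -> int:
    """Total number of ranks participating in sequence parallelism."""
    return ulysses_degree * ring_degree

def validate_sp_config(world_size: int, ulysses_degree: int, ring_degree: int) -> None:
    """Check that the SP group tiles the available ranks evenly."""
    sp = sp_world_size(ulysses_degree, ring_degree)
    if world_size % sp != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"ulysses_degree*ring_degree={sp}"
        )

# The four configurations benchmarked above all fit on 4 ranks:
for u, r in [(1, 1), (4, 1), (1, 4), (2, 2)]:
    validate_sp_config(world_size=4, ulysses_degree=u, ring_degree=r)
```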

This PR is linked to another TP-related PR #1918

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.



@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc8af82f52


Comment thread (Outdated): vllm_omni/diffusion/models/glm_image/glm_image_transformer.py
@RuixiangMa RuixiangMa changed the title add GLM-Image SP support [Feat] add GLM-Image SP support Mar 18, 2026
@Gaohan123
Collaborator

@wtomin @SamitHuang @ZJY0516 PTAL

Excerpt from the SP unit test under review:

```python
from vllm_omni.diffusion.data import DiffusionParallelConfig


def test_glm_image_sp_plan_defined():
```

@lishunyang12 (Collaborator) left a comment


Left a couple comments:

  1. to_out not applied to text states in SP path — In the non-SP path, to_out is applied to the combined [text, image] tensor before splitting. In the SP path, the joint output is split first and to_out is only applied to hidden_states_out (image). The text stream (encoder_hidden_states_out) skips the learned linear projection entirely. This will produce different results from non-SP. (glm_image_transformer.py, SP attention forward)

  2. Side-effect attributes in GlmImagePrepare.forward(): post_patch_height/post_patch_width are stored as module attributes during forward, but don't exist until after the first call. Returning them alongside the sharded tensors would be cleaner.
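The ordering fix suggested in comment 1 can be sketched as follows. This is illustrative only: `to_out` here is a stand-in matrix multiply (numpy in place of torch), not the module's actual learned projection.

```python
import numpy as np

# Illustrative sketch of comment 1's fix (not the PR's actual code): apply the
# output projection to the joint [text, image] tensor BEFORE splitting, so the
# text stream also passes through to_out, matching the non-SP path.

def to_out(x, w):
    # Stand-in for the learned linear output projection.
    return x @ w

def sp_attention_output(joint_out, w, text_len):
    # Fixed ordering: project the joint tensor first, then split into streams.
    projected = to_out(joint_out, w)
    encoder_hidden_states_out = projected[:, :text_len]  # text stream
    hidden_states_out = projected[:, text_len:]          # image stream
    return encoder_hidden_states_out, hidden_states_out

joint = np.ones((1, 6, 4))   # [batch, text+image tokens, dim]
w = np.eye(4) * 2.0          # toy projection weight
text, image = sp_attention_output(joint, w, text_len=2)
# Both streams now carry the projection (every element is 2.0).
```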

Comment thread: tests/diffusion/models/glm_image/test_glm_image_sp.py
RuixiangMa and others added 8 commits March 27, 2026 09:56
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
@RuixiangMa
Contributor Author

Left a couple comments:

  1. to_out not applied to text states in SP path — In the non-SP path, to_out is applied to the combined [text, image] tensor before splitting. In the SP path, the joint output is split first and to_out is only applied to hidden_states_out (image). The text stream (encoder_hidden_states_out) skips the learned linear projection entirely. This will produce different results from non-SP. (glm_image_transformer.py, SP attention forward)
  2. Side-effect attributes in GlmImagePrepare.forward(): post_patch_height/post_patch_width are stored as module attributes during forward, but don't exist until after the first call. Returning them alongside the sharded tensors would be cleaner.

In the SP path we now apply self.to_out on the joint [text, image] output before splitting, matching non-SP behavior, so the text states also receive the projection.

GlmImagePrepare.forward() now returns post_patch_height/post_patch_width explicitly instead of storing forward-time side-effect attributes on the module.

Before:

```python
post_patch_height = height // self.patch_size
post_patch_width = width // self.patch_size
```

After:

```python
post_patch_height = torch.tensor(height // self.patch_size, device=hidden_states.device, dtype=torch.int64)
post_patch_width = torch.tensor(width // self.patch_size, device=hidden_states.device, dtype=torch.int64)
```

Any benefits from turning them to tensors?


Keeping them as tensors is currently necessary for the SP path: prepare() is used with _sp_plan and split_output=True, and the current SP output hook only handles a tensor or a tuple/list of tensors. Converting post_patch_height / post_patch_width to ints makes the output a mixed tuple, which causes the hook to skip output sharding entirely and breaks the SP path.
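Why a mixed tuple skips sharding can be sketched with a toy stand-in for the hook (numpy in place of torch; the function names here are hypothetical, not the actual hook's):

```python
import numpy as np

def shard(t, rank, world):
    # Keep only this rank's slice along the sequence (first) axis.
    return np.array_split(t, world, axis=0)[rank]

def sp_output_hook(output, rank=0, world=2):
    # Toy stand-in for the SP output hook described above: it shards a lone
    # tensor, or a tuple/list whose elements are ALL tensors; anything else,
    # such as a mixed (tensor, int, int) tuple, falls through unsharded.
    if isinstance(output, np.ndarray):
        return shard(output, rank, world)
    if isinstance(output, (tuple, list)) and all(
        isinstance(o, np.ndarray) for o in output
    ):
        return type(output)(shard(o, rank, world) for o in output)
    return output  # mixed output: sharding silently skipped

all_tensors = (np.ones((4, 3)), np.ones((4, 3)))
sharded = sp_output_hook(all_tensors)   # each element becomes shape (2, 3)

mixed = (np.ones((4, 3)), 64, 64)       # ints for post-patch height/width
unsharded = sp_output_hook(mixed)       # returned as-is: SP path broken
```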

# Conflicts:
#	vllm_omni/diffusion/models/glm_image/glm_image_transformer.py
@wtomin

wtomin commented Apr 8, 2026

Missing e2e test for GLM SP support. Please add a test that covers --ulysses-degree 2 or --ring-degree 2. For the L4 test, please refer to #1832. In nightly diffusion tests, we currently use H100 machines with 2 cards.

Currently, #2167 wrote a test script for GLM-Image. Please wait until it's merged.

@wtomin
Collaborator

wtomin commented Apr 14, 2026

@RuixiangMa I heard that #2167 has some conflicts and won't be merged very soon. I think the current unit test will be sufficient for now.

LGTM.

@wtomin wtomin added the ready label to trigger buildkite CI label Apr 14, 2026
@wtomin
Collaborator

wtomin commented Apr 14, 2026

tests/diffusion/models/glm_image/test_glm_image_sp.py failed in the CI https://buildkite.com/vllm/vllm-omni/builds/6621/steps/canvas, please take a look.

Signed-off-by: Lancer <maruixiang6688@gmail.com>
@RuixiangMa
Contributor Author

tests/diffusion/models/glm_image/test_glm_image_sp.py failed in the CI https://buildkite.com/vllm/vllm-omni/builds/6621/steps/canvas, please take a look.

fixed

@scyiwei1986

I used this SP PR on Ascend; the image is divided into 3 parts, each part containing the image I want, like this.
Uploading 47F52AE6-F9FD-40CB-94B2-6785B0A666AA.png…

@RuixiangMa
Contributor Author

RuixiangMa commented Apr 15, 2026

@scyiwei1986 it seems the image upload failed. Could you please open an issue and include the test configuration?

I don't have an Ascend device for testing; I just ran a quick test and it works correctly on NVIDIA GPUs.

@scyiwei1986

I created an issue here: #2814.

@hsliuustc0106
Collaborator

Can you help check whether AR (tp=2) is compatible with DiT (sp=2)?

@RuixiangMa
Contributor Author

Can you help check whether AR (tp=2) is compatible with DiT (sp=2)?

It is OK. Actually, that's how I tested it: AR with TP enabled.

@hsliuustc0106 hsliuustc0106 merged commit b43c6c6 into vllm-project:main Apr 16, 2026
8 checks passed
@wtomin
Collaborator

wtomin commented Apr 16, 2026

@RuixiangMa Hi, Lancer, can we talk on WeChat? I left you a message in your gmail. Pls check it.

lvliang-intel pushed a commit to lvliang-intel/vllm-omni that referenced this pull request Apr 20, 2026
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
Co-authored-by: Didan Deng <33117903+wtomin@users.noreply.github.com>