[Feat] add GLM-Image SP support #1983
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fc8af82f52
@wtomin @SamitHuang @ZJY0516 PTAL
`from vllm_omni.diffusion.data import DiffusionParallelConfig`

`def test_glm_image_sp_plan_defined():`
Please add a test marker; you can refer to: https://github.com/vllm-project/vllm-omni/blob/main/docs/contributing/ci/tests_markers.md
lishunyang12 left a comment:

Left a couple of comments:
1. `to_out` not applied to text states in the SP path: in the non-SP path, `to_out` is applied to the combined `[text, image]` tensor before splitting. In the SP path, the joint output is split first and `to_out` is only applied to `hidden_states_out` (image). The text stream (`encoder_hidden_states_out`) skips the learned linear projection entirely, so this will produce different results from the non-SP path. (`glm_image_transformer.py`, SP attention forward)
2. Side-effect attributes in `GlmImagePrepare.forward()`: `post_patch_height`/`post_patch_width` are stored as module attributes during forward, but don't exist until after the first call. Returning them alongside the sharded tensors would be cleaner.
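A minimal sketch of the first issue (hypothetical shapes and names, not the actual GlmImage code): a per-token linear projection applied to the joint tensor before splitting matches projecting the image slice afterwards, but the text slice silently misses the projection if it is applied only after the split.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only.
text_len, img_len, dim = 4, 16, 8
to_out = nn.Linear(dim, dim)
joint = torch.randn(1, text_len + img_len, dim)  # joint [text, image] attention output

# Non-SP ordering: project the joint tensor, then split.
projected = to_out(joint)
text_ok, image_ok = projected.split([text_len, img_len], dim=1)

# Buggy SP ordering: split first, project only the image stream.
text_bad, image_part = joint.split([text_len, img_len], dim=1)
image_bad = to_out(image_part)

# Image streams agree (linear is per-token), text stream misses the projection.
print(torch.allclose(image_ok, image_bad, atol=1e-6))  # True
print(torch.allclose(text_ok, text_bad))               # False (almost surely)
```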
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
In the SP path we now apply `self.to_out` on the joint `[text, image]` output before splitting, matching the non-SP behavior, so text states also receive the projection. `GlmImagePrepare.forward()` now returns `post_patch_height`/`post_patch_width` explicitly instead of storing forward-time side-effect attributes on the module.
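As a sketch of the second fix (hypothetical module and argument names, not the actual `GlmImagePrepare` implementation), returning the patch-grid sizes from `forward()` keeps the module free of forward-time side-effect attributes:

```python
import torch
import torch.nn as nn

class PrepareSketch(nn.Module):
    """Hypothetical stand-in for GlmImagePrepare, illustrating the pattern."""

    def __init__(self, patch_size: int = 2):
        super().__init__()
        self.patch_size = patch_size

    def forward(self, hidden_states: torch.Tensor, height: int, width: int):
        post_patch_height = torch.tensor(
            height // self.patch_size, device=hidden_states.device, dtype=torch.int64
        )
        post_patch_width = torch.tensor(
            width // self.patch_size, device=hidden_states.device, dtype=torch.int64
        )
        # Return the sizes alongside the tensor instead of setting self.post_patch_*.
        return hidden_states, post_patch_height, post_patch_width

m = PrepareSketch(patch_size=2)
x = torch.randn(1, 16, 8)
_, h, w = m(x, height=8, width=6)
print(int(h), int(w))  # 4 3
```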
```diff
-post_patch_height = height // self.patch_size
-post_patch_width = width // self.patch_size
+post_patch_height = torch.tensor(height // self.patch_size, device=hidden_states.device, dtype=torch.int64)
+post_patch_width = torch.tensor(width // self.patch_size, device=hidden_states.device, dtype=torch.int64)
```
Any benefits from turning them into tensors?
Keeping them as tensors is currently necessary for the SP path: `prepare()` is used with `_sp_plan` and `split_output=True`, and the current SP output hook only handles a tensor or a tuple/list of tensors. Converting `post_patch_height`/`post_patch_width` to ints makes the output a mixed tuple, which causes the hook to skip output sharding entirely and breaks the SP path.
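A minimal sketch of the constraint described above, with a hypothetical hook (the real vllm-omni hook may differ in detail): an all-tensor output gets its sequence dimension sharded, while a mixed tuple is passed through untouched, silently skipping SP sharding.

```python
import torch

def sp_split_output_hook(output, rank: int, world_size: int):
    """Hypothetical SP output hook: shard tensors, skip anything else."""

    def shard(t: torch.Tensor) -> torch.Tensor:
        # Scalar tensors (e.g. post_patch_height) have no sequence dim to split.
        return t.chunk(world_size, dim=1)[rank] if t.dim() >= 2 else t

    if isinstance(output, torch.Tensor):
        return shard(output)
    if isinstance(output, (tuple, list)) and all(
        isinstance(o, torch.Tensor) for o in output
    ):
        return type(output)(shard(o) for o in output)
    return output  # mixed tuple (tensor + plain int): sharding silently skipped

hidden = torch.randn(1, 8, 4)
h = torch.tensor(2)  # keeping post_patch_height a tensor keeps the tuple all-tensor

sharded = sp_split_output_hook((hidden, h), rank=0, world_size=2)
skipped = sp_split_output_hook((hidden, 2), rank=0, world_size=2)

print(sharded[0].shape)  # torch.Size([1, 4, 4]) -- sequence dim sharded
print(skipped[0].shape)  # torch.Size([1, 8, 4]) -- untouched, SP broken
```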
```
# Conflicts:
#	vllm_omni/diffusion/models/glm_image/glm_image_transformer.py
```
@RuixiangMa I heard that #2167 has some conflicts and won't be merged very soon. I think the current unit test will be sufficient for now. LGTM.
Fixed.
@scyiwei1986 It seems the image upload failed. Could you please open an issue and include the test configuration? I don't have an Ascend device for testing; I just ran a quick test and it works correctly on NVIDIA GPUs.
#2814. I created an issue here.
Can you help check whether AR (tp=2) is compatible with DiT (sp=2)?
It is OK. Actually, that's how I tested it: AR with TP enabled.
@RuixiangMa Hi Lancer, can we talk on WeChat? I left you a message in your Gmail. Please check it.
Signed-off-by: Lancer <maruixiang6688@gmail.com> Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com> Co-authored-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
Purpose
Test Plan
Test Result
This PR is linked to another TP-related PR #1918
Essential Elements of an Effective PR Description Checklist
- `supported_models.md` and `examples` for a new model. Please run `mkdocs serve` to sync the documentation editions to `./docs`.