[Bugfix] Add MTP for opanpangu_pro_moe model, fix an initialization bug in StaticSinkAttention by yt0428 · Pull Request #32508 · vllm-project/vllm

yt0428 · 2026-01-17T08:17:09Z

Purpose

This PR adds MTP support for the opanpangu_pro_moe model (#28775) and also fixes an initialization bug in StaticSinkAttention.

For MTP support, the major change is a shallow copy of spec_decode_common_attn_metadata in gpu_model_runner.py. The block_table_tensor of common_metadata may be modified while the StaticSinkAttention metadata is being built. Because spec_decode_common_attn_metadata is a direct reference to common_metadata, that modification also changes what the speculative decoding path sees. Taking a shallow copy avoids this.
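The aliasing problem described above can be sketched in a few lines. This is only an illustration, not the code in gpu_model_runner.py: `CommonMetadata` and `build_with_sink` are hypothetical stand-ins, plain lists replace tensors, and the sketch assumes the builder reassigns the `block_table_tensor` field rather than mutating it in place (a shallow copy would not protect against in-place mutation).

```python
from copy import copy
from dataclasses import dataclass


@dataclass
class CommonMetadata:
    # Hypothetical stand-in for the real metadata object; only one field is modeled.
    block_table_tensor: list


def build_with_sink(metadata: CommonMetadata, sink_blocks: list) -> None:
    # Hypothetical builder step: rebinds the field to a new, extended table.
    metadata.block_table_tensor = sink_blocks + metadata.block_table_tensor


# Without a copy, both names refer to the same object.
cm = CommonMetadata(block_table_tensor=[10, 11])
spec_alias = cm
build_with_sink(cm, sink_blocks=[0])
alias_table = spec_alias.block_table_tensor  # also sees the sink block

# With a shallow copy taken before the builder runs.
cm2 = CommonMetadata(block_table_tensor=[10, 11])
spec_copy = copy(cm2)
build_with_sink(cm2, sink_blocks=[0])
copy_table = spec_copy.block_table_tensor  # still refers to the original table
```

A shallow copy is enough in this sketch because only the attribute binding changes; the underlying table object is still shared between the two metadata objects.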

As for the initialization bug in StaticSinkAttention, we move the CustomOp init to the very beginning of the initialization, so that it no longer resets member variables that were set before it ran.
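The effect of the init ordering can be illustrated with a small, generic example. The classes below are hypothetical and do not mirror the real StaticSinkAttention or CustomOp code; the sketch only assumes that the base initializer resets some state on the instance, which is the kind of "refresh" the description refers to.

```python
class Base:
    # Hypothetical stand-in for a base class whose __init__ resets instance state.
    def __init__(self) -> None:
        self.registry = {}


class LateInit(Base):
    def __init__(self) -> None:
        self.registry = {"sink": 4}  # state is set first...
        Base.__init__(self)          # ...and then wiped by the base init


class EarlyInit(Base):
    def __init__(self) -> None:
        Base.__init__(self)          # base init runs first
        self.registry["sink"] = 4    # state set afterwards survives


late = LateInit()
early = EarlyInit()
```

Here `late.registry` ends up empty while `early.registry` keeps its entry, which is why running the base initializer first is the safer order.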

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

gemini-code-assist

Code Review

This pull request introduces support for the opanpangu_pro_moe model and addresses an initialization bug in StaticSinkAttention. The changes include adding the new model type to speculative configuration and model architecture convertors, as well as refining the weight loading process for the openpangu_mtp model. A critical fix involves ensuring a shallow copy of attention metadata to prevent unintended side effects during speculative decoding initialization.

vllm/model_executor/layers/attention/static_sink_attention.py

vllm/v1/worker/gpu_model_runner.py

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

vllm/config/speculative.py

yt0428 · 2026-02-10T06:19:19Z

@DarkLight1337 @WoosukKwon @youkaichao @robertgshaw2-redhat @mgoin @tlrmchlsmth @houseroad @hmellor @yewentao256 @ProExpertProg
Hello, could you please review this small PR? Many thanks!

mergify · 2026-02-12T07:36:54Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @yt0428.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2026-03-03T16:17:17Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @yt0428.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

LucasWilkinson · 2026-03-04T05:55:34Z

vllm/v1/worker/gpu_model_runner.py

+                        spec_decode_common_attn_metadata = copy(cm)
                else:
-                    spec_decode_common_attn_metadata = cm
+                    spec_decode_common_attn_metadata = copy(cm)


Can we move this copy into StaticSinkAttentionBuilder?

yt0428 · Mar 25, 2026

Unfortunately, the builder is not responsible for building spec_decode_common_attn_metadata; that is handled outside the builder, by gpu_model_runner.

Add MTP for opanpangu_pro_moe model, fix an initialization bug in StaticSinkAttention

c692771

Signed-off-by: yuantao <2422264527@qq.com>

yt0428 requested review from ProExpertProg, WoosukKwon, hmellor, houseroad, mgoin, robertgshaw2-redhat, tlrmchlsmth, yewentao256 and youkaichao as code owners January 17, 2026 08:17

mergify bot added the v1 and bug labels Jan 17, 2026

gemini-code-assist bot reviewed Jan 17, 2026

vllm/model_executor/layers/attention/static_sink_attention.py

vllm/v1/worker/gpu_model_runner.py

cursor bot reviewed Jan 17, 2026

vllm/config/speculative.py

yt0428 changed the title ~~Add MTP for opanpangu_pro_moe model, fix an initialization bug in StaticSinkAttention~~ [Bugfix] Add MTP for opanpangu_pro_moe model, fix an initialization bug in StaticSinkAttention Feb 12, 2026

mergify bot added the needs-rebase label Feb 12, 2026

Merge branch 'main' into Add_MTP_for_openpangu_and_bugfix

4a046f8

Signed-off-by: yt0428 <51468697+yt0428@users.noreply.github.com>

yt0428 requested review from LucasWilkinson and MatthewBonanni as code owners February 26, 2026 09:17

mergify bot removed the needs-rebase label Feb 26, 2026

mergify bot added the needs-rebase label Mar 3, 2026

LucasWilkinson reviewed Mar 4, 2026

yt0428 wants to merge 2 commits into vllm-project:main from yt0428:Add_MTP_for_openpangu_and_bugfix

yt0428 Mar 25, 2026
