
[Model] Add shared_head to prefix of SharedHead.#27193

Closed
whx-sjtu wants to merge 1 commit into vllm-project:main from whx-sjtu:shared_head_prefix

Conversation

@whx-sjtu
Contributor

@whx-sjtu whx-sjtu commented Oct 20, 2025

Purpose

Add shared_head to the prefix of SharedHead. The prefix is used in quantization scenarios to determine the quantization type of a given layer.
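To illustrate why the prefix matters: quantization configs typically decide per-layer behavior by matching a layer's dotted module name against configured patterns, so a missing `shared_head` path component breaks the lookup. The helper and layer names below are simplified stand-ins, not vLLM's actual code:

```python
# Sketch (assumed names): a quant config skips layers whose dotted
# prefix matches an ignore rule, either exactly or as a parent path.
def is_layer_skipped(prefix: str, ignored_layers: list[str]) -> bool:
    """Return True if the layer named by `prefix` should stay unquantized."""
    return any(
        prefix == name or prefix.startswith(name + ".")
        for name in ignored_layers
    )

ignored = ["model.layers.61.shared_head.head"]

# Without the "shared_head" component the name does not match the rule:
print(is_layer_skipped("model.layers.61.head", ignored))              # False
# With the prefix fixed, the names line up and the rule applies:
print(is_layer_skipped("model.layers.61.shared_head.head", ignored))  # True
```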

Test Plan

No extra test needed.

Test Result

All CI should pass.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: whx-sjtu <2952154980@qq.com>
@whx-sjtu whx-sjtu requested a review from luccafong as a code owner October 20, 2025 12:31
@mergify bot added the deepseek (Related to DeepSeek models) label on Oct 20, 2025
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request correctly adds the shared_head prefix when instantiating SharedHead in both deepseek_mtp.py and glm4_moe_mtp.py. This change is necessary for quantization scenarios to correctly determine the quantization type for the layer. The implementation is straightforward and correct. However, I've identified a code duplication issue with the SharedHead class itself, which is defined identically in both modified files. I've left a comment with a suggestion to refactor this for better maintainability.

@@ -75,7 +75,9 @@ def __init__(self, vllm_config: VllmConfig, prefix: str) -> None:
topk_indices_buffer = None

self.shared_head = SharedHead(
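The one-line change at this call site can be sketched as follows; the class body and argument names are simplified stand-ins for the real deepseek_mtp.py code, kept only to show the effect on the registered module name:

```python
# Minimal runnable sketch of the fix (SharedHead here is a stub).
class SharedHead:
    def __init__(self, prefix: str = "") -> None:
        # A quantization config would later look this name up.
        self.prefix = prefix

parent_prefix = "model.layers.61"  # hypothetical parent module name

# Before: SharedHead received no prefix, so its sub-layers were
# registered without the "shared_head" path component.
old = SharedHead()

# After: the module name is appended to the parent prefix.
new = SharedHead(prefix=f"{parent_prefix}.shared_head")

print(new.prefix)  # → model.layers.61.shared_head
```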
Contributor


high

While the change to add a prefix is correct, I've noticed that the SharedHead class is duplicated. An identical implementation exists in vllm/model_executor/models/glm4_moe_mtp.py. To adhere to the DRY (Don't Repeat Yourself) principle and improve long-term maintainability, this class should be defined in a single, shared location, such as vllm/model_executor/models/utils.py, and then imported where needed. This would prevent future inconsistencies and simplify changes like the one in this PR, as it would only need to be applied once.
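The suggested refactor would look roughly like this; the module path follows the comment's suggestion, and the class body is a trivial stand-in rather than the real implementation:

```python
# Hypothetical single source of truth, e.g. in
# vllm/model_executor/models/utils.py:
class SharedHead:
    def __init__(self, prefix: str = "") -> None:
        self.prefix = prefix

# deepseek_mtp.py and glm4_moe_mtp.py would then both import it:
#   from vllm.model_executor.models.utils import SharedHead
# so a prefix change like this PR's would only be applied once.
head = SharedHead(prefix="model.layers.61.shared_head")
print(head.prefix)  # → model.layers.61.shared_head
```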

@github-actions

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

@github-actions bot added the stale (Over 90 days of inactivity) label on Jan 19, 2026
@github-actions

This pull request has been automatically closed due to inactivity. Please feel free to reopen if you intend to continue working on it. Thank you!

@github-actions bot closed this on Feb 18, 2026
