[Model] Add shared_head to prefix of SharedHead. #27193
whx-sjtu wants to merge 1 commit into vllm-project:main
Signed-off-by: whx-sjtu <2952154980@qq.com>
Code Review
This pull request correctly adds the shared_head prefix when instantiating SharedHead in both deepseek_mtp.py and glm4_moe_mtp.py. This change is necessary for quantization scenarios to correctly determine the quantization type for the layer. The implementation is straightforward and correct. However, I've identified a code duplication issue with the SharedHead class itself, which is defined identically in both modified files. I've left a comment with a suggestion to refactor this for better maintainability.
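To illustrate why the prefix matters for quantization, here is a minimal sketch of a name-based quantization lookup. It is not vLLM's implementation: `join_prefix`, `ToyQuantConfig`, the override table, and the layer name `model.layers.61` are all hypothetical and exist only to show how a missing `shared_head` component in a layer name can change the quantization type that gets selected.

```python
from dataclasses import dataclass, field


def join_prefix(prefix: str, name: str) -> str:
    """Hypothetical helper: build a dotted module name such as 'a.b.name'."""
    return f"{prefix}.{name}" if prefix else name


@dataclass
class ToyQuantConfig:
    """Hypothetical config that picks a quant type from a layer's dotted name."""

    default: str = "int8"
    overrides: dict = field(default_factory=dict)

    def quant_type_for(self, layer_name: str) -> str:
        # An override applies when its key is one component of the dotted name.
        components = layer_name.split(".")
        for key, quant_type in self.overrides.items():
            if key in components:
                return quant_type
        return self.default


# Assumption: the checkpoint leaves everything under 'shared_head' unquantized.
config = ToyQuantConfig(overrides={"shared_head": "unquantized"})
layer_prefix = "model.layers.61"  # hypothetical layer name

# Name built without the 'shared_head' component: the override is missed.
name_without = join_prefix(layer_prefix, "head")
# Name built with the 'shared_head' component: the override matches.
name_with = join_prefix(join_prefix(layer_prefix, "shared_head"), "head")

type_without = config.quant_type_for(name_without)
type_with = config.quant_type_for(name_with)
print(name_without, type_without)
print(name_with, type_with)
```

In this toy setup the only difference between the two lookups is the prefix passed down to the submodule, which is the kind of mismatch the PR description refers to.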
@@ -75,7 +75,9 @@ def __init__(self, vllm_config: VllmConfig, prefix: str) -> None:
topk_indices_buffer = None

self.shared_head = SharedHead(
While the change to add a prefix is correct, I've noticed that the SharedHead class is duplicated. An identical implementation exists in vllm/model_executor/models/glm4_moe_mtp.py. To adhere to the DRY (Don't Repeat Yourself) principle and improve long-term maintainability, this class should be defined in a single, shared location, such as vllm/model_executor/models/utils.py, and then imported where needed. This would prevent future inconsistencies and simplify changes like the one in this PR, as it would only need to be applied once.
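The refactor suggested here could look roughly like the sketch below. The classes are toy stand-ins, not the real vLLM modules: `ToyDeepSeekMTPLayer` and `ToyGlm4MoeMTPLayer` are hypothetical names, and the toy `SharedHead` only records names instead of building a norm and LM head. In an actual refactor the single definition would live in one shared module and be imported by both model files; everything is kept in one file here so the example runs on its own.

```python
class SharedHead:
    """Toy stand-in for the shared class, defined exactly once."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        # The inner head derives its name from the prefix it was given.
        self.head_name = f"{prefix}.head"


class ToyDeepSeekMTPLayer:
    """Hypothetical consumer standing in for the layer in deepseek_mtp.py."""

    def __init__(self, prefix: str) -> None:
        self.shared_head = SharedHead(prefix=f"{prefix}.shared_head")


class ToyGlm4MoeMTPLayer:
    """Hypothetical consumer standing in for the layer in glm4_moe_mtp.py."""

    def __init__(self, prefix: str) -> None:
        self.shared_head = SharedHead(prefix=f"{prefix}.shared_head")


deepseek_layer = ToyDeepSeekMTPLayer(prefix="model.layers.0")
glm_layer = ToyGlm4MoeMTPLayer(prefix="model.layers.0")
print(deepseek_layer.shared_head.head_name)
print(glm_layer.shared_head.head_name)
```

With one definition, a change to how `SharedHead` handles its prefix is made in a single place and both consumers pick it up.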
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
This pull request has been automatically closed due to inactivity. Please feel free to reopen if you intend to continue working on it. Thank you!
Purpose
Add shared_head to the prefix of SharedHead. The prefix is used in quantization scenarios to determine the quantization type of a given layer.
Test Plan
No extra test needed.
Test Result
All CI checks should pass.
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.