Merged
10 changes: 8 additions & 2 deletions vllm/model_executor/models/bailing_moe_linear.py
@@ -205,13 +205,19 @@ def __init__(
self.q_a_layernorm = None
self.q_b_proj = None

-rope_parameters = _build_rope_parameters(config)
+rope_parameters = _build_rope_parameters(config) or {}
+# MLA rotates the full qk_rope_head_dim,
+# partial_rotary_factor is for the linear-attn head only.
+rope_parameters = {
+    k: v for k, v in rope_parameters.items() if k != "partial_rotary_factor"
+}
+rope_parameters["rope_dim"] = self.qk_rope_head_dim
Comment on lines +211 to +214
high

The key rope_dim is not standard in vLLM's get_rope utility; the expected key to control the number of rotated dimensions is rotary_dim. While using rope_dim might appear to work because rotary_dim defaults to head_size (which is set to self.qk_rope_head_dim here), it would fail to override an existing rotary_dim if one were present in the model configuration. Additionally, the logic for filtering partial_rotary_factor can be simplified using .pop() for better readability.

        rope_parameters = _build_rope_parameters(config) or {}
        rope_parameters.pop("partial_rotary_factor", None)
        rope_parameters["rotary_dim"] = self.qk_rope_head_dim
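
For comparison, here is a minimal standalone sketch of the two filtering styles discussed above: the dict comprehension from the diff and the suggested `.pop()`. The helper names and the sample values are hypothetical and exist only for illustration; neither is part of vLLM. The `.pop()` variant copies the dict first, because `.pop()` mutates in place and this diff does not show whether `_build_rope_parameters` returns a fresh dict.

```python
def strip_with_comprehension(params):
    """Drop partial_rotary_factor by rebuilding the dict, as in the diff."""
    params = params or {}
    return {k: v for k, v in params.items() if k != "partial_rotary_factor"}


def strip_with_pop(params):
    """Drop partial_rotary_factor with pop(); copy first so the input is untouched."""
    params = dict(params or {})
    params.pop("partial_rotary_factor", None)  # no error if the key is absent
    return params


# Hypothetical sample values, not taken from any real model config.
sample = {"rope_type": "default", "rope_theta": 10000.0, "partial_rotary_factor": 0.5}

# Both styles produce the same filtered dict and tolerate a None input.
assert strip_with_comprehension(sample) == strip_with_pop(sample)
assert strip_with_comprehension(None) == strip_with_pop(None) == {}

# Neither helper modified the caller's dict.
assert "partial_rotary_factor" in sample
```

If the dict returned by the builder is always freshly constructed, the copy is unnecessary and the bare `.pop()` from the suggestion is equivalent.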

ZJY0516 marked this conversation as resolved.
max_position = getattr(config, "max_position_embeddings", 8192)
self.rotary_emb = get_rope(
head_size=self.qk_rope_head_dim,
max_position=max_position,
is_neox_style=False,
-rope_parameters=rope_parameters or None,
+rope_parameters=rope_parameters,
)

# Build MLAModules for MultiHeadLatentAttentionWrapper