[Bugfix] LoRA: extend expert base_layer loading to Qwen3.5 and Step3.x #37114
HollowMan6 wants to merge 1 commit into vllm-project:main
Conversation
Code Review
This pull request introduces a fix for LoRA support in Qwen3.5 models, specifically addressing issues with GDN layers and enabling base_layer for experts. The changes involve routing layers with multiple output slices to a more flexible LoRA implementation and adjusting how dummy LoRA weights are created to handle mismatches between logical and physical layer structures. The approach is sound and effectively resolves the described issues. I have one suggestion to improve the maintainability of the new logic for creating dummy LoRA weights.
Hi @HollowMan6, the pre-commit checks have failed. Please run:

```shell
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
```

Then, commit the changes and push to your branch.
Pull request overview
This PR fixes LoRA initialization for Qwen3.5 GDN merged projections where the underlying layer has 3+ physical output slices (e.g., q/k/v/z) but checkpoints expose fewer logical packed LoRA modules (e.g., in_proj_qkv, in_proj_z). It also updates multiple MoE model weight loaders to correctly handle expert weights when LoRA wraps experts behind a base_layer.
Changes:
- Route `MergedColumnParallelLinear` layers with 3+ internal slices to the variable-slice LoRA implementation and expand grouped packed LoRA lists to per-slice tensors during `set_lora`.
- Fix dummy LoRA creation for packed modules by deriving output dims from grouped `output_sizes` (so dummy adapters match the merged layer's physical slice layout).
- Update Step3/Step3.5/Qwen3(.5)/Qwen3-VL MoE weight loading mappings to support expert `base_layer` parameter names.
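The dummy-LoRA sizing change above — deriving grouped output dims so that fewer logical packed modules (e.g. `in_proj_qkv`, `in_proj_z`) cover more physical slices (q/k/v/z) — can be sketched roughly as follows. The function name, signature, and the per-module slice counts are illustrative assumptions, not vLLM's actual API; the real code derives the grouping from the layer's packed-module mapping.

```python
def grouped_output_dims(
    output_sizes: list[int], slices_per_module: list[int]
) -> list[int]:
    """Sum contiguous runs of physical slice widths so each logical
    packed LoRA module gets one combined output dimension for its
    dummy adapter weights."""
    assert sum(slices_per_module) == len(output_sizes)
    dims: list[int] = []
    start = 0
    for n in slices_per_module:
        dims.append(sum(output_sizes[start : start + n]))
        start += n
    return dims


# Hypothetical GDN in_proj layout: 4 physical slices (q, k, v, z) but only
# 2 logical packed LoRA modules (in_proj_qkv spans 3 slices, in_proj_z spans 1).
print(grouped_output_dims([2048, 512, 512, 128], [3, 1]))  # [3072, 128]
```

With dims shaped this way, the dummy adapter's B matrices line up with the merged layer's physical slice layout instead of assuming one slice per logical module.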
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| `vllm/model_executor/models/step3p5_mtp.py` | Adjust expert weight mapping to optionally include `.base_layer.` in parameter names. |
| `vllm/model_executor/models/step3p5.py` | Same expert base_layer-aware mapping for Step3.5 main model weight loading. |
| `vllm/model_executor/models/step3_text.py` | Same expert base_layer-aware mapping for Step3 text model weight loading. |
| `vllm/model_executor/models/qwen3_vl_moe.py` | Make fused expert mapping support `experts.base_layer.*` parameter naming. |
| `vllm/model_executor/models/qwen3_5_mtp.py` | Make fused expert mapping support `experts.base_layer.*` parameter naming. |
| `vllm/model_executor/models/qwen3_5.py` | Make fused expert mapping support `experts.base_layer.*` parameter naming. |
| `vllm/lora/model_manager.py` | Build dummy packed LoRA weights using grouped output dims derived from `base_layer.output_sizes`. |
| `vllm/lora/layers/column_parallel_linear.py` | Prefer variable-slice LoRA for 3+ slice merged layers and expand grouped packed LoRA lists into per-slice tensors in `set_lora`. |
Force-pushed from ae1714b to 9f6c229.
We are working on fixing Qwen3.5 + LoRA. In order not to waste your valuable time, please stop further development on this PR. Thank you.
```python
(".moe.experts.w2_weight", ".moe.down_proj.weight", "w2"),
(f".moe.experts.{base_layer}w13_weight", ".moe.gate_proj.weight", "w1"),
(f".moe.experts.{base_layer}w13_weight", ".moe.up_proj.weight", "w3"),
(f".moe.experts.{base_layer}w2_weight", ".moe.down_proj.weight", "w2"),
```
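The `{base_layer}` placeholder in these remapping tuples is presumably filled in once per loader by checking whether any local parameter name carries the LoRA wrapper namespace. A minimal sketch of that detection follows; the helper name and the sample parameter names are assumptions for illustration, not vLLM's exact code.

```python
def expert_prefix(param_names) -> str:
    # When LoRA wraps the experts module, its weights appear as
    # "...experts.base_layer.w13_weight"; plain models omit the namespace.
    return "base_layer." if any(".base_layer." in n for n in param_names) else ""


# Hypothetical parameter set from a LoRA-wrapped model:
names = ["model.layers.0.moe.experts.base_layer.w13_weight"]
base_layer = expert_prefix(names)
mapping = [
    (f".moe.experts.{base_layer}w13_weight", ".moe.gate_proj.weight", "w1"),
    (f".moe.experts.{base_layer}w13_weight", ".moe.up_proj.weight", "w3"),
    (f".moe.experts.{base_layer}w2_weight", ".moe.down_proj.weight", "w2"),
]
print(mapping[2][0])  # .moe.experts.base_layer.w2_weight
```

For a regular checkpoint with no `.base_layer.` parameters the prefix is the empty string, so the remapping tuples reduce to the original hardcoded names and the non-LoRA path is unchanged.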
It looks like these changes are unrelated. If needed, please submit a separate PR. Thank you.
Thank you @jeejeelee for letting me know about #36976 and #37019! I just removed all the Qwen3.5 LoRA fixes from this PR and focused on the changes you mentioned here; those are extensions to #31104. Please let me know what you think, thanks!
This PR extends vllm-project#31104 to the remaining model-specific MoE loaders that still hardcode expert parameter names without `.base_layer` during weight loading. vllm-project#31104 fixed the shared LoRA expert-loading path, but these loaders still build their own expert remapping tables:

- `Qwen3.5`
- `Qwen3.5 MTP`
- `Qwen3-VL MoE`
- `Step3 Text`
- `Step3.5`
- `Step3.5 MTP`

Changes:

- Detect whether the local parameter set contains `.base_layer.` expert parameters.
- Conditionally insert `base_layer.` into the expert remapping entries for the affected loaders.
- Keep the non-LoRA path unchanged when `base_layer` is absent.

This preserves existing checkpoint-loading behavior for regular models while allowing LoRA-wrapped expert weights to resolve correctly.

Signed-off-by: Hollow Man <hollowman@opensuse.org>
Force-pushed from 9f6c229 to 7d1a8d1.
@codex review
Pull request overview
Extends the model-specific MoE weight-loading remapping logic to correctly resolve LoRA-wrapped expert weights that include an intermediate .base_layer. namespace (following the earlier shared-path fix in #31104), while preserving existing behavior for non-LoRA checkpoints.
Changes:
- Detect presence of `.base_layer.` parameters at load time and conditionally prefix expert parameter names with `base_layer.` during remapping.
- Apply the conditional remapping to the remaining loaders that hardcoded expert parameter names for Qwen3.5 / Qwen3-VL MoE and Step3.x variants.
- Keep the non-LoRA loading path unchanged when `.base_layer.` is not present.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| `vllm/model_executor/models/step3p5_mtp.py` | Adds conditional `base_layer.` prefix to Step3.5 MTP expert remapping entries. |
| `vllm/model_executor/models/step3p5.py` | Adds conditional `base_layer.` prefix to Step3.5 expert remapping entries (packed 3D expert weights). |
| `vllm/model_executor/models/step3_text.py` | Adds conditional `base_layer.` prefix to Step3 Text expert remapping entries. |
| `vllm/model_executor/models/qwen3_vl_moe.py` | Adds conditional `base_layer.` prefix for Qwen3-VL MoE fused expert remapping entries. |
| `vllm/model_executor/models/qwen3_5_mtp.py` | Adds conditional `base_layer.` prefix for Qwen3.5 MTP fused expert remapping entries. |
| `vllm/model_executor/models/qwen3_5.py` | Adds conditional `base_layer.` prefix for Qwen3.5 fused expert remapping entries. |
Code Review
This pull request extends LoRA expert weight loading to several Qwen and Step models, which is a necessary bugfix. The implementation correctly detects the presence of .base_layer. in parameter names and dynamically adjusts the expert weight mappings. The changes are consistent and well-targeted across all affected model files. The approach is clean and effectively resolves the issue for LoRA-wrapped expert weights.
Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub: Your team has set up Codex to review pull requests in this repo. If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Purpose
This PR extends #31104 to the remaining model-specific MoE loaders that still hardcode expert parameter names without `.base_layer` during weight loading. #31104 fixed the shared LoRA expert-loading path, but these loaders still build their own expert remapping tables:

- `Qwen3.5`
- `Qwen3.5 MTP`
- `Qwen3-VL MoE`
- `Step3 Text`
- `Step3.5`
- `Step3.5 MTP`

Changes:

- Detect whether the local parameter set contains `.base_layer.` expert parameters.
- Conditionally insert `base_layer.` into the expert remapping entries for the affected loaders.
- Keep the non-LoRA path unchanged when `base_layer` is absent.

This preserves existing checkpoint-loading behavior for regular models while allowing LoRA-wrapped expert weights to resolve correctly.
Test Plan
End to end tests together with verl-project/verl#5599
Test Result
Weight sync succeeded.
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.