
[Bugfix] LoRA: extend expert base_layer loading to Qwen3.5 and Step3.x #37114

Open
HollowMan6 wants to merge 1 commit into vllm-project:main from HollowMan6:lora_qwen3.5

Conversation

Contributor

@HollowMan6 HollowMan6 commented Mar 15, 2026

Purpose

This PR extends #31104 to the remaining
model-specific MoE loaders that still hardcode expert
parameter names without .base_layer during weight loading.

#31104 fixed the shared LoRA expert-loading path, but these loaders
still build their own expert remapping tables:

  • Qwen3.5

  • Qwen3.5 MTP

  • Qwen3-VL MoE

  • Step3 Text

  • Step3.5

  • Step3.5 MTP

This PR makes the following changes:

  • Detect whether the local parameter set contains .base_layer. expert parameters.

  • Conditionally insert base_layer. into the expert remapping entries for the affected loaders.

  • Keep the non-LoRA path unchanged when base_layer is absent.

This preserves existing checkpoint-loading behavior for regular models while allowing LoRA-wrapped expert weights to resolve correctly.
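The detect-and-prefix logic described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's actual code: the function name `build_expert_mapping` and the `params_dict` argument are assumptions modeled on common vLLM weight-loader conventions, and only the mapping entries shown in the review diff below are reproduced here.

```python
# Hypothetical sketch of the conditional base_layer remapping; names are
# illustrative, not the exact identifiers used in the vLLM loaders.

def build_expert_mapping(params_dict: dict) -> list[tuple[str, str, str]]:
    # When LoRA wraps the fused-MoE experts, their parameters appear under
    # an extra ".base_layer." namespace; detect that from the local params.
    has_base_layer = any(".base_layer." in name for name in params_dict)
    base_layer = "base_layer." if has_base_layer else ""
    # Conditionally insert the prefix into each expert remapping entry,
    # keeping the non-LoRA path byte-identical when the prefix is empty.
    return [
        (f".moe.experts.{base_layer}w13_weight", ".moe.gate_proj.weight", "w1"),
        (f".moe.experts.{base_layer}w13_weight", ".moe.up_proj.weight", "w3"),
        (f".moe.experts.{base_layer}w2_weight", ".moe.down_proj.weight", "w2"),
    ]
```

With a LoRA-wrapped checkpoint the entries resolve to `.moe.experts.base_layer.w13_weight` etc.; with a plain checkpoint they stay `.moe.experts.w13_weight`, which is why the non-LoRA path is unchanged.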

Test Plan

End-to-end tests together with verl-project/verl#5599.

Test Result

Weights sync successfully.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Copilot AI review requested due to automatic review settings March 15, 2026 17:45
@mergify mergify bot added qwen Related to Qwen models bug Something isn't working labels Mar 15, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a fix for LoRA support in Qwen3.5 models, specifically addressing issues with GDN layers and enabling base_layer for experts. The changes involve routing layers with multiple output slices to a more flexible LoRA implementation and adjusting how dummy LoRA weights are created to handle mismatches between logical and physical layer structures. The approach is sound and effectively resolves the described issues. I have one suggestion to improve the maintainability of the new logic for creating dummy LoRA weights.

@mergify

mergify bot commented Mar 15, 2026

Hi @HollowMan6, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

Contributor

Copilot AI left a comment


Pull request overview

This PR fixes LoRA initialization for Qwen3.5 GDN merged projections where the underlying layer has 3+ physical output slices (e.g., q/k/v/z) but checkpoints expose fewer logical packed LoRA modules (e.g., in_proj_qkv, in_proj_z). It also updates multiple MoE model weight loaders to correctly handle expert weights when LoRA wraps experts behind a base_layer.

Changes:

  • Route MergedColumnParallelLinear layers with 3+ internal slices to the variable-slice LoRA implementation and expand grouped packed LoRA lists to per-slice tensors during set_lora.
  • Fix dummy LoRA creation for packed modules by deriving output dims from grouped output_sizes (so dummy adapters match the merged layer’s physical slice layout).
  • Update Step3/Step3.5/Qwen3(.5)/Qwen3-VL MoE weight loading mappings to support expert base_layer parameter names.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
vllm/model_executor/models/step3p5_mtp.py Adjust expert weight mapping to optionally include .base_layer. in parameter names.
vllm/model_executor/models/step3p5.py Same expert base_layer-aware mapping for Step3.5 main model weight loading.
vllm/model_executor/models/step3_text.py Same expert base_layer-aware mapping for Step3 text model weight loading.
vllm/model_executor/models/qwen3_vl_moe.py Make fused expert mapping support experts.base_layer.* parameter naming.
vllm/model_executor/models/qwen3_5_mtp.py Make fused expert mapping support experts.base_layer.* parameter naming.
vllm/model_executor/models/qwen3_5.py Make fused expert mapping support experts.base_layer.* parameter naming.
vllm/lora/model_manager.py Build dummy packed LoRA weights using grouped output dims derived from base_layer.output_sizes.
vllm/lora/layers/column_parallel_linear.py Prefer variable-slice LoRA for 3+ slice merged layers and expand grouped packed LoRA lists into per-slice tensors in set_lora.


@jeejeelee
Collaborator

jeejeelee commented Mar 16, 2026

Please see: #36976 and #37019

@jeejeelee
Collaborator

We are working on fixing Qwen35+LoRA. In order not to waste your valuable time, please stop further development on this PR. Thank you.

(".moe.experts.w2_weight", ".moe.down_proj.weight", "w2"),
(f".moe.experts.{base_layer}w13_weight", ".moe.gate_proj.weight", "w1"),
(f".moe.experts.{base_layer}w13_weight", ".moe.up_proj.weight", "w3"),
(f".moe.experts.{base_layer}w2_weight", ".moe.down_proj.weight", "w2"),
Collaborator


It looks like these changes are unrelated. If needed, please submit a separate PR. Thank you.

Contributor Author


Thank you @jeejeelee for letting me know about #36976 and #37019! I just removed all the Qwen3.5 LoRA fixes from this PR and focused on the changes you mentioned here; those are extensions to #31104. Please let me know what you think, thanks!

This PR extends vllm-project#31104 to the remaining
model-specific MoE loaders that still hardcode expert
parameter names without `.base_layer` during weight loading.

vllm-project#31104 fixed the shared LoRA expert-loading path, but these loaders
still build their own expert remapping tables:

- `Qwen3.5`
- `Qwen3.5 MTP`
- `Qwen3-VL MoE`
- `Step3 Text`
- `Step3.5`
- `Step3.5 MTP`

- Detect whether the local parameter set contains `.base_layer.` expert parameters.
- Conditionally insert `base_layer.` into the expert remapping entries for the affected loaders.
- Keep the non-LoRA path unchanged when `base_layer` is absent.

This preserves existing checkpoint-loading behavior for regular models while allowing LoRA-wrapped expert weights to resolve correctly.

Signed-off-by: Hollow Man <hollowman@opensuse.org>
@HollowMan6 HollowMan6 changed the title [Bugfix] LoRA support for Qwen3.5 (GDN layers) [Bugfix] LoRA: extend expert base_layer loading to Qwen3.5 and Step3.x Mar 16, 2026
@HollowMan6 HollowMan6 requested review from Copilot and jeejeelee March 16, 2026 13:03
@HollowMan6
Contributor Author

@codex review

Contributor

Copilot AI left a comment


Pull request overview

Extends the model-specific MoE weight-loading remapping logic to correctly resolve LoRA-wrapped expert weights that include an intermediate .base_layer. namespace (following the earlier shared-path fix in #31104), while preserving existing behavior for non-LoRA checkpoints.

Changes:

  • Detect presence of .base_layer. parameters at load time and conditionally prefix expert parameter names with base_layer. during remapping.
  • Apply the conditional remapping to the remaining loaders that hardcoded expert parameter names for Qwen3.5 / Qwen3-VL MoE and Step3.x variants.
  • Keep the non-LoRA loading path unchanged when .base_layer. is not present.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
vllm/model_executor/models/step3p5_mtp.py Adds conditional base_layer. prefix to Step3.5 MTP expert remapping entries.
vllm/model_executor/models/step3p5.py Adds conditional base_layer. prefix to Step3.5 expert remapping entries (packed 3D expert weights).
vllm/model_executor/models/step3_text.py Adds conditional base_layer. prefix to Step3 Text expert remapping entries.
vllm/model_executor/models/qwen3_vl_moe.py Adds conditional base_layer. prefix for Qwen3-VL MoE fused expert remapping entries.
vllm/model_executor/models/qwen3_5_mtp.py Adds conditional base_layer. prefix for Qwen3.5 MTP fused expert remapping entries.
vllm/model_executor/models/qwen3_5.py Adds conditional base_layer. prefix for Qwen3.5 fused expert remapping entries.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request extends LoRA expert weight loading to several Qwen and Step models, which is a necessary bugfix. The implementation correctly detects the presence of .base_layer. in parameter names and dynamically adjusts the expert weight mappings. The changes are consistent and well-targeted across all affected model files. The approach is clean and effectively resolves the issue for LoRA-wrapped expert weights.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

