Add support for Qwen3.5 MoE #1109

Merged
Mecoli1219 merged 7 commits into linkedin:main from michaelroyzen:add-qwen3_5_moe
Mar 2, 2026

Conversation


@michaelroyzen michaelroyzen commented Feb 26, 2026

Add Qwen3.5 MoE support to Liger Kernel

Summary

  • Adds Liger Kernel optimizations for the Qwen3.5 MoE model family (qwen3_5_moe / qwen3_5_moe_text), targeting Transformers v5+
  • Qwen3.5 MoE combines Qwen3 Next's hybrid GDN/attention architecture with Sparse MoE (shared + routed experts), so the implementation mirrors Qwen3 Next's Liger integration: Gemma-style RMSNorm (LigerRMSNormForQwen3Next), fused SwiGLU experts (LigerExperts), and fused linear cross-entropy loss
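For context, "Gemma-style" RMSNorm (the behavior LigerRMSNormForQwen3Next reproduces) scales the normalized activations by (1 + weight) rather than by weight alone. A minimal pure-Python sketch of the arithmetic, illustrative only and not the fused Triton kernel:

```python
import math

def gemma_rmsnorm(x, weight, eps=1e-6):
    """Toy reference for Gemma-style RMSNorm: note the (1 + w) offset scaling."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * (1.0 + w) for v, w in zip(x, weight)]
```

With zero-initialized weights the (1 + w) form reduces to plain RMS normalization, which is why this variant needs a different kernel than the standard weight-only RMSNorm.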

Changes

New file:

  • src/liger_kernel/transformers/model/qwen3_5_moe.py — lce_forward for Qwen3_5MoeForCausalLM, based on the Qwen3 Next version with the load_balancing_loss_func import updated to point to Qwen3.5 MoE's local definition
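The fused linear cross-entropy path avoids materializing the full [num_tokens, vocab_size] logits tensor by folding the lm_head projection into the loss computation chunk by chunk. A toy pure-Python sketch of the idea (function and argument names are illustrative, not the actual lce_forward signature):

```python
import math

def cross_entropy_from_hidden(hidden, lm_head, targets, chunk_size=2):
    """Toy fused linear cross-entropy: hidden is a list of token vectors,
    lm_head is a list of vocab-row weight vectors, targets are class ids.
    Logits are computed per token and discarded, never stored in bulk."""
    total = 0.0
    for start in range(0, len(hidden), chunk_size):
        for h, t in zip(hidden[start:start + chunk_size],
                        targets[start:start + chunk_size]):
            # Project one token to logits, then take log-softmax at the target.
            logits = [sum(hi * wi for hi, wi in zip(h, w)) for w in lm_head]
            m = max(logits)
            log_z = m + math.log(sum(math.exp(l - m) for l in logits))
            total += log_z - logits[t]
    return total / len(hidden)
```

The real kernel does the same reduction on GPU tiles, which is where the memory savings at large vocab sizes come from.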

Modified files:

  • src/liger_kernel/transformers/monkey_patch.py — apply_liger_kernel_to_qwen3_5_moe function (RMSNorm, SwiGLU experts, fused LCE; RoPE disabled) with instance patching for norm layers, shared expert, and routed experts; registered as qwen3_5_moe and qwen3_5_moe_text in MODEL_TYPE_TO_APPLY_LIGER_FN
  • src/liger_kernel/transformers/__init__.py — Export apply_liger_kernel_to_qwen3_5_moe in TYPE_CHECKING, __getattr__, and __all__
  • test/utils.py — revert_liger_kernel_to_qwen3_5_moe for test cleanup
  • test/convergence/fp32/test_mini_models.py — Availability check, imports, and MiniModelConfig entry for mini_qwen3_5_moe
  • test/transformers/test_monkey_patch.py — is_qwen3_5_moe_available helper and test_apply_liger_kernel_to_instance_for_qwen3_5_moe verifying all patches are applied correctly
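The model-type registration described above follows a plain dispatch-table pattern. A self-contained sketch, where the stub patch function is a placeholder rather than the real implementation and only the dict name mirrors the PR:

```python
# Placeholder patch function; the real one swaps in Liger's RMSNorm,
# SwiGLU experts, and fused linear cross-entropy on the model instance.
def apply_liger_kernel_to_qwen3_5_moe(**kwargs):
    return f"patched qwen3_5_moe with {kwargs}"

# Both the composite and text-only model types dispatch to the same patch,
# as in the PR's MODEL_TYPE_TO_APPLY_LIGER_FN registration.
MODEL_TYPE_TO_APPLY_LIGER_FN = {
    "qwen3_5_moe": apply_liger_kernel_to_qwen3_5_moe,
    "qwen3_5_moe_text": apply_liger_kernel_to_qwen3_5_moe,
}

def apply_liger_kernel(model_type, **kwargs):
    """Look up and invoke the patch function for a given model_type."""
    if model_type not in MODEL_TYPE_TO_APPLY_LIGER_FN:
        raise ValueError(f"No Liger patch registered for {model_type!r}")
    return MODEL_TYPE_TO_APPLY_LIGER_FN[model_type](**kwargs)
```

Registering both model-type strings is what lets the kernel apply whether the config reports the composite or the text-only architecture.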

Test plan

  • test_apply_liger_kernel_to_instance_for_qwen3_5_moe passes (monkey patch instance patching)
  • mini_qwen3_5_moe convergence test passes (fp32 mini model)
  • Existing Qwen3 Next and Qwen3 MoE tests still pass (no regressions)

@michaelroyzen michaelroyzen mentioned this pull request Feb 26, 2026
@michaelroyzen
Contributor Author

@shimizust @Tcc0403


michaelroyzen commented Feb 26, 2026

Convergence test passes
(screenshot: convergence test results, Feb 26, 2026)

Collaborator

@Tcc0403 Tcc0403 left a comment


@Mecoli1219 can you take a look?


michaelroyzen commented Feb 27, 2026

(screenshots: Qwen3-Next test results, Feb 27, 2026)

Confirming Qwen3-Next still passes

@michaelroyzen
Contributor Author

Are we ready to merge @Tcc0403 @Mecoli1219?

Collaborator

@Mecoli1219 Mecoli1219 left a comment


@michaelroyzen This looks great! Thanks for the contribution. Could you please rebase with the main branch and run make checkstyle to ensure the formatting is consistent? Let's get this merged once the build is green!

@michaelroyzen
Contributor Author

Thanks, just rebased and ran make checkstyle @Mecoli1219

@Mecoli1219 Mecoli1219 added this pull request to the merge queue Mar 2, 2026
Merged via the queue into linkedin:main with commit 9983acb Mar 2, 2026
5 of 7 checks passed
@vvvdwbvvv vvvdwbvvv mentioned this pull request Mar 7, 2026
github-merge-queue bot pushed a commit that referenced this pull request Mar 12, 2026
## Summary

This PR fixes the `lce_forward` function for the qwen3_5_moe model, adding
support for the optional `mm_token_type_ids` parameter used in multimodal
processing.

Follow-up to:
- #1120
- #1109

This fixes a ValueError in `model.generate()` with transformers > 5.2.0,
after they merged:
- huggingface/transformers#43972

See related issue downstream in TRL:
- huggingface/trl#5216
- huggingface/trl#5201
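A hypothetical minimal repro of the failure mode: newer transformers versions forward `mm_token_type_ids` from `generate()` into the model forward, so a patched forward that does not declare the parameter fails on the unexpected keyword (in plain Python as a TypeError; transformers' own kwarg validation surfaces it as a ValueError). Accepting it as an optional keyword resolves the call. Sketch only; the real signature lives in src/liger_kernel/transformers/model/qwen3_5_moe.py:

```python
# Before: forward without the multimodal kwarg rejects the call.
def lce_forward_old(input_ids):
    return len(input_ids)

# After: the kwarg is declared and simply defaults to None, so callers
# that pass it (e.g. generate() on transformers > 5.2.0) no longer fail.
def lce_forward_fixed(input_ids, mm_token_type_ids=None):
    return len(input_ids)
```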


## Testing Done

- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence
