[MODEL] Adding Support for Qwen3.5 Models by JJJYmmm · Pull Request #34110 · vllm-project/vllm

JJJYmmm · 2026-02-09T03:16:23Z

Purpose

This PR adds model support for the upcoming Qwen3.5 models, including both dense and MoE variants.

🫡 Many thanks to @wulipc and @sighingnow for model verification and review, and to @ywang96 and @Isotr0py from the vLLM team for their collaboration!

Reference HF implementation - huggingface/transformers#43830

Co-authored-by: wulipc <wulipc@users.noreply.github.com>
Co-authored-by: ywang96 <ywang96@users.noreply.github.com>
Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>

mergify · 2026-02-09T03:20:18Z

Hi @JJJYmmm, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Signed-off-by: JJJYmmm <1650675829@qq.com>

gemini-code-assist

Code Review

This pull request adds support for the Qwen3.5 series of models, including dense, MoE, and MTP variants. The changes are comprehensive, touching model configuration, definitions, and registries. The implementation largely follows existing patterns for adding new models. However, I've identified a significant code duplication issue in the __init__ method for the multimodal MoE model (Qwen3_5MoeForConditionalGeneration), which could lead to maintenance problems and bugs. I've suggested a refactoring to address this.

I am having trouble creating individual review comments, so my feedback follows.

vllm/model_executor/models/qwen3_5.py (966-1000)

The __init__ method of Qwen3_5MoeForConditionalGeneration is almost a complete copy of Qwen3_5ForConditionalGeneration.__init__. This code duplication is problematic for maintainability, as changes in the base class __init__ will not be reflected here, potentially leading to bugs.

Since Qwen3_5MoeForConditionalGeneration inherits from Qwen3_5ForConditionalGeneration, it should ideally call super().__init__(). However, this is not possible because the base class __init__ hardcodes the language model to the non-MoE version.

To resolve this, consider refactoring to reduce code duplication. One approach is to introduce a shared private _init method or a common base class that accepts the language model class as a parameter. This would make the relationship between the dense and MoE versions clearer and more robust.
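The refactoring suggested above could look roughly like the following minimal sketch. It is not the code from this PR: every class name here (`VisionTower`, `DenseLanguageModel`, `MoeLanguageModel`, `DenseForConditionalGeneration`, `MoeForConditionalGeneration`) and the `_init_common` helper are hypothetical stand-ins, and plain Python objects replace the real vLLM modules so the pattern can be run in isolation.

```python
class VisionTower:
    """Hypothetical stand-in for the shared vision encoder."""

    def __init__(self, config: dict) -> None:
        self.hidden_size = config["hidden_size"]


class DenseLanguageModel:
    """Hypothetical stand-in for the dense (non-MoE) language model."""

    def __init__(self, config: dict) -> None:
        self.hidden_size = config["hidden_size"]


class MoeLanguageModel(DenseLanguageModel):
    """Hypothetical stand-in for the MoE language model."""

    def __init__(self, config: dict) -> None:
        super().__init__(config)
        self.num_experts = config["num_experts"]


class DenseForConditionalGeneration:
    def __init__(self, config: dict) -> None:
        self._init_common(config, language_model_cls=DenseLanguageModel)

    def _init_common(self, config: dict, language_model_cls: type) -> None:
        # All setup shared by the dense and MoE variants lives here, so a
        # change is made once and both variants pick it up.
        self.config = config
        self.visual = VisionTower(config)
        self.language_model = language_model_cls(config)


class MoeForConditionalGeneration(DenseForConditionalGeneration):
    def __init__(self, config: dict) -> None:
        # Only the language model class differs from the dense variant.
        self._init_common(config, language_model_cls=MoeLanguageModel)


dense = DenseForConditionalGeneration({"hidden_size": 64})
moe = MoeForConditionalGeneration({"hidden_size": 64, "num_experts": 8})
print(type(dense.language_model).__name__, type(moe.language_model).__name__)
```

Passing the language model class into one shared helper keeps the subclass relationship while avoiding the copied `__init__` body. A class-level attribute naming the language model class would be an equally reasonable variant.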

vllm/model_executor/models/qwen3_next.py

vllm/model_executor/models/registry.py

vllm/model_executor/models/qwen3_next.py

Co-authored-by: Isotr0py <2037008807@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>

Signed-off-by: JJJYmmm <1650675829@qq.com>

mergify · 2026-02-09T04:39:02Z

Documentation preview: https://vllm--34110.org.readthedocs.build/en/34110/

vllm/model_executor/models/qwen3_5.py

ywang96

Thanks for the great contribution! Looking forward to the model release!

DarkLight1337 · 2026-02-09T09:24:07Z

Need to update the test registry.

Signed-off-by: Roger Wang <hey@rogerw.io>

vadiklyutiy · 2026-02-09T13:31:11Z

How can we run it before the official weights are released?

Isotr0py · 2026-02-09T14:04:14Z

> How we can run it before official weights released?

We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline so that users can run the model out of the box as soon as the official weights are released.

vadiklyutiy · 2026-02-09T14:56:05Z

> > How we can run it before official weights released?
>
> We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline, so that user can run the model out-of-box immediately when official weights release.

I meant: how can we run it with dummy weights, for example to check performance?

ywang96 · 2026-02-09T18:44:27Z

> > > How we can run it before official weights released?
> >
> > We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline, so that user can run the model out-of-box immediately when official weights release.
>
> I meant how to run with dummy weights to check performance for example

Since the model definition itself is also part of the release, perhaps you can create a dummy model config yourself and test it with --load-format dummy.

vadiklyutiy · 2026-02-09T20:00:58Z

> > > > How we can run it before official weights released?
> > >
> > > We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline, so that user can run the model out-of-box immediately when official weights release.
> >
> > I meant how to run with dummy weights to check performance for example
>
> Since the model definition itself is also part of the release, perhaps you can create a dummy model config yourself and test it with --load-format dummy

Yeah, what I was actually asking about is the config. I didn't know that the model config will also be released on the model launch date. Thanks for the clarification.
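The dummy-weights suggestion in this exchange could be prepared roughly as in the sketch below. It only writes a `config.json` into a local directory and builds a `vllm serve` command line that uses `--load-format dummy`; it does not launch vLLM. The helper names are hypothetical. The config values are placeholders, not the real Qwen3.5 configuration: the architecture name is borrowed from the class mentioned in the review above, and that it matches the registered architecture string is an assumption. A real run would need a complete config matching the reference HF implementation.

```python
import json
import tempfile
from pathlib import Path


def write_dummy_model_dir(config: dict, root: Path) -> Path:
    """Write config.json into a fresh model directory (hypothetical helper)."""
    model_dir = root / "dummy-model"
    model_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "config.json").write_text(json.dumps(config, indent=2))
    return model_dir


def build_serve_command(model_dir: Path) -> list[str]:
    """Build a `vllm serve` command that skips loading real weights."""
    return ["vllm", "serve", str(model_dir), "--load-format", "dummy"]


# Placeholder values only: NOT the real Qwen3.5 config, and incomplete.
placeholder_config = {
    "architectures": ["Qwen3_5ForConditionalGeneration"],  # assumed name
    "hidden_size": 1024,
    "num_hidden_layers": 2,
}

with tempfile.TemporaryDirectory() as tmp:
    model_dir = write_dummy_model_dir(placeholder_config, Path(tmp))
    command = build_serve_command(model_dir)
    written = json.loads((model_dir / "config.json").read_text())
    print(" ".join(command))
```

With dummy weights the outputs are meaningless, so a setup like this is only useful for checks that do not depend on model quality, such as throughput or memory measurements.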

vllm-project#34110 missing changes in
vllm/transformers_utils/model_arch_config_convertor.py
vllm/v1/spec_decode/eagle.py
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>

Signed-off-by: JJJYmmm <1650675829@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: wulipc <wulipc@users.noreply.github.com>
Co-authored-by: ywang96 <ywang96@users.noreply.github.com>
Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Roger Wang <hey@rogerw.io>

support Qwen3.5 series

9a4f2b9

Co-authored-by: wulipc <wulipc@users.noreply.github.com>
Co-authored-by: ywang96 <ywang96@users.noreply.github.com>
Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>

JJJYmmm requested review from ProExpertProg, WoosukKwon, benchislett, hmellor, houseroad, luccafong, mgoin, robertgshaw2-redhat, sighingnow, tdoublep, tlrmchlsmth, yewentao256 and youkaichao as code owners February 9, 2026 03:16

mergify bot added the new-model, qwen, speculative-decoding, and v1 labels Feb 9, 2026

ywang96 assigned ywang96 and Isotr0py Feb 9, 2026

pre-commit

4ac8f70

Signed-off-by: JJJYmmm <1650675829@qq.com>

gemini-code-assist bot reviewed Feb 9, 2026

Isotr0py reviewed Feb 9, 2026

vllm/model_executor/models/qwen3_next.py (outdated, resolved)

vllm/model_executor/models/registry.py (resolved)

vllm/model_executor/models/qwen3_next.py (outdated, resolved)

JJJYmmm and others added 3 commits February 9, 2026 11:54

Update vllm/model_executor/models/qwen3_next.py

c03cddd

Co-authored-by: Isotr0py <2037008807@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>

Update vllm/model_executor/models/qwen3_next.py

ba4ff3f

Co-authored-by: Isotr0py <2037008807@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>

update model card

dfbae82

Signed-off-by: JJJYmmm <1650675829@qq.com>

JJJYmmm requested review from DarkLight1337 and ywang96 as code owners February 9, 2026 04:38

mergify bot added the documentation label Feb 9, 2026

ICENacl mentioned this pull request Feb 9, 2026

[Feature] Support Qwen3.5 sgl-project/sglang#18465

Open

2 tasks

ywang96 reviewed Feb 9, 2026

vllm/model_executor/models/qwen3_5.py (outdated, resolved)

JJJYmmm and others added 2 commits February 9, 2026 16:27

code clean

213e2e7

Merge branch 'main' into add_qwen35

a6b754d

ywang96 added the ready label Feb 9, 2026

ywang96 approved these changes Feb 9, 2026

Isotr0py approved these changes Feb 9, 2026

ywang96 added 2 commits February 9, 2026 01:39

add to test

c113d46

Signed-off-by: Roger Wang <hey@rogerw.io>

Merge branch 'main' into add_qwen35

0b970b9

DarkLight1337 mentioned this pull request Feb 9, 2026

[Model] Add Qwen3.5 hybrid model support #34131

Closed

5 tasks

mudler mentioned this pull request Feb 9, 2026

Adding Support for Qwen3.5 mudler/LocalAI#8469

Closed

Isotr0py merged commit 9562912 into vllm-project:main Feb 9, 2026
59 checks passed

huangye123 mentioned this pull request Mar 9, 2026

We hope to add Docker support for the vllm0.17 inference backend. gpustack/gpustack#4826

Closed

ChuanLi1101 mentioned this pull request Mar 19, 2026

[MODEL] Cherry-pick: Adding Support for Qwen3.5 Models #37514

Open
