[MODEL] Adding Support for Qwen3.5 Models #34110

Merged
Isotr0py merged 9 commits into vllm-project:main from JJJYmmm:add_qwen35
Feb 9, 2026

@JJJYmmm
Contributor

@JJJYmmm JJJYmmm commented Feb 9, 2026

Purpose

This PR adds model support for the upcoming Qwen3.5 models, including both dense and MoE variants.

🫡 Many thanks to @wulipc and @sighingnow for model verification and review, and to @ywang96 and @Isotr0py from the vLLM team for their collaboration!

Reference HF implementation - huggingface/transformers#43830

Co-authored-by: wulipc <wulipc@users.noreply.github.com>
Co-authored-by: ywang96 <ywang96@users.noreply.github.com>
Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>
@mergify

mergify bot commented Feb 9, 2026

Hi @JJJYmmm, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Signed-off-by: JJJYmmm <1650675829@qq.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for the Qwen3.5 series of models, including dense, MoE, and MTP variants. The changes are comprehensive, touching model configuration, definitions, and registries. The implementation largely follows existing patterns for adding new models. However, I've identified a significant code duplication issue in the __init__ method for the multimodal MoE model (Qwen3_5MoeForConditionalGeneration), which could lead to maintenance problems and bugs. I've suggested a refactoring to address this.

vllm/model_executor/models/qwen3_5.py (966-1000)

high

The __init__ method of Qwen3_5MoeForConditionalGeneration is almost a complete copy of Qwen3_5ForConditionalGeneration.__init__. This code duplication is problematic for maintainability, as changes in the base class __init__ will not be reflected here, potentially leading to bugs.

Since Qwen3_5MoeForConditionalGeneration inherits from Qwen3_5ForConditionalGeneration, it should ideally call super().__init__(). However, this is not possible because the base class __init__ hardcodes the language model to the non-MoE version.

To resolve this, consider refactoring to reduce code duplication. One approach is to introduce a shared private _init method or a common base class that accepts the language model class as a parameter. This would make the relationship between the dense and MoE versions clearer and more robust.
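The refactoring suggested above can be illustrated with a minimal sketch. This is not the vLLM code: every class name below is a hypothetical stand-in, and the sketch uses a class attribute to select the language model class, which is one variant of the "accept the language model class as a parameter" idea.

```python
class DenseLanguageModel:
    """Hypothetical stand-in for the dense (non-MoE) language model."""

    kind = "dense"

    def __init__(self, hidden_size: int) -> None:
        self.hidden_size = hidden_size


class MoeLanguageModel(DenseLanguageModel):
    """Hypothetical stand-in for the MoE language model."""

    kind = "moe"


class ConditionalGenerationSketch:
    """Hypothetical stand-in for the dense multimodal wrapper."""

    # Subclasses override only this attribute instead of copying __init__.
    language_model_cls = DenseLanguageModel

    def __init__(self, hidden_size: int) -> None:
        # Shared setup lives in one place (a string stands in for the vision tower).
        self.visual = f"vision-tower({hidden_size})"
        # The only variant-specific step is which class gets instantiated.
        self.language_model = self.language_model_cls(hidden_size)


class MoeConditionalGenerationSketch(ConditionalGenerationSketch):
    """MoE variant: inherits __init__ unchanged, swaps the language model class."""

    language_model_cls = MoeLanguageModel


dense = ConditionalGenerationSketch(hidden_size=8)
moe = MoeConditionalGenerationSketch(hidden_size=8)
print(dense.language_model.kind, moe.language_model.kind)
```

With this shape, a later change to the shared setup is picked up by both variants, which is the maintenance concern the review raises.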

JJJYmmm and others added 3 commits February 9, 2026 11:54
Co-authored-by: Isotr0py <2037008807@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Signed-off-by: JJJYmmm <1650675829@qq.com>
@mergify

mergify bot commented Feb 9, 2026

Documentation preview: https://vllm--34110.org.readthedocs.build/en/34110/

@mergify mergify bot added the "documentation" label (Improvements or additions to documentation) Feb 9, 2026
@ywang96 ywang96 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Feb 9, 2026
Member

@ywang96 ywang96 left a comment

Thanks for the great contribution! Looking forward to the model release!

@DarkLight1337
Member

We need to update the test registry.

Signed-off-by: Roger Wang <hey@rogerw.io>
@Isotr0py Isotr0py merged commit 9562912 into vllm-project:main Feb 9, 2026
59 checks passed
@vadiklyutiy
Collaborator

How can we run it before the official weights are released?

@Isotr0py
Member

Isotr0py commented Feb 9, 2026

How can we run it before the official weights are released?

We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline so that users can run the model out of the box as soon as the official weights are released.

@vadiklyutiy
Collaborator

How can we run it before the official weights are released?

We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline so that users can run the model out of the box as soon as the official weights are released.

I meant how to run it with dummy weights, for example to check performance.

@ywang96
Member

ywang96 commented Feb 9, 2026

How can we run it before the official weights are released?

We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline so that users can run the model out of the box as soon as the official weights are released.

I meant how to run it with dummy weights, for example to check performance.

Since the model definition itself is also part of the release, perhaps you can create a dummy model config yourself and test it with --load-format dummy.

@vadiklyutiy
Collaborator

How can we run it before the official weights are released?

We have verified the accuracy on preview checkpoints. Let's trigger the nightly wheel building pipeline so that users can run the model out of the box as soon as the official weights are released.

I meant how to run it with dummy weights, for example to check performance.

Since the model definition itself is also part of the release, perhaps you can create a dummy model config yourself and test it with --load-format dummy.

Yeah, what I was actually asking for is a config. I didn't know that the model config will also be released only on the model launch date. Thanks for the clarification.
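The dummy-weights suggestion in this exchange could be sketched roughly as follows. The helper names are hypothetical, and the config is a deliberate placeholder: only the architecture name comes from this PR, and the remaining fields (model type, layer counts, hidden sizes, and so on) are omitted because the thread does not give them. A real run would also need the complete config and tokenizer files in the same directory.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_dummy_config(config: dict, out_dir: str) -> Path:
    """Write a hand-made config.json into a local model directory."""
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


def dummy_serve_command(model_dir: str) -> str:
    """Build a `vllm serve` command that skips loading real weights."""
    return shlex.join(["vllm", "serve", model_dir, "--load-format", "dummy"])


# Placeholder only: the architecture name is assumed to match the class name
# in this PR; all other required fields are unknown here and must be filled in.
placeholder_config = {
    "architectures": ["Qwen3_5MoeForConditionalGeneration"],
}

model_dir = tempfile.mkdtemp(prefix="qwen35-dummy-")
config_path = write_dummy_config(placeholder_config, model_dir)
print(config_path)
print(dummy_serve_command(model_dir))
```

Since dummy weights are randomly initialized, a run like this is only useful for checking performance, not output quality.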

wenbinc-Bin added a commit to wenbinc-Bin/vllm-fork that referenced this pull request Feb 12, 2026
vllm-project#34110
missing changes in
vllm/transformers_utils/model_arch_config_convertor.py
vllm/v1/spec_decode/eagle.py

Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026
Signed-off-by: JJJYmmm <1650675829@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: wulipc <wulipc@users.noreply.github.com>
Co-authored-by: ywang96 <ywang96@users.noreply.github.com>
Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026
Signed-off-by: JJJYmmm <1650675829@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: wulipc <wulipc@users.noreply.github.com>
Co-authored-by: ywang96 <ywang96@users.noreply.github.com>
Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
ChuanLi1101 pushed a commit to ChuanLi1101/vllm that referenced this pull request Mar 19, 2026
Signed-off-by: JJJYmmm <1650675829@qq.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: wulipc <wulipc@users.noreply.github.com>
Co-authored-by: ywang96 <ywang96@users.noreply.github.com>
Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Labels

documentation: Improvements or additions to documentation
new-model: Requests to new models
qwen: Related to Qwen models
ready: ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
v1

5 participants