[MODEL] Adding Support for Qwen3.5 Models #34110
Co-authored-by: wulipc <wulipc@users.noreply.github.com> Co-authored-by: ywang96 <ywang96@users.noreply.github.com> Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com>
Hi @JJJYmmm, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
Signed-off-by: JJJYmmm <1650675829@qq.com>
Code Review
This pull request adds support for the Qwen3.5 series of models, including dense, MoE, and MTP variants. The changes are comprehensive, touching model configuration, definitions, and registries. The implementation largely follows existing patterns for adding new models. However, I've identified a significant code duplication issue in the __init__ method for the multimodal MoE model (Qwen3_5MoeForConditionalGeneration), which could lead to maintenance problems and bugs. I've suggested a refactoring to address this.
vllm/model_executor/models/qwen3_5.py (966-1000)
The __init__ method of Qwen3_5MoeForConditionalGeneration is almost a complete copy of Qwen3_5ForConditionalGeneration.__init__. This code duplication is problematic for maintainability, as changes in the base class __init__ will not be reflected here, potentially leading to bugs.
Since Qwen3_5MoeForConditionalGeneration inherits from Qwen3_5ForConditionalGeneration, it should ideally call super().__init__(). However, this is not possible because the base class __init__ hardcodes the language model to the non-MoE version.
To resolve this, consider refactoring to reduce code duplication. One approach is to introduce a shared private _init method or a common base class that accepts the language model class as a parameter. This would make the relationship between the dense and MoE versions clearer and more robust.
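One way to read the second suggestion is sketched below. This is a minimal illustration, not the code in this PR: every class name here (DenseLanguageModel, MoeLanguageModel, VisionTower, ConditionalGenerationBase, MoeConditionalGeneration) and the language_model_cls attribute are hypothetical stand-ins, and the real vLLM classes take a VllmConfig and build torch modules rather than plain Python objects.

```python
# Hypothetical stand-ins for the dense and MoE language models.
class DenseLanguageModel:
    def __init__(self, config: dict) -> None:
        self.config = config


class MoeLanguageModel(DenseLanguageModel):
    pass


# Hypothetical stand-in for the shared vision encoder.
class VisionTower:
    def __init__(self, config: dict) -> None:
        self.config = config


class ConditionalGenerationBase:
    # Subclasses override this attribute instead of copying __init__.
    language_model_cls: type = DenseLanguageModel

    def __init__(self, config: dict) -> None:
        self.config = config
        # Shared setup lives in one place only.
        self.visual = VisionTower(config)
        # The only part that differs between dense and MoE variants.
        self.language_model = self.language_model_cls(config)


class MoeConditionalGeneration(ConditionalGenerationBase):
    # No __init__ override needed; only the language model class changes.
    language_model_cls = MoeLanguageModel


dense = ConditionalGenerationBase({"hidden_size": 8})
moe = MoeConditionalGeneration({"hidden_size": 8})
```

With this shape, a later change to the shared setup in the base __init__ reaches the MoE variant automatically, which is the maintenance concern raised above.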
Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Signed-off-by: JJJYmmm <1650675829@qq.com>
Documentation preview: https://vllm--34110.org.readthedocs.build/en/34110/
ywang96 left a comment
Thanks for the great contribution! Looking forward to the model release!
Need to update the test registry.
Signed-off-by: Roger Wang <hey@rogerw.io>
How can we run it before the official weights are released?
We have verified accuracy on preview checkpoints. Let's trigger the nightly wheel build pipeline so that users can run the model out of the box as soon as the official weights are released.
I meant how to run it with dummy weights, for example to check performance.
Since the model definition itself is also part of the release, perhaps you can create a dummy model config yourself and test it with
Yeah, what I was actually asking for is a config. I didn't know that the model config will also be released on the model launch date. Thanks for the clarification.
vllm-project#34110 missing changes in vllm/transformers_utils/model_arch_config_convertor.py vllm/v1/spec_decode/eagle.py Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: JJJYmmm <1650675829@qq.com> Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: wulipc <wulipc@users.noreply.github.com> Co-authored-by: ywang96 <ywang96@users.noreply.github.com> Co-authored-by: Isotr0py <Isotr0py@users.noreply.github.com> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Roger Wang <hey@rogerw.io>
Purpose
This PR adds model support for the upcoming Qwen3.5 models, including both dense and MoE variants.
🫡 Many thanks to @wulipc and @sighingnow for model verification and review, and to @ywang96 and @Isotr0py from the vLLM team for their collaboration!
Reference HF implementation - huggingface/transformers#43830