[Model] Add Qwen3.5 hybrid model support#34131
liuchenbing2026 wants to merge 1 commit into vllm-project:main
Conversation
Code Review
This pull request introduces support for the Qwen3.5 hybrid model architecture. The implementation is well structured: it reuses components from Qwen3-Next where appropriate and adds new modules such as Qwen3_5GatedDeltaNet for the model's specific characteristics. The changes cover the model definition, its configuration, and registration within the vLLM framework. The code largely follows existing patterns, but I have identified one incorrect type hint that should be fixed for correctness and consistency.
```python
def get_mamba_state_shape_from_config(
    cls, vllm_config: "VllmConfig"
) -> tuple[tuple[int, int], tuple[int, int]]:
```
The return type hint for get_mamba_state_shape_from_config is incorrect. It is specified as tuple[tuple[int, int], tuple[int, int]], but the MambaStateShapeCalculator.gated_delta_net_state_shape function it calls returns a tuple where the second element is a 3-tuple (num_heads, head_v_dim, head_k_dim). The correct return type should be tuple[tuple[int, int], tuple[int, int, int]] to match the actual returned value and the base class IsHybrid.
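As a rough illustration of why the second element of the return type must be a 3-tuple, here is a sketch of the two per-layer state shapes a gated-delta-net layer caches. The dimension values below are made up for illustration and are not the actual Qwen3.5 config values:

```python
# Hypothetical dimensions for illustration only; real values come from the
# model config, and the shape logic lives in
# MambaStateShapeCalculator.gated_delta_net_state_shape.
num_k_heads = 16       # linear-attention key heads
num_v_heads = 32       # linear-attention value heads
head_k_dim = 128
head_v_dim = 128
conv_kernel_size = 4

# The causal-conv1d state is 2-D: (conv_dim, kernel_size - 1).
# conv_dim covers the projected q/k (num_k_heads each) and v channels.
conv_dim = head_k_dim * num_k_heads * 2 + head_v_dim * num_v_heads
conv_state_shape = (conv_dim, conv_kernel_size - 1)

# The recurrent (delta-rule) state is 3-D: (num_heads, head_v_dim, head_k_dim),
# hence tuple[int, int, int] in the return annotation.
recurrent_state_shape = (num_v_heads, head_v_dim, head_k_dim)

print(conv_state_shape)       # (8192, 3)
print(recurrent_state_shape)  # (32, 128, 128)
```

Because the two shapes have different ranks, annotating both as `tuple[int, int]` under-specifies the second one.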
```diff
 def get_mamba_state_shape_from_config(
     cls, vllm_config: "VllmConfig"
-) -> tuple[tuple[int, int], tuple[int, int]]:
+) -> tuple[tuple[int, int], tuple[int, int, int]]:
```
We already have another open PR for this: #34110
Add inference support for the Qwen3.5 hybrid architecture model.

New files:
- vllm/transformers_utils/configs/qwen3_5.py
- vllm/model_executor/models/qwen3_5.py

Modified files:
- Register Qwen3_5TextConfig in the config registry
- Register Qwen3_5ForCausalLM in the model registry
Force-pushed from ccf450b to c04143b
This pull request has merge conflicts that must be resolved before it can be merged.
Closing as superseded by #34110
## 📌 Description

Add test cases for Qwen3N and Qwen3.5 according to vllm-project/vllm#34131.

## 🔍 Related Issues

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).

## Reviewer Notes

## Summary by CodeRabbit

* **Tests**
  * Expanded test coverage by adding additional head-configuration cases across multiple test scenarios to improve reliability and catch more edge cases.
  * No changes to test logic or public interfaces; only parameterized inputs were extended.
Add inference support for the Qwen3.5 hybrid architecture model, which interleaves full attention (transformer) and linear attention (Gated Delta Net) layers with dense MLP.
New files:
- vllm/transformers_utils/configs/qwen3_5.py
- vllm/model_executor/models/qwen3_5.py

Modified files:
- Register Qwen3_5TextConfig in the config registry
- Register Qwen3_5ForCausalLM in the model registry
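As a minimal sketch of what "interleaves full attention and linear attention layers" means for a hybrid stack, the snippet below generates a per-layer type list. The function name, parameters, and interval value are illustrative assumptions, not the actual Qwen3.5 config fields:

```python
# Hypothetical sketch of a hybrid layer layout; `full_attention_interval`
# is an assumed parameter name, not a real Qwen3.5 config attribute.
def layer_types(num_layers: int, full_attention_interval: int) -> list[str]:
    """Every `full_attention_interval`-th layer uses full (transformer)
    attention; the rest use linear attention (Gated Delta Net).
    Each layer is followed by a dense MLP either way."""
    return [
        "full_attention" if (i + 1) % full_attention_interval == 0
        else "linear_attention"
        for i in range(num_layers)
    ]

# 3 linear-attention layers, then 1 full-attention layer, repeated.
print(layer_types(8, 4))
```

At inference time, vLLM allocates different cache state per layer type (KV cache for full attention, conv plus recurrent state for the linear layers), which is why the model implements the hybrid state-shape hooks.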
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.