[Test] Add initial multi modal cases of Qwen2.5-VL-7B-Instruct for disaggregated encoder #5301

Merged
wangxiyuan merged 16 commits into vllm-project:main from yenuo26:open
Feb 6, 2026
Conversation


@yenuo26 yenuo26 commented Dec 23, 2025

What this PR does / why we need it?

This PR adds disaggregated encoder tests for Qwen2.5-VL-7B-Instruct

Does this PR introduce any user-facing change?

No

How was this patch tested?

By running the test locally and by running CI

Test Result

Test case results (local and CI) and example runs were attached as screenshots in the original PR.

  • vLLM version: release/v0.12.0

wangyu31577 added 5 commits December 23, 2025 20:09
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This PR adds a disaggregated encoder proxy and corresponding e2e tests for Qwen2.5-VL. The changes are well-structured, introducing a proxy server, test helpers in conftest.py, and a new test case. The implementation is mostly solid, but I've found a few issues related to correctness and best practices. Specifically, the health check endpoint in the proxy has a bug, a test helper function uses blocking I/O in an async context, another helper class is not robust to its declared input types, and the test case construction of JSON strings could be improved for safety and readability. My detailed comments and suggestions are below.
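On the blocking-I/O point: one common pattern (sketched here with stdlib `asyncio` and a hypothetical stand-in for the blocking call, not the PR's actual helper) is to offload the blocking call to a worker thread so the event loop stays responsive:

```python
import asyncio
import time

def blocking_health_check() -> bool:
    # Hypothetical stand-in for a blocking call such as a synchronous
    # HTTP request to a proxy's health endpoint.
    time.sleep(0.1)
    return True

async def wait_until_healthy(retries: int = 3) -> bool:
    for _ in range(retries):
        # asyncio.to_thread (Python 3.9+) runs the blocking call in a
        # worker thread, keeping the event loop free for other tasks.
        if await asyncio.to_thread(blocking_health_check):
            return True
        await asyncio.sleep(0.5)
    return False

print(asyncio.run(wait_until_healthy()))
```

Calling the blocking function directly inside an `async def` would stall every coroutine on the loop for the duration of the call; wrapping it keeps concurrent requests flowing.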

Comment on lines +295 to +300
    def __init__(self,
                 proxy_args: Union[list[str], str] = None,
                 env_dict: Optional[dict[str, str]] = None) -> None:
        self.proxy_args = proxy_args
        self.env_dict = env_dict
        self._proc_list = list()

high

The __init__ method of DisaggEpdProxy accepts proxy_args as Union[list[str], str], but it doesn't handle the str case. If a string is passed, _start_disagg_proxy will incorrectly unpack it character by character, and __aenter__ will fail on self.proxy_args.index("--port"). For robustness and consistency with RemoteOpenAIServer, you should handle the string case by splitting it into a list of arguments using shlex.split.

Suggested change

    def __init__(self,
                 proxy_args: Union[list[str], str] = None,
                 env_dict: Optional[dict[str, str]] = None) -> None:
        if isinstance(proxy_args, str):
            self.proxy_args = shlex.split(proxy_args)
        else:
            self.proxy_args = proxy_args
        self.env_dict = env_dict
        self._proc_list = list()
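For reference, `shlex.split` tokenizes a shell-style command string into an argument list while honoring quoting, which is what makes it safer than naive splitting here (the argument string below is purely illustrative):

```python
import shlex

# shlex.split tokenizes shell-style strings, honoring quotes, so a
# quoted argument containing spaces stays a single list element.
args = shlex.split('--port 8000 --name "my proxy"')
print(args)  # → ['--port', '8000', '--name', 'my proxy']
```

A plain `str.split()` would break `"my proxy"` into two elements and keep the quote characters, which is exactly the kind of bug `shlex` avoids.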

Comment on lines +78 to +80
'{"ec_connector_extra_config":{"shared_storage_path":"' +
SHARED_STORAGE_PATH +
'"},"ec_connector":"ECSharedStorageConnector","ec_role": "ec_producer"}'

high

Constructing JSON strings via concatenation is error-prone and hard to read. It's better to define the structure as a Python dictionary and use json.dumps() to create a valid JSON string. This improves readability and correctness. Please also remember to import json at the top of the file.

        json.dumps({
            "ec_connector_extra_config": {
                "shared_storage_path": SHARED_STORAGE_PATH
            },
            "ec_connector": "ECSharedStorageConnector",
            "ec_role": "ec_producer"
        })
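To illustrate the point (the path value below is a placeholder, not the test's real configuration), `json.dumps` always emits valid JSON that round-trips cleanly through `json.loads`:

```python
import json

# Placeholder value; the real path comes from the test's configuration.
SHARED_STORAGE_PATH = "/tmp/shared_storage"

config = json.dumps({
    "ec_connector_extra_config": {"shared_storage_path": SHARED_STORAGE_PATH},
    "ec_connector": "ECSharedStorageConnector",
    "ec_role": "ec_producer",
})

# Unlike hand-concatenated strings, the result is guaranteed to parse,
# and any quotes or backslashes in the path are escaped automatically.
parsed = json.loads(config)
print(parsed["ec_role"])  # → ec_producer
```

Hand-built concatenation silently produces invalid JSON if the interpolated value ever contains a quote or backslash; `json.dumps` handles the escaping for free.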

Comment on lines +89 to +91
'{"ec_connector_extra_config":{"shared_storage_path":"' +
SHARED_STORAGE_PATH +
'"},"ec_connector":"ECSharedStorageConnector","ec_role": "ec_consumer"}'

high

Constructing JSON strings via concatenation is error-prone and hard to read. It's better to define the structure as a Python dictionary and use json.dumps() to create a valid JSON string. This improves readability and correctness. Please also remember to import json at the top of the file.

        json.dumps({
            "ec_connector_extra_config": {
                "shared_storage_path": SHARED_STORAGE_PATH
            },
            "ec_connector": "ECSharedStorageConnector",
            "ec_role": "ec_consumer"
        })

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@@ -0,0 +1,629 @@
#!/usr/bin/env python3


    def _start_vllm_serve(self):
        self.env_dict['VLLM_ALLOW_LONG_MAX_MODEL_LEN'] = "1"
        self.env_dict['VLLM_USE_V1'] = "1"

Suggested change (remove this line):

        self.env_dict['VLLM_USE_V1'] = "1"

@wangxiyuan

Any progress? If this PR is still alive, please rebase onto main and make CI happy; otherwise you can close it. Thanks.

@yenuo26 yenuo26 requested a review from wangxiyuan as a code owner January 20, 2026 12:01
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
@yenuo26 yenuo26 requested a review from Yikun as a code owner January 20, 2026 13:17
Signed-off-by: wangyu <53896905+yenuo26@users.noreply.github.com>
wangyu31577 added 2 commits January 27, 2026 09:27
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
--max-num-batched-tokens 114688 \
--max-num-seqs 128 \
--ec-transfer-config '{
"ec_connector": "ECSharedStorageConnector",
Contributor Author

Adapt to the new vLLM version.

yenuo26 and others added 2 commits February 2, 2026 23:18
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
@weijinqian0 added the `ready` (read for review) and `ready-for-test` (start test by label for PR) labels on Feb 5, 2026
@wangxiyuan wangxiyuan merged commit c63b7a1 into vllm-project:main Feb 6, 2026
17 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Feb 9, 2026
…to qwen3next_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [Patch] Remove the patch of MiniCPM (vllm-project#5975)
  [P/D] layerwise connector support recompute scheduler (vllm-project#5900)
  [CI] Add workflow support for lint image build (vllm-project#6489)
  [Bugfix] Fix problematic dummy_run & improper input_batch_size in eagle (vllm-project#6517)
  [Refactor]310p_e2e test case update (vllm-project#6539)
  [Refactor]refactor p2p connector (vllm-project#6551)
  [Refactor]refactor 310p attention impl and add ut (vllm-project#6579)
  [Refactor]refactor 310p ops and add ut (vllm-project#6591)
  [Ops][Refactor] Remove custom rotary_embedding operator (vllm-project#6523)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(new Batch vllm-project#8) (vllm-project#6604)
  [Test] Add initial multi modal cases of Qwen2.5-VL-7B-Instruct for disaggregated encoder  (vllm-project#5301)
  [CI] Fix broken CI (vllm-project#6599)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#10) (vllm-project#6173)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#11) (vllm-project#6176)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#8) (vllm-project#6129)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#7) (vllm-project#6023)
  [CI][Misc] Some improvement for github action (vllm-project#6587)
  [Image] Bump mooncake version to v0.3.8.post1 (vllm-project#6428)
@MrZ20

MrZ20 commented Feb 9, 2026

Hi! The nightly test failed; there seem to be some errors that you need to resolve.
https://github.com/vllm-project/vllm-ascend/actions/runs/21801287662/job/62907968683

@yenuo26

yenuo26 commented Feb 9, 2026 via email

chenchuw886 pushed a commit to chenchuw886/vllm-ascend that referenced this pull request Feb 12, 2026
…saggregated encoder (vllm-project#5301)

### What this PR does / why we need it?
This PR adds disaggregated encoder tests for Qwen2.5-VL-7B-Instruct
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
by running the test
by running ci

- vLLM version: release/v0.12.0

---------

Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: wangyu <53896905+yenuo26@users.noreply.github.com>
Co-authored-by: wangyu31577 <wangyu31577@hundsun.com>
Signed-off-by: momochenchuw <chenchuw@huawei.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…saggregated encoder (vllm-project#5301)

(commit message identical to the one quoted above; signed off by zrj026 <zhangrunjiang026@gmail.com>)
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
…saggregated encoder (vllm-project#5301)

(commit message identical to the one quoted above)
wangxiyuan pushed a commit that referenced this pull request Mar 3, 2026
… to `.yaml` (#6503)

### What this PR does / why we need it?
This PR refactors the nightly single-node model test by migrating test
configurations from Python scripts to a more maintainable `YAML-based`
format.

| Original PR | Python (`.py`) | YAML (`.yaml`) |
| :--- | :--- | :--- |
| [#3568](#3568) | `test_deepseek_r1_0528_w8a8_eplb.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| [#3631](#3631) | `test_deepseek_r1_0528_w8a8.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| [#5874](#5874) | `test_deepseek_r1_w8a8_hbm.py` | `DeepSeek-R1-W8A8-HBM.yaml` |
| [#3908](#3908) | `test_deepseek_v3_2_w8a8.py` | `DeepSeek-V3.2-W8A8.yaml` |
| [#5682](#5682) | `test_kimi_k2_thinking.py` | `Kimi-K2-Thinking.yaml` |
| [#4111](#4111) | `test_mtpx_deepseek_r1_0528_w8a8.py` | `MTPX-DeepSeek-R1-0528-W8A8.yaml` |
| [#3733](#3733) | `test_prefix_cache_deepseek_r1_0528_w8a8.py` | `Prefix-Cache-DeepSeek-R1-0528-W8A8.yaml` |
| [#6543](#6543) | `test_qwen3_235b_w8a8.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| [#6543](#6543) | `test_qwen3_235b_a22b_w8a8_eplb.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| [#3973](#3973) | `test_qwen3_30b_w8a8.py` | `Qwen3-30B-A3B-W8A8.yaml` |
| [#3541](#3541) | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8.yaml` |
| [#3757](#3757) | `test_qwq_32b.py` | `QwQ-32B.yaml` |
| [#5616](#5616) | `test_qwen3_next_w8a8.py` | `Qwen3-Next-80B-A3B-Instruct-W8A8.yaml` |
| [#3541](#3541) | `test_qwen2_5_vl_7b.py` | `Qwen2.5-VL-7B-Instruct.yaml` |
| [#5301](#5301) | `test_qwen2_5_vl_7b_epd.py` | `Qwen2.5-VL-7B-Instruct-EPD.yaml` |
| [#3707](#3707) | `test_qwen2_5_vl_32b.py` | `Qwen2.5-VL-32B-Instruct.yaml` |
| [#3676](#3676) | `test_qwen3_32b_int8_a3_feature_stack3.py` | `Qwen3-32B-Int8-A3-Feature-Stack3.yaml` |
| [#3709](#3709) | `test_prefix_cache_qwen3_32b_int8.py` | `Prefix-Cache-Qwen3-32B-Int8.yaml` |
| [#5395](#5395) | `test_qwen3_next.py` | `Qwen3-Next-80B-A3B-Instruct-A2.yaml` |
| [#3474](#3474) | `test_qwen3_32b.py` | `Qwen3-32B.yaml` |
| [#3541](#3541) | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8-A2.yaml` |
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

---------

Signed-off-by: MrZ20 <2609716663@qq.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…saggregated encoder (vllm-project#5301)

(commit message identical to the one quoted above)
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
…saggregated encoder (vllm-project#5301)

(commit message identical to the one quoted above)
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
… to `.yaml` (vllm-project#6503)

(commit message identical to the #6503 commit quoted above)

Labels

module:tests, ready (read for review), ready-for-test (start test by label for PR)

Projects

None yet

Development

5 participants