[Doc] Add single NPU tutorial for Qwen2.5-Omni-7B #4446

MengqingCao merged 1 commit into vllm-project:main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request adds a new tutorial for running Qwen2.5-Omni-7B on a single NPU. The documentation is well-structured, covering both offline inference and online serving. I've identified a missing dependency installation step that would prevent the offline inference example from running and have provided a suggestion to fix it.
> Run the following script to execute offline inference on a single NPU:
The Python script for offline inference uses qwen_vl_utils.process_vision_info, but the qwen_vl_utils package is not installed in the Docker container by default. This will cause an ImportError when running the script. Please add a step to install this package.
`pip install qwen_vl_utils --extra-index-url https://download.pytorch.org/whl/cpu/`
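For context on why the package matters: `process_vision_info` consumes an OpenAI-style multimodal message list. The sketch below is illustrative, not the PR's actual script — the video URL and the `extract_media` helper are stand-ins so the snippet runs without `qwen_vl_utils` installed.

```python
# Sketch of the multimodal message payload that qwen_vl_utils's
# process_vision_info consumes. With the package installed, the
# tutorial's script would call roughly:
#   from qwen_vl_utils import process_vision_info
#   image_inputs, video_inputs = process_vision_info(messages)
messages = [
    {
        "role": "user",
        "content": [
            # placeholder media URL, not taken from the tutorial
            {"type": "video", "video": "https://example.com/demo.mp4"},
            {"type": "text", "text": "Describe this video."},
        ],
    }
]

def extract_media(msgs):
    """Illustrative stand-in for process_vision_info: group media by type."""
    media = {"image": [], "video": []}
    for msg in msgs:
        for item in msg["content"]:
            if item["type"] in media:
                media[item["type"]].append(item[item["type"]])
    return media

print(extract_media(messages))  # {'image': [], 'video': ['https://example.com/demo.mp4']}
```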
> Qwen2.5-Omni is an end-to-end multimodal model designed to perceive diverse modalities, including text, images, audio, and video, while simultaneously generating text and natural speech responses in a streaming manner.
> This document shows the main verification steps for the model, including supported features, feature configuration, environment preparation, single-node and multi-node deployment, and accuracy and performance evaluation.
It would be better to state the first version that supports this model, e.g. "The DeepSeek-V3.1 model is first supported in vllm-ascend:v0.9.1rc3".
> You can use our official Docker image; v0.11.0 and later versions of vllm-ascend support Qwen2.5-Omni.
> :::{note}
Please check this note: is only aarch64 supported?
Checked, sorry for the wrong info.
> In addition, if you don't want to use the Docker image above, you can also build everything from source:
> - Install `vllm-ascend` from source; refer to [installation](../installation.md).
I think you can just delete the build-from-source section: anyone who wants to build from source already has some experience and should check the installation page, while the Docker image is the simple path for inexperienced users. The tab code can be deleted as well.
> ::::{tab-item} A3&A2 series
> :sync: A3&A2
> Start the docker image on your node; refer to [using docker](../installation.md#set-up-using-docker).
Please provide the docker run command directly, as in #4399.
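For illustration, a command of the kind the reviewer is asking for might look like the following. This is a hedged sketch only: the image tag, device IDs, and driver mount paths are assumptions based on typical Ascend setups, not the PR's final text, and must be adapted to the target machine.

```shell
# Illustrative only: adjust the image tag, device IDs, and mount paths
# to your environment before running.
docker run --rm -it \
  --name vllm-ascend \
  --device /dev/davinci0 \
  --device /dev/davinci_manager \
  --device /dev/devmm_svm \
  --device /dev/hisi_hdc \
  -v /usr/local/dcmi:/usr/local/dcmi \
  -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
  -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
  -p 8000:8000 \
  quay.io/ascend/vllm-ascend:v0.11.0 bash
```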
/lgtm
### What this PR does / why we need it?

Add single NPU tutorial for Qwen2.5-Omni-7B

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: Ting FU <futing10@huawei.com>
### Does this PR introduce any user-facing change?

No

### How was this patch tested?