Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```diff
@@ -0,0 +1,32 @@
+from __future__ import annotations
```
We will optimize the platform plugin system next month.
@wangxiyuan Please take a final look.
@ywang96 Please take a look as well.

Online:
```diff
 from vllm.v1.core.sched.scheduler import Request, RequestStatus, SchedulerOutput, SpecDecodingStats
 from vllm.v1.core.sched.utils import remove_all
-from vllm.v1.engine import EngineCoreEventType, EngineCoreOutput
+from vllm.v1.engine import EngineCoreEventType, EngineCoreOutput, EngineCoreOutputs
```
This commit also fixes a bug common to both GPU and NPU. We should import EngineCoreOutputs from vllm.v1.engine to make sure the patch works.
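As an aside, the reason the import path matters for a patch can be sketched in plain Python. This is an illustrative toy, not vllm code: a consumer that binds a name at import time keeps its own reference to the original object, so a patch must target the module where consumers actually look the name up.

```python
# Illustrative sketch (not vllm code): why monkey-patching interacts
# badly with names bound early via `from module import Name`.
import types

# Stand-in for a module like vllm.v1.engine.
engine = types.ModuleType("engine")

class EngineCoreOutputs:  # the original class
    pass

engine.EngineCoreOutputs = EngineCoreOutputs

# A consumer that did `from engine import EngineCoreOutputs` at import
# time now holds its own reference to the original object.
consumer_ref = engine.EngineCoreOutputs

class PatchedOutputs(EngineCoreOutputs):  # the replacement
    patched = True

engine.EngineCoreOutputs = PatchedOutputs

# Lookups through the module attribute see the patch...
assert engine.EngineCoreOutputs is PatchedOutputs
# ...but the stale early binding still points at the original class.
assert consumer_ref is EngineCoreOutputs
```

This is why importing the class from its defining module (rather than relying on a re-exported copy) keeps the patch and its consumers in agreement.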
I think this PR is ready now. There is a legacy issue where the batch size can't be set higher than 1; we will fix it in a follow-up PR.
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
* fix EngineCoreOutput import path
* fix the PoolerOutput in NPUModelRunner
* limit the batch size to 1 in qwen2.5-omni.yaml of npu

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Gaohan123
left a comment
LGTM. Thanks for the great work!


Co-authored-by: AndyZhou952 <jzhoubc@connect.ust.hk>
Co-authored-by: MengqingCao <cmq0113@163.com>
Purpose
Because plugin support is not yet available in vllm-omni, we are temporarily merging the NPU ModelRunner into the codebase. This implementation will be removed later and replaced with a plugin-based integration once the plugin system is supported.
Test Plan
Install vllm-omni on the vllm-ascend v0.11.0rc2 image:

Then use the example to test Qwen/Qwen2.5-Omni-7B on NPU:

And test Qwen/Qwen-Image:

Test Result
Qwen2.5-Omni-7B outputs the audio successfully (to display it on GitHub, I had to convert the wav to mp4 using ffmpeg):
output.mp4
Qwen-Image outputs the coffee picture:

Performance on the first run (may be stale):
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.