[Refactor] Separate `_prepare_inputs` into `_prepare_inputs` and `_preprocess` #5973
gcanlin wants to merge 3 commits into vllm-project:main
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request refactors the _prepare_inputs method in NPUModelRunner to better align with the upstream vLLM implementation by creating a new _preprocess method. Logic for multimodal inputs, prompt embeddings, positions, and pipeline parallelism has been moved from _prepare_inputs to _preprocess. Consequently, execute_model is updated to call these two methods sequentially, which improves separation of concerns while preserving the original execution order. The refactoring is clean, simplifies data flow by removing maybe_padded_num_tokens, and updates method signatures consistently. This is a solid improvement for maintainability and alignment with upstream. I have not found any issues with the changes.
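The split described in the review can be sketched roughly as follows. This is a hypothetical simplification: the method names `_prepare_inputs`, `_preprocess`, and `execute_model` come from the PR, but the signatures, return shapes, and helper logic here are illustrative assumptions, not the actual `NPUModelRunner` API.

```python
# Hypothetical sketch of the refactor: _prepare_inputs keeps only metadata,
# logits, and spec-decode concerns, while _preprocess takes over multimodal
# inputs, prompt embeddings, positions, and pipeline-parallel bookkeeping.
class ModelRunnerSketch:
    def _prepare_inputs(self, scheduler_output):
        # Builds attention metadata and logits indices only.
        attn_metadata = {"num_tokens": scheduler_output["num_tokens"]}
        logits_indices = list(range(scheduler_output["num_tokens"]))
        return attn_metadata, logits_indices

    def _preprocess(self, scheduler_output, attn_metadata):
        # Handles the logic moved out of _prepare_inputs; here just positions.
        positions = list(range(attn_metadata["num_tokens"]))
        return {"positions": positions}

    def execute_model(self, scheduler_output):
        # Same execution order as before, now split across two methods.
        attn_metadata, logits_indices = self._prepare_inputs(scheduler_output)
        model_kwargs = self._preprocess(scheduler_output, attn_metadata)
        return {"logits_indices": logits_indices, **model_kwargs}
```

Calling the two methods sequentially from `execute_model` is what preserves the original ordering while keeping each concern in its own method.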
@wangxiyuan Could you please help add the ready tag for the e2e-full test? I want to run it over the free weekend to avoid resource queuing. Thanks!
@zhenwenqi2024 Hi! Could you please help review this PR? Thanks!
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
```python
if vllm_version_is('0.13.0'):
    model_kwargs = {
        **self._init_model_kwargs(num_input_tokens),
        **self._extract_mm_kwargs(scheduler_output),
    }
```
Qwen-Omni needs mm_kwargs
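A minimal sketch of the version-gated `model_kwargs` assembly from the snippet above, showing why dropping `_extract_mm_kwargs` would break multimodal models such as Qwen-Omni. All helpers here (`vllm_version_is`, `_init_model_kwargs`, `_extract_mm_kwargs`) are stand-in assumptions mirroring the diff, not the real implementations.

```python
# Sketch: merge base model kwargs with multimodal kwargs when the
# vLLM version matches; mm kwargs must survive the merge for Qwen-Omni.
def build_model_kwargs(version, num_input_tokens, scheduler_output):
    def vllm_version_is(v):
        # Stand-in for the real version check.
        return version == v

    def _init_model_kwargs(n):
        # Stand-in: base kwargs derived from the padded token count.
        return {"num_input_tokens": n}

    def _extract_mm_kwargs(out):
        # Stand-in: multimodal kwargs (e.g. pixel values for Qwen-Omni).
        return out.get("mm_kwargs", {})

    if vllm_version_is('0.13.0'):
        # Dict unpacking merges both sources; mm kwargs win on key clashes.
        return {
            **_init_model_kwargs(num_input_tokens),
            **_extract_mm_kwargs(scheduler_output),
        }
    return _init_model_kwargs(num_input_tokens)
```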
This pull request has conflicts, please resolve those before we can evaluate the pull request.
What this PR does / why we need it?
Part of RFC #5449.
Align with upstream vLLM. This PR will help downstream vLLM-Omni reduce the cost of maintaining `_prepare_inputs`. Besides, it makes the vLLM-Ascend code more readable. In the future, we can follow vLLM more closely.

- Moved `update_cos_sin` into `_preprocess`, and trimmed `_prepare_inputs` to return only metadata plus logits and spec-decode inputs.
- Updated `execute_model` to call `_prepare_inputs` then `_preprocess`, preserving the original ordering while separating concerns.
- Aligned with `_prepare_mm_inputs` in vLLM and added `model_kwargs`.

NOTE: This PR includes the #5971 changes. We need to wait for it to be merged (if it is approved), then rebase this PR.
Does this PR introduce any user-facing change?
How was this patch tested?