
[Refactor] Separate _prepare_inputs to _prepare_inputs and _preprocess#5973

Closed
gcanlin wants to merge 3 commits into vllm-project:main from gcanlin:prepare

Conversation

@gcanlin
Collaborator

@gcanlin gcanlin commented Jan 17, 2026

What this PR does / why we need it?

Part of RFC #5449.

Align with upstream vLLM. This PR helps downstream vLLM-Omni reduce the cost of maintaining _prepare_inputs. It also makes the vLLM-Ascend code more readable, so we can follow vLLM more closely in the future.

  • Moved the multimodal, prompt-embed, positions, PP handling, and update_cos_sin logic into _preprocess, and trimmed _prepare_inputs to return only metadata plus logits and spec-decode inputs.
  • Updated execute_model to call _prepare_inputs and then _preprocess, preserving the original ordering while separating concerns.
  • Reused _prepare_mm_inputs from vLLM and added model_kwargs.
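The split described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the real NPUModelRunner code; the class name, signatures, and dict-based inputs are simplified assumptions:

```python
# Hypothetical sketch of the _prepare_inputs / _preprocess split.
# Real signatures live in vllm_ascend/worker/model_runner_v1.py and differ.
class ModelRunnerSketch:
    def _prepare_inputs(self, scheduler_output):
        # After the refactor: returns only metadata plus logits /
        # spec-decode inputs, nothing model-input-specific.
        attn_metadata = {"num_tokens": scheduler_output["num_tokens"]}
        logits_indices = list(range(scheduler_output["num_tokens"]))
        return attn_metadata, logits_indices

    def _preprocess(self, scheduler_output, attn_metadata):
        # Multimodal, prompt-embed, positions, and PP handling move here.
        model_kwargs = {
            "positions": list(range(attn_metadata["num_tokens"])),
        }
        if scheduler_output.get("mm_kwargs"):
            model_kwargs.update(scheduler_output["mm_kwargs"])
        return model_kwargs

    def execute_model(self, scheduler_output):
        # Same ordering as before the refactor: prepare, then preprocess.
        attn_metadata, logits_indices = self._prepare_inputs(scheduler_output)
        model_kwargs = self._preprocess(scheduler_output, attn_metadata)
        return {
            "attn_metadata": attn_metadata,
            "logits_indices": logits_indices,
            "model_kwargs": model_kwargs,
        }
```

The point of the split is that execute_model keeps a single call sequence while each method owns one concern, which is what lets downstream projects override _preprocess without touching metadata preparation.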

NOTE: This PR includes the #5971 changes. We need to wait until that PR is merged (if it is approved), then rebase this one.

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@gcanlin gcanlin requested a review from MengqingCao as a code owner January 17, 2026 12:58
@github-actions
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill in the PR description to help reviewers and future developers understand.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@gcanlin gcanlin changed the title Prepare [Refactor] Separate _prepare_inputs to _prepare_inputs and _preprocess Jan 17, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors the _prepare_inputs method in NPUModelRunner to better align with the upstream vLLM implementation by creating a new _preprocess method. Logic for multimodal inputs, prompt embeddings, positions, and pipeline parallelism has been moved from _prepare_inputs to _preprocess. Consequently, execute_model is updated to call these two methods sequentially, which improves separation of concerns while preserving the original execution order. The refactoring is clean, simplifies data flow by removing maybe_padded_num_tokens, and updates method signatures consistently. This is a solid improvement for maintainability and alignment with upstream. I have not found any issues with the changes.

@gcanlin
Collaborator Author

gcanlin commented Jan 17, 2026

@wangxiyuan Could you please help add the ready tag for e2e-full test? I want to run it in the free weekend to avoid resource queuing. Thanks!

@Yikun Yikun added ready read for review ready-for-test start test by label for PR labels Jan 17, 2026
@gcanlin
Collaborator Author

gcanlin commented Jan 19, 2026

@zhenwenqi2024 Hi! Could you please help review this PR? Thx!

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Comment thread vllm_ascend/worker/model_runner_v1.py
if vllm_version_is('0.13.0'):
    model_kwargs = {
        **self._init_model_kwargs(num_input_tokens),
        **self._extract_mm_kwargs(scheduler_output),
    }
Collaborator Author


Qwen-Omni needs mm_kwargs
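To make the version-gated merge quoted above concrete, here is a standalone sketch of the dict-unpacking pattern it uses. The helper functions below are hypothetical stand-ins for _init_model_kwargs and _extract_mm_kwargs, whose real implementations live in vLLM:

```python
# Hypothetical stand-ins for the real vLLM helpers, to show the merge shape.
def init_model_kwargs(num_input_tokens):
    # Base kwargs every model forward pass needs.
    return {"num_input_tokens": num_input_tokens}

def extract_mm_kwargs(scheduler_output):
    # Multimodal models such as Qwen-Omni need these tensors at forward time.
    return scheduler_output.get("mm_kwargs", {})

scheduler_output = {"mm_kwargs": {"pixel_values": [[0.1, 0.2]]}}

# PEP 448 double-star unpacking merges both dicts; later keys win on conflict.
model_kwargs = {
    **init_model_kwargs(4),
    **extract_mm_kwargs(scheduler_output),
}
```

For a text-only model extract_mm_kwargs contributes nothing and model_kwargs degrades to the base kwargs, which is why the merge is safe to apply unconditionally under the version gate.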

@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@gcanlin gcanlin closed this Jan 24, 2026

Labels

merge-conflicts ready read for review ready-for-test start test by label for PR



2 participants