[bugfix](pcp) expand max_num_tokens for pcp pad #5478
zzzzwwjj merged 1 commit into vllm-project:main from
Conversation
Code Review
This pull request introduces a temporary workaround to adjust buffer sizes for Prefill Context Parallelism (PCP) by modifying max_num_batched_tokens before calling the parent constructor. While this approach is functional, it is not robust against exceptions that may occur during initialization. My review includes a suggestion to use a try...finally block to ensure the configuration is always restored to its original state, thereby improving the code's resilience and maintainability.
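A minimal sketch of the reviewer's suggestion. The class and attribute names below (`SchedulerConfig`, `ModelRunner`, `PCPModelRunner`, `buffer_size`) are illustrative stand-ins, not vllm-ascend's actual code; the point is the `try...finally` pattern that guarantees the config is restored even if the parent constructor raises.

```python
# Hypothetical sketch: temporarily enlarge a config value so the parent
# constructor sizes its buffers for PCP padding, then restore the original
# value in a finally block so it survives any initialization failure.

class SchedulerConfig:
    def __init__(self, max_num_batched_tokens: int):
        self.max_num_batched_tokens = max_num_batched_tokens


class ModelRunner:
    def __init__(self, config: SchedulerConfig):
        # The parent allocates buffers sized from the (possibly padded) value.
        self.buffer_size = config.max_num_batched_tokens


class PCPModelRunner(ModelRunner):
    def __init__(self, config: SchedulerConfig, pcp_size: int):
        original = config.max_num_batched_tokens
        # Pad so buffers can hold the extra tokens PCP padding introduces.
        config.max_num_batched_tokens = original * pcp_size
        try:
            super().__init__(config)
        finally:
            # Always restore, even if super().__init__ raises.
            config.max_num_batched_tokens = original


cfg = SchedulerConfig(max_num_batched_tokens=1024)
runner = PCPModelRunner(cfg, pcp_size=2)
print(runner.buffer_size)            # 2048: buffers sized for the padded value
print(cfg.max_num_batched_tokens)    # 1024: config restored after init
```

Without the `finally`, an exception inside the parent constructor would leave the shared config permanently inflated, which later code reading `max_num_batched_tokens` would silently inherit.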
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
### What this PR does / why we need it?

Since the [PR](vllm-project/vllm#28988) for PCP modifications to `GPUModelRunner` has not yet been merged into vLLM, this PR temporarily adjusts certain buffer sizes. These changes can be reverted once the original [PR](vllm-project/vllm#28988) is merged.

### Does this PR introduce _any_ user-facing change?

No

- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@5326c89

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
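To make the buffer-size adjustment concrete, here is a hedged illustration of how PCP padding typically inflates a token budget: the count is rounded up to a multiple of the context-parallel size so every rank gets an equal shard. The function name and the round-up rule are assumptions for illustration, not vllm-ascend's actual implementation.

```python
# Hypothetical helper: round max_num_tokens up so it divides evenly
# across PCP ranks, since buffers must hold the padded total.

def pad_max_num_tokens(max_num_tokens: int, pcp_size: int) -> int:
    """Round up to the next multiple of pcp_size (ceiling division)."""
    return (max_num_tokens + pcp_size - 1) // pcp_size * pcp_size


print(pad_max_num_tokens(1000, 4))  # 1000: already a multiple of 4
print(pad_max_num_tokens(1001, 4))  # 1004: rounded up to the next multiple
```

Buffers sized from the unpadded value would be one padding's worth too small, which is why the PR expands `max_num_tokens` until the upstream `GPUModelRunner` changes land.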