
[bugfix](pcp) expand max_num_tokens for pcp pad#5478

Merged
zzzzwwjj merged 1 commit into vllm-project:main from pisceskkk:bugfix_maxtokens
Jan 4, 2026
Conversation

@pisceskkk
Contributor

@pisceskkk pisceskkk commented Dec 29, 2025

What this PR does / why we need it?

Since the upstream PR for PCP modifications to GPUModelRunner has not yet been merged into vLLM, this PR temporarily adjusts certain buffer sizes. These changes can be reverted once that PR is merged.

Does this PR introduce any user-facing change?

No
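To illustrate why the buffer sizes need expanding: under prefill context parallelism (PCP), the sequence is sharded across ranks and each shard is padded, so the padded total can exceed the original `max_num_batched_tokens`. The sketch below is illustrative only; `padded_max_num_tokens` and `pcp_size` are hypothetical names, not the actual vllm-ascend API.

```python
# Hypothetical sketch: round the token budget up to a multiple of the
# PCP world size so every rank receives an equal, aligned shard. The
# expanded value is what would size the runner's token buffers.

def padded_max_num_tokens(max_num_batched_tokens: int, pcp_size: int) -> int:
    """Round max_num_batched_tokens up to the next multiple of pcp_size."""
    per_rank = -(-max_num_batched_tokens // pcp_size)  # ceiling division
    return per_rank * pcp_size

print(padded_max_num_tokens(1000, 3))  # 1002: padded up to a multiple of 3
print(padded_max_num_tokens(8, 4))     # 8: already aligned, unchanged
```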

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a temporary workaround to adjust buffer sizes for Prefill Context Parallelism (PCP) by modifying max_num_batched_tokens before calling the parent constructor. While this approach is functional, it is not robust against exceptions that may occur during initialization. My review includes a suggestion to use a try...finally block to ensure the configuration is always restored to its original state, thereby improving the code's resilience and maintainability.
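The reviewer's suggested pattern can be sketched as follows. The classes here are minimal stand-ins rather than the real vLLM types: the config is enlarged before the parent constructor runs (so parent-allocated buffers are big enough for PCP padding), and a `finally` block restores it even if initialization raises.

```python
# Minimal stand-ins for the real vLLM config and runner types.
class SchedulerConfig:
    def __init__(self, max_num_batched_tokens: int):
        self.max_num_batched_tokens = max_num_batched_tokens

class BaseModelRunner:  # stand-in for GPUModelRunner
    def __init__(self, scheduler_config: SchedulerConfig):
        # Buffers are sized from the (temporarily expanded) token budget.
        self.buffer_tokens = scheduler_config.max_num_batched_tokens

class PCPModelRunner(BaseModelRunner):
    def __init__(self, scheduler_config: SchedulerConfig, pcp_size: int):
        original = scheduler_config.max_num_batched_tokens
        # Expand the budget so the parent allocates PCP-sized buffers.
        scheduler_config.max_num_batched_tokens = original * pcp_size
        try:
            super().__init__(scheduler_config)
        finally:
            # Restored even if the parent constructor raises.
            scheduler_config.max_num_batched_tokens = original

cfg = SchedulerConfig(1024)
runner = PCPModelRunner(cfg, pcp_size=2)
print(runner.buffer_tokens)           # 2048: buffers use the expanded budget
print(cfg.max_num_batched_tokens)     # 1024: config restored afterwards
```

The `try...finally` guarantees the shared config object is never left in its mutated state, which matters because other components read the same `SchedulerConfig` after the runner is constructed.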

Comment thread vllm_ascend/worker/model_runner_v1.py Outdated
@github-actions
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@pisceskkk pisceskkk marked this pull request as ready for review December 29, 2025 09:29
@pisceskkk pisceskkk force-pushed the bugfix_maxtokens branch 3 times, most recently from c5a4d86 to 164cdb9 Compare December 30, 2025 02:31
Comment thread vllm_ascend/worker/model_runner_v1.py
@pisceskkk pisceskkk force-pushed the bugfix_maxtokens branch 2 times, most recently from db45fc8 to bbae1f8 Compare December 30, 2025 08:33
@pisceskkk pisceskkk requested a review from zzzzwwjj December 30, 2025 09:44
@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
@weiguihua2 weiguihua2 added the ready (ready for review) and ready-for-test (start test by label for PR) labels Dec 31, 2025
@zzzzwwjj zzzzwwjj merged commit f15dc3f into vllm-project:main Jan 4, 2026
54 of 66 checks passed
@pisceskkk pisceskkk deleted the bugfix_maxtokens branch January 4, 2026 09:26
Rozwel-dx pushed a commit to Rozwel-dx/vllm-ascend that referenced this pull request Jan 8, 2026
### What this PR does / why we need it?
Since the [PR](vllm-project/vllm#28988) for PCP
modifications to `GPUModelRunner` has not yet been merged into vLLM,
this PR temporarily requires adjustments to certain buffer sizes. These
changes can be reverted once the original
[PR](vllm-project/vllm#28988) is merged.

### Does this PR introduce _any_ user-facing change?
No

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026