
[Patch] Remove the patch of MiniCPM#5975

Merged
wangxiyuan merged 5 commits into vllm-project:main from gcanlin:rm-patch-minicpm
Feb 9, 2026
Merged

[Patch] Remove the patch of MiniCPM#5975
wangxiyuan merged 5 commits intovllm-project:mainfrom
gcanlin:rm-patch-minicpm

Conversation

@gcanlin
Collaborator

@gcanlin gcanlin commented Jan 17, 2026

What this PR does / why we need it?

Part of #5304.

After vllm-project/vllm#32523 was merged, we can remove the patch of `MiniCPMAttention`.

Does this PR introduce any user-facing change?

How was this patch tested?

Tested locally:

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "openbmb/MiniCPM-2B-sft-bf16"
prompt = [{"role": "user", "content": "Write an article about Artificial Intelligence."}]

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
input_text = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)

llm = LLM(
    model=model_name,
    trust_remote_code=True,
    max_num_batched_tokens=65536,
    dtype="bfloat16",
    gpu_memory_utilization=0.8,
)
sampling_params = SamplingParams(top_p=0.95, temperature=0.6, max_tokens=32768)

outputs = llm.generate(prompts=input_text, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)
```
```text
INFO 01-17 15:58:34 [llm.py:347] Supported tasks: ['generate']
Adding requests: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 57.79it/s]
Processed prompts:   0%|                                            | 0/1 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s](EngineCore_DP0 pid=536888) INFO 01-17 15:58:34 [acl_graph.py:188] Replaying aclgraph
Processed prompts: 100%|███████████████████████████████████| 1/1 [00:08<00:00,  8.41s/it, est. speed input: 1.66 toks/s, output: 51.36 toks/s]
```
 Artificial Intelligence (AI) is a field of computer science that focuses on creating machines that can perform tasks that typically require human intelligence. AI has been around for decades, but it is only in recent years that it has become more accessible and practical for everyday use.

One of the most significant developments in AI is machine learning. Machine learning is a subset of AI that focuses on teaching machines to learn from data. This means that instead of being programmed to perform a specific task, machines can learn on their own by analyzing data and making decisions based on that data.

The potential applications of AI are vast and varied. In healthcare, AI can be used to analyze medical images and help doctors diagnose diseases. In finance, AI can be used to analyze financial data and predict market trends. In transportation, AI can be used to optimize traffic flow and reduce congestion.

AI is also being used in gaming and entertainment. Games like AlphaGo have demonstrated the potential of AI to outperform humans at complex games like Go. AI is also being used in movies and TV shows to create realistic characters and environments.

While AI has many potential applications, there are also concerns about its impact on society. One concern is the potential loss of jobs as machines become more capable of performing tasks traditionally done by humans. There are also concerns about the ethical implications of AI, such as the use of facial recognition technology to identify and track individuals without their consent.

Despite these concerns, the potential benefits of AI are significant. AI has the potential to revolutionize many industries and improve our lives in countless ways. As AI continues to develop, it will be important for policymakers and industry leaders to address these concerns and ensure that AI is used in a responsible and ethical manner.

In conclusion, AI is a field of computer science that has the potential to revolutionize many industries and improve our lives in countless ways. While there are concerns about its impact on society, the potential benefits of AI are significant. As AI continues to develop, it will be important for policymakers and industry leaders to address these concerns and ensure that AI is used in a responsible and ethical manner.

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@gcanlin gcanlin requested a review from wangxiyuan as a code owner January 17, 2026 16:10
@github-actions
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request removes an obsolete patch for MiniCPMAttention, which is a good cleanup as it simplifies the codebase. However, the corresponding test file tests/ut/patch/worker/patch_common/test_patch_minicpm.py has not been removed. This will cause an ImportError and break the test suite. Please remove the test file as well.

I am having trouble creating individual review comments, so my feedback is included below.

vllm_ascend/patch/worker/patch_minicpm.py (18-36)

critical

With the removal of this patch file, the corresponding test file tests/ut/patch/worker/patch_common/test_patch_minicpm.py should also be removed. The test file imports from this patch file, and its removal without deleting the test file will lead to an ImportError, causing the CI to fail.

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@gcanlin
Collaborator Author

gcanlin commented Jan 18, 2026

Waiting for the main branch to be upgraded to 0118. Then we can merge this PR.

@wangxiyuan
Collaborator

Let's block this change until the vLLM change works with vLLM Ascend on main.

@gcanlin
Collaborator Author

gcanlin commented Jan 27, 2026

@wangxiyuan Ready to merge now.

@wangxiyuan
Collaborator

No, the vLLM commit vllm-project/vllm@fe36bf5 is not included in v0.14.1. Let's merge this once we upgrade to 0.15.

@gcanlin
Collaborator Author

gcanlin commented Jan 27, 2026

> No, the vLLM commit vllm-project/vllm@fe36bf5 is not included in v0.14.1. Let's merge this once we upgrade to 0.15.

Oh, okay. We're still keeping compatibility with 0.14.1.

@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@wangxiyuan
Collaborator

This PR can be rebased and merged now. @gcanlin

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@gcanlin
Collaborator Author

gcanlin commented Feb 9, 2026

> This PR can be rebased and merged now. @gcanlin

Done.

@wangxiyuan wangxiyuan merged commit b7aa511 into vllm-project:main Feb 9, 2026
25 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Feb 9, 2026
…to qwen3next_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [Patch] Remove the patch of MiniCPM (vllm-project#5975)
  [P/D] layerwise connector support recompute scheduler (vllm-project#5900)
  [CI] Add workflow support for lint image build (vllm-project#6489)
  [Bugfix] Fix problematic dummy_run & improper input_batch_size in eagle (vllm-project#6517)
  [Refactor]310p_e2e test case update (vllm-project#6539)
  [Refactor]refactor p2p connector (vllm-project#6551)
  [Refactor]refactor 310p attention impl and add ut (vllm-project#6579)
  [Refactor]refactor 310p ops and add ut (vllm-project#6591)
  [Ops][Refactor] Remove custom rotary_embedding operator (vllm-project#6523)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(new Batch vllm-project#8) (vllm-project#6604)
  [Test] Add initial multi modal cases of Qwen2.5-VL-7B-Instruct for disaggregated encoder  (vllm-project#5301)
  [CI] Fix broken CI (vllm-project#6599)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#10) (vllm-project#6173)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#11) (vllm-project#6176)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#8) (vllm-project#6129)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#7) (vllm-project#6023)
  [CI][Misc] Some improvement for github action (vllm-project#6587)
  [Image] Bump mooncake version to v0.3.8.post1 (vllm-project#6428)
chenchuw886 pushed a commit to chenchuw886/vllm-ascend that referenced this pull request Feb 12, 2026
### What this PR does / why we need it?

Part of vllm-project#5304.

After vllm-project/vllm#32523 merge, we could
remove the patch of `MiniCPMAttention`.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
Test it locally.

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@2c24bc6

---------

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: momochenchuw <chenchuw@huawei.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026