[Image] Refactor image build by Potabk · Pull Request #5175 · vllm-project/vllm-ascend

Potabk · 2025-12-18T13:14:49Z

What this PR does / why we need it?

Previously, we built images with a hybrid-architecture cross-compilation approach. This method had a problem: cross-compilation performance was very poor, leading to extremely long build times (about 4h) and occasional failures (see https://github.com/vllm-project/vllm-ascend/actions/runs/20152861650/job/57849208186). Therefore, I recommend building each architecture separately and then merging the manifests, which significantly reduces image build time (20min).
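
The "separate build, then manifest merge" step can be sketched roughly as follows. This is a minimal illustration, not the workflow in this PR: it assumes each native build has already pushed its image by digest, and it only assembles the `docker buildx imagetools create` command that would stitch those digests into one multi-arch tag. The function name, repository name, tag, and digests are all hypothetical.

```python
from typing import Dict, List


def manifest_merge_command(repo: str, tag: str, digests: Dict[str, str]) -> List[str]:
    """Build the argv that merges per-architecture images into one tag.

    `digests` maps a platform name (e.g. "linux/amd64") to the digest pushed
    by the native build for that platform. All names are illustrative.
    """
    if not digests:
        raise ValueError("at least one per-architecture digest is required")

    cmd = ["docker", "buildx", "imagetools", "create", "-t", f"{repo}:{tag}"]

    # Sort by platform so the command is deterministic across runs.
    for platform in sorted(digests):
        digest = digests[platform]
        if not digest.startswith("sha256:"):
            raise ValueError(f"unexpected digest for {platform}: {digest!r}")
        # Each source is referenced by digest, not by tag.
        cmd.append(f"{repo}@{digest}")
    return cmd


if __name__ == "__main__":
    # Placeholder digests; a real run would read them from the build jobs.
    example = {
        "linux/arm64": "sha256:" + "b" * 64,
        "linux/amd64": "sha256:" + "a" * 64,
    }
    print(" ".join(manifest_merge_command("example.io/org/image", "demo", example)))
```

Because every architecture is built natively on its own runner, no job pays the emulation cost of cross-compilation; the merge step only writes a manifest list and does not rebuild anything.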

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.12.0
vLLM main: vllm-project/vllm@ad32e3e

gemini-code-assist · 2025-12-18T13:14:54Z

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

Signed-off-by: wangli <wangli858794774@gmail.com>

github-actions · 2025-12-18T14:01:58Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

Potabk · 2025-12-19T03:16:16Z

Tested in my repo: https://github.com/nv-action/vllm-benchmarks/actions/runs/20358096054/job/58497645337

Potabk · 2025-12-19T03:17:14Z

Also cc @Yikun

…to eplb_refactor

* 'main' of https://github.com/vllm-project/vllm-ascend: (52 commits)
  [Doc]Add the user_guide doc file regarding fine-grained TP. (vllm-project#5084)
  [pref] qwen3_next add triton ops : fused_sigmoid_gating_delta_rule_update (vllm-project#4818)
  [Feature] Add token mask for DispatchGmmCombineDecode operator (vllm-project#5171)
  [CI] Improve CI (vllm-project#5078)
  [Refactor] remove some metadata variables in attention_v1. (vllm-project#5160)
  Add Qwen3-VL-235B-A22B-Instruct tutorials (vllm-project#5167)
  [Doc] Add a perf tune section (vllm-project#5127)
  [Image] Refactor image build (vllm-project#5175)
  [refactor] refactor weight trans nz and transpose (vllm-project#4878)
  [BugFix]Fix precision issue for LoRA feature (vllm-project#4141)
  【Doc】Deepseekv3.1/R1 doc enhancement (vllm-project#4827)
  support basic long_seq feature st (vllm-project#5140)
  [Bugfix] install trition for test_custom_op (vllm-project#5112)
  [2/N][Pangu][MoE] Remove Pangu Related Code (vllm-project#5130)
  [bugfix] Use FUSED_MC2 MoE comm path for the op `dispatch_ffn_combine` (vllm-project#5156)
  [BugFix] Fix top_p,top_k issue with EAGLE and add top_p,top_k in EAGLE e2e (vllm-project#5131)
  [Doc][P/D] Fix MooncakeConnector's name (vllm-project#5172)
  [Bugfix] Fix in_profile_run in mtp_proposer dummy_run (vllm-project#5165)
  [Doc] Refact benchmark doc (vllm-project#5173)
  [Nightly] Avoid max_model_len being smaller than the decoder prompt to prevent single-node-accuray-tests from failing (vllm-project#5174)
  ...

Signed-off-by: 白永斌 <baiyongbin3@h-partners.com>

refact image build

7503e48

Signed-off-by: wangli <wangli858794774@gmail.com>

Potabk force-pushed the image branch from 5760660 to 7503e48 on December 18, 2025 13:15

github-actions bot added the ci/build label Dec 18, 2025

wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels Dec 18, 2025

download digests separate

aad22ba

Signed-off-by: wangli <wangli858794774@gmail.com>

Potabk removed the ready (read for review) and ready-for-test (start test by label for PR) labels Dec 19, 2025

add label trigger

febe1f8

Signed-off-by: wangli <wangli858794774@gmail.com>

Potabk added and removed the ready (read for review) and ready-for-test (start test by label for PR) labels Dec 19, 2025

Potabk requested review from Yikun and wangxiyuan December 19, 2025 03:16

Potabk force-pushed the image branch from 8a5e08c to 6b0c0c6 on December 19, 2025 03:42

use the right path

62d65ff

Signed-off-by: wangli <wangli858794774@gmail.com>

Potabk force-pushed the image branch from 6b0c0c6 to 62d65ff on December 19, 2025 03:43

wangxiyuan approved these changes Dec 19, 2025

wangxiyuan merged commit a6eaf81 into vllm-project:main Dec 19, 2025
28 checks passed

Potabk deleted the image branch December 19, 2025 06:42

Potabk mentioned this pull request Dec 19, 2025

[CI] Fix image merge bug #5197

Merged

wangxiyuan pushed a commit that referenced this pull request Dec 19, 2025

[CI] Fix image merge bug (#5197)

14931d2

### What this PR does / why we need it?
Some tiny bugfix for #5175

Signed-off-by: wangli <wangli858794774@gmail.com>

chenaoxuan pushed a commit to chenaoxuan/vllm-ascend that referenced this pull request Dec 20, 2025

[CI] Fix image merge bug (vllm-project#5197)

9e11c7b

### What this PR does / why we need it?
Some tiny bugfix for vllm-project#5175

Signed-off-by: wangli <wangli858794774@gmail.com>

wangxiyuan merged 4 commits into vllm-project:main from Potabk:image

2 participants
