add dispath_ffn_combine_bf16 #5866
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request introduces a new operator dispatch_ffn_combine_bf16 for Mixture-of-Experts models on the CANN platform. The changes are extensive, covering operator definition, host and device-side implementations, PyTorch bindings, and tests. However, I've identified several critical issues related to correctness and potential runtime failures, including incorrect template instantiations, wrong PyTorch bindings, potential buffer overflows due to fixed-size arrays, and incomplete operator prototype implementations. These issues must be addressed to ensure the operator functions correctly and safely.
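For the binding and buffer-safety points above, here is a minimal sketch of what a defensive host-side registration can look like. Everything in it is illustrative: the `npu_ops` namespace, `kMaxExperts`, and the parameter list are assumptions for the sketch, not the PR's actual code (Ascend backends typically bind through the `PrivateUse1` dispatch key).

```cpp
#include <torch/library.h>
#include <torch/torch.h>

// Hypothetical upper bound for a statically sized expert table; the real
// limit would live in the operator's host/tiling code.
constexpr int64_t kMaxExperts = 256;

// Hypothetical host-side entry point. The name matches the new operator,
// but the parameter list here is an assumption for the sketch.
at::Tensor dispatch_ffn_combine_bf16(const at::Tensor& hidden_states,
                                     const at::Tensor& topk_ids,
                                     const at::Tensor& topk_weights,
                                     int64_t num_experts) {
  TORCH_CHECK(hidden_states.scalar_type() == at::kBFloat16,
              "dispatch_ffn_combine_bf16 expects bf16 hidden_states");
  // Validate dynamic sizes before writing into any fixed-size buffer, so an
  // oversized expert count fails loudly instead of overflowing.
  TORCH_CHECK(num_experts > 0 && num_experts <= kMaxExperts,
              "num_experts out of range: ", num_experts);
  TORCH_CHECK(topk_ids.sizes() == topk_weights.sizes(),
              "topk_ids and topk_weights must have matching shapes");
  // ... launch the device kernel here; placeholder output for the sketch.
  return at::empty_like(hidden_states);
}

// Define the op with an explicit schema so the Python-visible signature is
// pinned down, then bind the implementation to the NPU dispatch key.
TORCH_LIBRARY(npu_ops, m) {
  m.def("dispatch_ffn_combine_bf16(Tensor hidden_states, Tensor topk_ids, "
        "Tensor topk_weights, int num_experts) -> Tensor");
}

TORCH_LIBRARY_IMPL(npu_ops, PrivateUse1, m) {
  m.impl("dispatch_ffn_combine_bf16", dispatch_ffn_combine_bf16);
}
```

With an explicit schema, a mismatch between the declared signature and the C++ implementation is reported when the library loads, rather than surfacing as a wrong-argument failure mid-inference.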
Force-pushed from e729497 to 0cb311a
Signed-off-by: guanguan0308 <1546542263@qq.com>
Force-pushed from 175ce0b to d561470
Signed-off-by: guanguan0308 <1546542263@qq.com>
Force-pushed from 12d5c9d to e4e76cd
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (24 commits)
  add dispath_ffn_combine_bf16 (vllm-project#5866)
  [BugFix] Fix input parameter bug of dispatch_gmm_combine_decode[RFC: issue 5476] (vllm-project#5932)
  [1/N][Feat] Xlite Qwen3 MoE Support (vllm-project#5951)
  [Bugfix] Fix setting of `speculative_config.enforce_eager` for dsv32 (vllm-project#5945)
  [bugfix][mm] change get_num_encoder_tokens to get_num_encoder_embeds in recompute_schedule.py (vllm-project#5132)
  [Bugfix] fix pcp qwen full graph FIA bug (vllm-project#6037)
  [Bugfix] Fixed precision issues caused by pooled request pooling (vllm-project#6049)
  【main】【bugfix】Resolved memory deallocation failure in the pooling layer under re-computation workloads. (vllm-project#6045)
  [main][Bugfix] Fixed an problem related to embeddings sharing (vllm-project#5967)
  [Feature] refactor the npugraph_ex config, support online-infer with static kernel (vllm-project#5775)
  [CI][Lint] Show lint diff on failure (vllm-project#5956)
  [CI] Add wait logic for each individual case (vllm-project#6036)
  [CI] Add DeepSeek-V3.2-W8A8 nightly ci test (vllm-project#4633)
  model runner v2 support triton of penalty (vllm-project#5854)
  [Docs][Model] Support Qwen3-VL-Embedding & Qwen3-VL-Reranker (vllm-project#6034)
  [Tests] move qwen3 performance test from nightly to e2e (vllm-project#5980)
  [Bugfix] fix bug of pcp+mtp+async scheduler (vllm-project#5994)
  [Main2Main] Upgrade vllm commit to releases/v0.14.0 (vllm-project#5988)
  [Ops] Add layernorm for qwen3Next (vllm-project#5765)
  [Doc] Add layer_sharding additional config for DeepSeek-V3.2-W8A8 (vllm-project#5921)
  ...
### What this PR does / why we need it?
add dispath_ffn_combine_bf16
- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@bde38c1
---------
Signed-off-by: guanguan0308 <1546542263@qq.com>
Signed-off-by: huangning1995 <huangning12@huawei.com>
This reverts commit 2073197.
### What this PR does / why we need it?
add dispath_ffn_combine_bf16
- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@bde38c1
---------
Signed-off-by: guanguan0308 <1546542263@qq.com>
### What this PR does / why we need it?
add dispath_ffn_combine_bf16
- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@bde38c1
---------
Signed-off-by: guanguan0308 <1546542263@qq.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
### What this PR does / why we need it?
add dispath_ffn_combine_bf16
### Does this PR introduce any user-facing change?
### How was this patch tested?