Skip to content

[Bugfix]Fixed precision issues caused by pooled request pooling#6049

Merged
wangxiyuan merged 8 commits intovllm-project:mainfrom
DreamerLeader:bug_fix_0120
Jan 20, 2026
Merged

[Bugfix]Fixed precision issues caused by pooled request pooling#6049
wangxiyuan merged 8 commits intovllm-project:mainfrom
DreamerLeader:bug_fix_0120

Conversation

@DreamerLeader
Copy link
Copy Markdown
Contributor

@DreamerLeader DreamerLeader commented Jan 20, 2026

What this PR does / why we need it?

Fixed precision issues caused by pooled request pooling

Does this PR introduce any user-facing change?

pr6045

How was this patch tested?

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
@github-actions
Copy link
Copy Markdown
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

@DreamerLeader DreamerLeader changed the title Fixed precision issues caused by pooled request pooling [Bugfix]Fixed precision issues caused by pooled request pooling Jan 20, 2026
房建伟 added 3 commits January 20, 2026 18:01
Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
@LCAIZJ LCAIZJ added ready read for review ready-for-test start test by label for PR labels Jan 20, 2026
@LCAIZJ
Copy link
Copy Markdown
Collaborator

LCAIZJ commented Jan 20, 2026

LGTM

@github-actions
Copy link
Copy Markdown
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

房建伟 and others added 2 commits January 20, 2026 23:07
房建伟 added 2 commits January 20, 2026 23:11
Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
zzzzwwjj pushed a commit that referenced this pull request Jan 20, 2026
### What this PR does / why we need it?
#6049
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
@wangxiyuan wangxiyuan merged commit b6d55fc into vllm-project:main Jan 20, 2026
18 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 21, 2026
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (24 commits)
  add dispath_ffn_combine_bf16 (vllm-project#5866)
  [BugFix] Fix input parameter bug of dispatch_gmm_combine_decode[RFC: issue 5476] (vllm-project#5932)
  [1/N][Feat] Xlite Qwen3 MoE Support (vllm-project#5951)
  [Bugfix] Fix setting of `speculative_config.enforce_eager` for dsv32 (vllm-project#5945)
  [bugfix][mm] change get_num_encoder_tokens to get_num_encoder_embeds in recompute_schedule.py (vllm-project#5132)
  [Bugfix] fix pcp qwen full graph FIA bug (vllm-project#6037)
  [Bugfix]Fixed precision issues caused by pooled request pooling (vllm-project#6049)
  【main】【bugfix】Resolved memory deallocation failure in the pooling layer under re-computation workloads. (vllm-project#6045)
  [main][Bugfix] Fixed an problem related to embeddings sharing (vllm-project#5967)
  [Feature]refactor the npugraph_ex config, support online-infer with static kernel (vllm-project#5775)
  [CI][Lint] Show lint diff on failure (vllm-project#5956)
  [CI] Add wait logic for each individual case (vllm-project#6036)
  [CI] Add DeepSeek-V3.2-W8A8 nightly ci test (vllm-project#4633)
  model runner v2 support triton of penalty (vllm-project#5854)
  [Docs][Model] Support Qwen3-VL-Embedding & Qwen3-VL-Reranker (vllm-project#6034)
  [Tests] move qwen3 performance test from nightly to e2e (vllm-project#5980)
  [Bugfix] fix bug of pcp+mtp+async scheduler (vllm-project#5994)
  [Main2Main] Upgrade vllm commit to releases/v0.14.0 (vllm-project#5988)
  [Ops] Add layernorm for qwen3Next (vllm-project#5765)
  [Doc] Add layer_sharding additional config for DeepSeek-V3.2-W8A8 (vllm-project#5921)
  ...
huangfeifei1995 pushed a commit to huangfeifei1995/vllm-ascend that referenced this pull request Jan 21, 2026
…-project#6049)

### What this PR does / why we need it?
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: huangning1995 <huangning12@huawei.com>
huangfeifei1995 added a commit to huangfeifei1995/vllm-ascend that referenced this pull request Jan 21, 2026
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
…-project#6049)

### What this PR does / why we need it?
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
…-project#6057)

### What this PR does / why we need it?
vllm-project#6049
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
…-project#6049)

### What this PR does / why we need it?
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
whx-sjtu pushed a commit that referenced this pull request Feb 4, 2026
### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](#6049)
readyhttps://github.com//pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
chenchuw886 pushed a commit to chenchuw886/vllm-ascend that referenced this pull request Feb 12, 2026
…roject#6126)

### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](vllm-project#6049)
readyhttps://github.com/vllm-project/pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: momochenchuw <chenchuw@huawei.com>
tangtiangu pushed a commit to tangtiangu/jiusi-vllm-ascend that referenced this pull request Feb 24, 2026
…-project#6057)

### What this PR does / why we need it?
vllm-project#6049
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
tangtiangu pushed a commit to tangtiangu/jiusi-vllm-ascend that referenced this pull request Feb 24, 2026
…-project#6057)

### What this PR does / why we need it?
vllm-project#6049
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…-project#6049)

### What this PR does / why we need it?
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…roject#6126)

### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](vllm-project#6049)
readyhttps://github.com/vllm-project/pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
…-project#6049)

### What this PR does / why we need it?
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
…roject#6126)

### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](vllm-project#6049)
readyhttps://github.com/vllm-project/pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…-project#6049)

### What this PR does / why we need it?
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…roject#6126)

### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](vllm-project#6049)
readyhttps://github.com/vllm-project/pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
…-project#6049)

### What this PR does / why we need it?
Fixed precision issues caused by pooled request pooling
### Does this PR introduce _any_ user-facing change?
pr6045
### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
…roject#6126)

### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](vllm-project#6049)
readyhttps://github.com/vllm-project/pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
jiangyunfan1 pushed a commit to jiangyunfan1/vllm-ascend that referenced this pull request Apr 9, 2026
…roject#6126)

### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](vllm-project#6049)
readyhttps://github.com/vllm-project/pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@d682094

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready read for review ready-for-test start test by label for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants