[v0.13.0][Eagle3] Extend PR #5786 to eagle3 #6443

Merged
wangxiyuan merged 1 commit into vllm-project:releases/v0.13.0 from Angazenn:bugfix_dev on Jan 30, 2026
Conversation

@Angazenn
Collaborator

What this PR does / why we need it?

This PR extends #5786 to eagle3 speculative decoding when it is used with prefill/decode (PD) disaggregation and async scheduling.

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: Angazenn <supperccell@163.com>
Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request extends a feature previously specific to the 'mtp' speculative decoding method to be applicable to other methods like 'eagle3', particularly in scenarios involving prefill/decode disaggregation and asynchronous scheduling. The changes in vllm_ascend/patch/worker/patch_model_runner.py and vllm_ascend/platform.py correctly generalize the logic by removing the explicit check for method == 'mtp'. This allows the necessary patch for handling draft tokens in async scheduling to be applied for any speculative decoding method when the required conditions are met. The changes are logical and align with the stated goal of the PR. I don't see any issues with the proposed changes.
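The generalization described in this review can be sketched as follows. This is a minimal illustration only, not the code in vllm_ascend/patch/worker/patch_model_runner.py or vllm_ascend/platform.py: the `SpecConfig` class, both function names, and the `pd_disaggregated` and `async_scheduling` parameters are hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpecConfig:
    """Hypothetical stand-in for a speculative decoding config."""
    method: str  # e.g. "mtp" or "eagle3"


def needs_patch_before(spec: Optional[SpecConfig],
                       pd_disaggregated: bool,
                       async_scheduling: bool) -> bool:
    # Assumed shape of the old condition: only the 'mtp' method qualifies.
    return (spec is not None
            and spec.method == "mtp"
            and pd_disaggregated
            and async_scheduling)


def needs_patch_after(spec: Optional[SpecConfig],
                      pd_disaggregated: bool,
                      async_scheduling: bool) -> bool:
    # Assumed shape of the new condition: the method check is dropped,
    # so any speculative decoding method qualifies.
    return spec is not None and pd_disaggregated and async_scheduling


if __name__ == "__main__":
    eagle3 = SpecConfig(method="eagle3")
    print(needs_patch_before(eagle3, True, True))
    print(needs_patch_after(eagle3, True, True))
```

Under these assumptions, an eagle3 configuration is rejected by the old condition and accepted by the new one, while a run without speculative decoding, without PD disaggregation, or without async scheduling is unaffected.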

@Angazenn added the ready and ready-for-test labels on Jan 30, 2026
@wangxiyuan merged commit 2b34a1a into vllm-project:releases/v0.13.0 on Jan 30, 2026
18 checks passed
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
…6443)

### What this PR does / why we need it?
This PR extends vllm-project#5786 to eagle3 spec decode when used with
pd-disaggregation + async-scheduling.

Signed-off-by: Angazenn <supperccell@163.com>
SkychenLee pushed a commit to SkychenLee/vllm-ascend that referenced this pull request Jan 31, 2026
…6443)

### What this PR does / why we need it?
This PR extends vllm-project#5786 to eagle3 spec decode when used with
pd-disaggregation + async-scheduling.

Signed-off-by: Angazenn <supperccell@163.com>
Signed-off-by: l00832868 <litianchen2@huawei.com>