
Reapply "[Refactor] Unify full-graph parameter update logic (#6041)" (#6227)#6231

Merged
wangxiyuan merged 1 commit intovllm-project:mainfrom
wangxiyuan:revert_revert
Jan 26, 2026

Conversation

@wangxiyuan
Collaborator

@wangxiyuan wangxiyuan commented Jan 25, 2026

This reverts commit 9564934.

The CI failure isn't related to this change. Let's reapply it.

@wangxiyuan wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels Jan 25, 2026
@github-actions
Copy link
Copy Markdown
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request refactors the full-graph parameter update logic by moving implementation-specific details into their respective attention backend classes and using a unified dispatch function. This is a good architectural improvement that enhances modularity. However, I've identified a critical issue where draft_attn_metadatas is not correctly propagated in the new implementation, which could lead to incorrect behavior for draft models in graph mode. I've provided a suggestion to address this.
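The refactor the review describes — backend-specific details moved into the attention backend classes, with a single unified entry point dispatching to them — can be sketched as follows. This is a toy illustration, not the real vllm-ascend code: all class names here are hypothetical stand-ins, and only `update_full_graph_params` is quoted from the PR.

```python
class AttentionBackendImpl:
    """Base interface: each backend supplies its own graph-param update."""
    def update_graph_params(self, forward_context, num_tokens):
        raise NotImplementedError

class AscendLikeBackend(AttentionBackendImpl):
    """Hypothetical backend standing in for an Ascend-style implementation."""
    def update_graph_params(self, forward_context, num_tokens):
        # A real backend would rewrite captured-graph tensors here; the
        # sketch just records what it was asked to do.
        forward_context["num_tokens"] = num_tokens
        return "ascend"

def update_full_graph_params(backend, forward_context, num_tokens):
    # Unified entry point: callers no longer branch on backend type,
    # they hand the active backend to one dispatch function.
    return backend.update_graph_params(forward_context, num_tokens)

ctx = {}
which = update_full_graph_params(AscendLikeBackend(), ctx, 16)
```

The design benefit is that adding a new backend only requires implementing `update_graph_params` on the backend class; no caller needs to change.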

Comment on lines +1181 to +1183
    update_full_graph_params(
        self.runner.attn_backend, self.update_stream, forward_context, num_tokens,
        self.vllm_config, self.vllm_config.speculative_config)
Contributor


Severity: critical

The draft_attn_metadatas parameter is passed into _update_full_graph_params but is not used. The new update_full_graph_params function does not accept it as an argument, and the underlying implementation in AscendAttentionBackendImpl.update_graph_params expects this data to be in forward_context.draft_attn_metadatas. By not setting it on the context, this crucial metadata for the draft model is lost, which will likely cause incorrect behavior or errors when running in graph mode. Please set draft_attn_metadatas on the forward_context before calling update_full_graph_params.

Suggested change

    - update_full_graph_params(
    -     self.runner.attn_backend, self.update_stream, forward_context, num_tokens,
    -     self.vllm_config, self.vllm_config.speculative_config)
    + forward_context.draft_attn_metadatas = draft_attn_metadatas
    + update_full_graph_params(
    +     self.runner.attn_backend, self.update_stream, forward_context, num_tokens,
    +     self.vllm_config, self.vllm_config.speculative_config)
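The failure mode the reviewer describes can be reproduced in miniature: the unified function takes no `draft_attn_metadatas` argument, so if the caller does not attach the metadata to the forward context first, the backend cannot find it. All names below other than `update_full_graph_params` and `draft_attn_metadatas` are hypothetical stand-ins, not the real vllm-ascend code.

```python
from types import SimpleNamespace

class FakeBackend:
    """Stand-in backend whose update reads metadata off the context."""
    def update_graph_params(self, forward_context):
        # Mirrors the described contract: the implementation expects
        # forward_context.draft_attn_metadatas to be set by the caller.
        metas = getattr(forward_context, "draft_attn_metadatas", None)
        if metas is None:
            raise RuntimeError("draft_attn_metadatas missing on forward_context")
        return len(metas)

def update_full_graph_params(backend, forward_context):
    # Unified dispatch: note there is no draft_attn_metadatas parameter,
    # so dropping it from the context silently loses the draft-model data.
    return backend.update_graph_params(forward_context)

ctx = SimpleNamespace()
try:
    update_full_graph_params(FakeBackend(), ctx)  # metadata was never attached
except RuntimeError:
    pass  # the bug: draft metadata is lost

# The reviewer's fix: set it on the context before the unified call.
ctx.draft_attn_metadatas = ["draft_meta_0", "draft_meta_1"]
n = update_full_graph_params(FakeBackend(), ctx)  # returns 2
```

In the real code the symptom may be an attribute error or silently wrong draft-model behavior in graph mode rather than an explicit exception; the sketch only makes the missing-metadata path visible.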

@wangxiyuan
Collaborator Author

@wangxiyuan wangxiyuan merged commit 4e3919e into vllm-project:main Jan 26, 2026
62 of 64 checks passed
@wangxiyuan wangxiyuan deleted the revert_revert branch January 26, 2026 11:31
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
…ject#6041)" (vllm-project#6227) (vllm-project#6231)

This reverts commit 9564934.

The CI failure isn't related to this change. Let's reapply it.

- vLLM version: v0.14.0
- vLLM main:
vllm-project/vllm@d682094
The same commit, with an identical message and version metadata, was subsequently picked up in several forks:

- ZRJ026 pushed it to ZRJ026/vllm-ascend on Feb 28 and Mar 4, 2026 (Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>)
- maoxx241 pushed it to maoxx241/vllm-ascend on Mar 2, 2026
- LCAIZJ pushed it to LCAIZJ/vllm-ascend on Mar 7, 2026
- jiangyunfan1 pushed it to jiangyunfan1/vllm-ascend on Apr 9, 2026
- yangzhe-2026 pushed it to yangzhe-2026/vllm-ascend on May 6, 2026

Labels

module:tests · ready (read for review) · ready-for-test (start test by label for PR)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant