[Bugfix] add in_graph_capturing comment #5072
dragondream-chen wants to merge into vllm-project:main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Code Review
This pull request introduces a bugfix for speculative decoding with graph capture. The changes correctly propagate the graph capture status to the MTP proposer and adjust the logic within the proposer to only capture the first speculative token generation, which is the intended behavior. A helpful comment has also been added to clarify the purpose of the force_attention flag. The fix appears correct and addresses the issue effectively.
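The behavior the review describes can be sketched as follows. This is a hypothetical, simplified illustration, not vLLM's actual classes or signatures: the class names, the `in_graph_capturing` attribute placement, and the `propose`/`capture_model` methods are stand-ins for how the runner might propagate its graph-capture status to an MTP-style proposer, which then forces attention capture only for the first speculative token.

```python
class MtpProposer:
    """Toy stand-in for an MTP proposer (hypothetical, not vLLM's API)."""

    def __init__(self, num_speculative_tokens: int) -> None:
        self.num_speculative_tokens = num_speculative_tokens
        # Set by the model runner while it is capturing the graph.
        self.in_graph_capturing = False

    def propose(self) -> list[bool]:
        force_flags = []
        for step in range(self.num_speculative_tokens):
            # force_attention: per the intended behavior, only the first
            # speculative token generation is captured into the graph.
            force_attention = self.in_graph_capturing and step == 0
            force_flags.append(force_attention)
        return force_flags


class ModelRunner:
    """Toy runner that propagates its capture status to the proposer."""

    def __init__(self, proposer: MtpProposer) -> None:
        self.proposer = proposer

    def capture_model(self) -> list[bool]:
        # Propagate the graph-capture status down to the proposer, and
        # reset it once capture is done so normal decoding is unaffected.
        self.proposer.in_graph_capturing = True
        try:
            return self.proposer.propose()
        finally:
            self.proposer.in_graph_capturing = False


runner = ModelRunner(MtpProposer(num_speculative_tokens=3))
result = runner.capture_model()  # only step 0 forces attention capture
```

Keeping the flag on the proposer and resetting it in a `finally` block mirrors the idea that capture status is a transient runner-side condition rather than a permanent proposer configuration.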
Force-pushed from 7ffa898 to b7eb0cf
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Force-pushed from f3bf192 to 8de6049
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Force-pushed from 8de6049 to 320877d
What this PR does / why we need it?
Adds a comment explaining the `in_graph_capturing` value.
Does this PR introduce any user-facing change?
How was this patch tested?