[Bugfix][Spec Decode] Avoid double call of Ngram CPU #36952

Merged
benchislett merged 1 commit into vllm-project:main from ekagra-ranjan:er-ngram-double on Mar 13, 2026

@ekagra-ranjan
Contributor

After PR #29184, the ngram CPU proposer was being called twice. This PR removes the duplicate call.

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
@ekagra-ranjan requested a review from njhill as a code owner on March 13, 2026
@mergify bot added the v1 and bug (Something isn't working) labels on Mar 13, 2026
@ekagra-ranjan changed the title from "[Bugfix] Avoid double call of Ngram CPU" to "[Bugfix][Spec Decode] Avoid double call of Ngram CPU" on Mar 13, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly removes a redundant call to `self.drafter.propose` within the `propose_draft_token_ids` method. This was causing the n-gram proposal logic to execute twice for the CPU-based `NgramProposer`. The removal of the duplicate code block resolves this performance issue. The fix is clean and accurate.
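
To make the failure mode concrete, here is a minimal, self-contained sketch of how a duplicated `propose` call doubles the work without changing the result. It is not the vLLM code: `CountingNgramProposer`, its toy matching rule, and the two free functions are hypothetical stand-ins, and only the names `propose` and `propose_draft_token_ids` are borrowed from the review above.

```python
class CountingNgramProposer:
    """Hypothetical stand-in for a CPU n-gram proposer that counts its calls."""

    def __init__(self, k: int = 2) -> None:
        self.k = k          # number of draft tokens to propose
        self.calls = 0      # how many times propose() has run

    def propose(self, token_ids: list[int]) -> list[int]:
        self.calls += 1
        if not token_ids:
            return []
        last = token_ids[-1]
        # Toy rule (an assumption, not vLLM's algorithm): find the most recent
        # earlier occurrence of the last token and copy the k tokens after it.
        for i in range(len(token_ids) - 2, -1, -1):
            if token_ids[i] == last:
                return token_ids[i + 1 : i + 1 + self.k]
        return []


def propose_draft_token_ids_buggy(drafter, token_ids):
    draft = drafter.propose(token_ids)
    # Leftover duplicate block: same inputs, same output, twice the CPU work.
    draft = drafter.propose(token_ids)
    return draft


def propose_draft_token_ids_fixed(drafter, token_ids):
    # Single call: the proposal is computed exactly once per step.
    return drafter.propose(token_ids)


if __name__ == "__main__":
    tokens = [1, 2, 3, 1, 2]
    buggy, fixed = CountingNgramProposer(), CountingNgramProposer()
    same = (propose_draft_token_ids_buggy(buggy, tokens)
            == propose_draft_token_ids_fixed(fixed, tokens))
    print(same, buggy.calls, fixed.calls)
```

Because the proposer in this sketch is deterministic, both variants return identical drafts, which is why a bug of this kind tends to show up only as extra latency rather than as wrong output.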

@benchislett added the ready (ONLY add when PR is ready to merge/full CI is needed) label on Mar 13, 2026
@benchislett enabled auto-merge (squash) on March 13, 2026 17:26
@benchislett merged commit d0b4029 into vllm-project:main on Mar 13, 2026
49 checks passed
Lucaskabela pushed a commit to Lucaskabela/vllm that referenced this pull request on Mar 17, 2026
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request on Mar 18, 2026
fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request on Mar 19, 2026
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request on Mar 27, 2026
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request on Mar 27, 2026

Each referencing commit carries the original sign-off:
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
The Monishver11 commit adds:
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Labels

bug (Something isn't working), ready (ONLY add when PR is ready to merge/full CI is needed), v1

2 participants