Revert "[Bugfix] mamba: run single-token extends as decodes" (#42430) #43034

Draft
vllm-agent wants to merge 1 commit into vllm-project:main from vllm-agent:auto-revert/pr-42430

@vllm-agent

Contributor

Revert of #42430

This reverts commit 47829b1 (PR #42430).

Reason: CI nightly build #66759 detected 1 new failure linked to this PR:

  • Hybrid SSM NixlConnector PD accuracy tests (4 GPUs): test_accuracy fails with AssertionError: Expected: 0.8 | Measured: 0.7498104624715694. Accuracy regressed from 0.80 to 0.75 for the granite-4.0-h-tiny hybrid SSM model with HMA enabled.

Original PR: #42430


Auto-generated by CI failure analyzer.

mergify bot added the v1, bug, and kv-connector labels on May 19, 2026

gemini-code-assist bot left a comment

Contributor

Code Review

This pull request removes the logic in the Mamba attention backend that automatically converts single-token prefills with prior state into decodes. Along with this change, the PR removes associated tests and refactors the test suite by deleting the shared MockMambaBuilder utility in favor of a local concrete implementation within the test files. I have no feedback to provide as there were no review comments to assess.
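
For readers unfamiliar with the behavior being reverted, the following is a minimal sketch of what "converting single-token prefills with prior state into decodes" could look like. It is not the vLLM implementation: `ScheduledRequest`, its fields, `classify`, and the `convert_single_token_extends` flag are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ScheduledRequest:
    # Hypothetical stand-in for per-request scheduling metadata.
    query_len: int         # new tokens scheduled in this step
    context_len: int       # tokens already computed (prior state exists if > 0)
    in_prompt_phase: bool  # True while the request is still consuming its prompt


def classify(req: ScheduledRequest, convert_single_token_extends: bool) -> str:
    """Return "decode" or "prefill" for one request.

    With convert_single_token_extends=True, a prompt-phase request that
    schedules exactly one token on top of existing state is treated as a
    decode. With False (the assumed post-revert behavior), it stays a prefill.
    """
    if not req.in_prompt_phase:
        return "decode"
    is_single_token_extend = req.query_len == 1 and req.context_len > 0
    if convert_single_token_extends and is_single_token_extend:
        return "decode"
    return "prefill"


# A prompt-phase request with one remaining token and 127 tokens of prior state.
extend = ScheduledRequest(query_len=1, context_len=127, in_prompt_phase=True)
label_with_conversion = classify(extend, convert_single_token_extends=True)
label_after_revert = classify(extend, convert_single_token_extends=False)
```

In this sketch the same request is labeled differently depending on the flag, which is the kind of routing change that could plausibly affect a downstream accuracy test; the actual cause of the regression is not established in this thread.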

Labels

bug, kv-connector, v1
