[ROCm] Fix KV copy methods and auto-select attention backend for ROCm #36845

Merged
tjtanaa merged 2 commits into vllm-project:main from ROCm:akaratza_fix_spec_dec on Mar 16, 2026


@AndreasKaratzas (Collaborator) commented Mar 12, 2026

  • Added insert_blocks_to_device and swap_out_blocks_to_host to RocmPlatform. These were only defined on CudaPlatform, causing a TypeError: 'NoneType' object is not callable crash when NixlConnector tried to copy KV blocks between GPU and CPU buffers during prefill/decode disaggregation on ROCm.

  • Updated spec_decode_acceptance_test.sh to auto-select the attention backend based on the detected GPU platform: TRITON_ATTN on ROCm, FLASH_ATTN on NVIDIA. Previously the script hardcoded FLASH_ATTN regardless of platform. The backend can still be overridden via ATTENTION_BACKEND=<value>.
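The two changes above can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in: the class names, the dict-based "caches", the `copy_blocks` helper, and the `is_rocm` parameter are assumptions made for illustration and are not vLLM's actual classes, signatures, or detection logic. Only the method names `insert_blocks_to_device` and `swap_out_blocks_to_host`, the backend names, and the `ATTENTION_BACKEND` override come from this PR description. The acceptance test itself is a shell script; the selection logic is shown in Python here only to keep the sketch in one language.

```python
import os


class CudaLikePlatform:
    """Hypothetical platform that defines both copy hooks."""

    @classmethod
    def insert_blocks_to_device(cls, src, dst, src_ids, dst_ids):
        # Host -> device: copy each source block into its destination slot.
        for s, d in zip(src_ids, dst_ids):
            dst[d] = src[s]

    @classmethod
    def swap_out_blocks_to_host(cls, src, dst, src_ids, dst_ids):
        # Device -> host: same copy, opposite direction.
        for s, d in zip(src_ids, dst_ids):
            dst[d] = src[s]


class RocmLikePlatformBefore:
    """Hypothetical platform that is missing the copy hooks."""


def copy_blocks(platform, src, dst, src_ids, dst_ids):
    # Assumption: the caller resolves the hook to None when it is missing.
    # Calling None then raises "TypeError: 'NoneType' object is not callable".
    copy_fn = getattr(platform, "insert_blocks_to_device", None)
    copy_fn(src, dst, src_ids, dst_ids)


def select_attention_backend(is_rocm, env=None):
    # An explicit ATTENTION_BACKEND value always wins over the platform default.
    env = os.environ if env is None else env
    override = env.get("ATTENTION_BACKEND")
    if override:
        return override
    return "TRITON_ATTN" if is_rocm else "FLASH_ATTN"
```

With these stand-ins, `copy_blocks(RocmLikePlatformBefore, ...)` fails with the TypeError quoted above, while the same call against a platform that defines the hooks succeeds. How the real script detects the GPU platform is not stated here, so the sketch takes it as an argument.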

cc @kenroche

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@mergify bot added the rocm label (Related to AMD ROCm) Mar 12, 2026
@github-project-automation bot moved this to Todo in AMD Mar 12, 2026
@AndreasKaratzas added the ready label (ONLY add when PR is ready to merge/full CI is needed) Mar 12, 2026
@DarkLight1337 (Member) commented

cc @tjtanaa

@tjtanaa (Collaborator) left a comment

LGTM

@tjtanaa merged commit 911355e into vllm-project:main Mar 16, 2026
49 checks passed
@github-project-automation bot moved this from Todo to Done in AMD Mar 16, 2026
@AndreasKaratzas deleted the akaratza_fix_spec_dec branch March 16, 2026 16:14
Lucaskabela pushed a commit to Lucaskabela/vllm that referenced this pull request Mar 17, 2026
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026

Labels

kv-connector, ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm), v1

Projects

Status: Done


3 participants