[FEAT] [AITER] [ROCm] integrate aiter bpreshuffle and ck ops#28837

Draft
vllmellm wants to merge 43 commits into vllm-project:main from EmbeddedLLM:refactor/aiter_integration

vllmellm (Contributor) commented Nov 17, 2025

Purpose

This PR integrates the Aiter bpreshuffle and ck FP8 operators for ROCm, targeting Per-Token/Per-Channel (PTPC) quantization.

  • Adds AiterBpreshufflePerTokenFp8ScaledMMLinearKernel and AiterCKPerTokenFp8ScaledMMLinearKernel.
  • Hooks these kernels into the FP8ScaledMMLinearLayerConfig selection logic.
  • Triggers the bpreshuffle weight shuffling from compressed_tensors_w8a8_fp8.py.
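For readers unfamiliar with PTPC, the arithmetic that a per-token/per-channel scaled matmul performs can be sketched in plain Python. This is only an illustration of the scaling scheme, not the AITER kernels themselves: integer rounding stands in for the FP8 cast, and `Q_MAX`, `quantize_rows`, `ptpc_scaled_mm`, and `reference_mm` are hypothetical names introduced for this example.

```python
# Illustrative PTPC (per-token activation / per-channel weight) scaled matmul.
# Assumption: integer rounding stands in for the FP8 cast, and Q_MAX is an
# assumed stand-in for the largest representable quantized magnitude.
Q_MAX = 240.0


def quantize_rows(matrix):
    """Quantize each row with its own scale; return (quantized_rows, scales)."""
    quantized, scales = [], []
    for row in matrix:
        amax = max(abs(v) for v in row)
        scale = amax / Q_MAX if amax > 0 else 1.0
        quantized.append([round(v / scale) for v in row])
        scales.append(scale)
    return quantized, scales


def ptpc_scaled_mm(a_q, a_scales, w_q, w_scales):
    """out[i][j] = dot(a_q[i], w_q[j]) * a_scales[i] * w_scales[j]."""
    return [
        [
            sum(x * y for x, y in zip(a_row, w_row)) * a_s * w_s
            for w_row, w_s in zip(w_q, w_scales)
        ]
        for a_row, a_s in zip(a_q, a_scales)
    ]


def reference_mm(a, w):
    """Unquantized reference: out[i][j] = dot(a[i], w[j])."""
    return [[sum(x * y for x, y in zip(a_row, w_row)) for w_row in w] for a_row in a]


activations = [[0.5, -1.25, 2.0], [10.0, 0.1, -3.0]]  # 2 tokens x 3 features
weights = [[0.2, 0.4, -0.6], [1.5, -0.5, 0.25]]       # 2 output channels x 3 features

a_q, a_scales = quantize_rows(activations)  # one scale per token
w_q, w_scales = quantize_rows(weights)      # one scale per output channel
approx = ptpc_scaled_mm(a_q, a_scales, w_q, w_scales)
exact = reference_mm(activations, weights)
```

Because each token and each output channel carries its own scale, a large-magnitude token (the second row here) does not reduce the precision available to the other tokens.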

Test Plan

Testing is performed by launching the vLLM server with specific environment variables to select each kernel, then running lm_eval and a benchmark.
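The idea of selecting a kernel through an environment variable can be sketched roughly as follows. This is a generic illustration and not vLLM's actual selection code: the variable name `EXAMPLE_FP8_KERNEL`, the `KernelSpec` class, and `select_kernel` are all hypothetical; only the two kernel names come from this PR.

```python
import os


class KernelSpec:
    """Hypothetical stand-in for a kernel class with a support check."""

    def __init__(self, name, is_supported):
        self.name = name
        self.is_supported = is_supported  # callable returning bool


def select_kernel(candidates, env=None, override_var="EXAMPLE_FP8_KERNEL"):
    """Return the forced kernel if the override is set, else the first supported one."""
    env = os.environ if env is None else env
    forced = env.get(override_var)
    if forced:
        for kernel in candidates:
            if kernel.name == forced:
                if not kernel.is_supported():
                    raise RuntimeError(f"{forced} is not supported on this platform")
                return kernel
        raise ValueError(f"unknown kernel requested: {forced}")
    for kernel in candidates:
        if kernel.is_supported():
            return kernel
    raise RuntimeError("no supported kernel found")


# Candidate order doubles as priority order in this sketch (an assumption).
candidates = [
    KernelSpec("AiterBpreshufflePerTokenFp8ScaledMMLinearKernel", lambda: True),
    KernelSpec("AiterCKPerTokenFp8ScaledMMLinearKernel", lambda: True),
]

default_choice = select_kernel(candidates, env={})
forced_choice = select_kernel(
    candidates,
    env={"EXAMPLE_FP8_KERNEL": "AiterCKPerTokenFp8ScaledMMLinearKernel"},
)
```

Forcing one kernel per server launch in this way lets each kernel be evaluated in isolation with the same lm_eval and benchmark commands.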

Test Result

Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
mergify bot added the nvidia label Nov 17, 2025

mergify bot commented Nov 17, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @vllmellm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the rocm and needs-rebase labels Nov 17, 2025
mergify bot removed the needs-rebase label Dec 2, 2025
mergify bot added the cpu label Dec 17, 2025

mergify bot commented Dec 17, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @vllmellm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Labels

cpu, needs-rebase, nvidia, rocm
