
vllm-cpu attention fix #250

Merged
rroshan-rh merged 1 commit into rhoai-3.2 from cpu-atten-s390x
Jan 15, 2026

Conversation

Meghagaur (Contributor) commented Jan 14, 2026

Reintroduce support for head dimensions 80 and 112 in the CPU attention backend. These were removed upstream in vllm-project/vllm#27954, but they are commonly used by Granite models deployed on Z architectures. Because these head sizes are not friendly to the Intel AMX instruction set, the implementation now falls back to vec16 for them.
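The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the actual vLLM code: the function name and the exact gating criterion (AMX requiring head sizes divisible by 32) are assumptions made for the example.

```python
# Hypothetical sketch of per-head-size kernel selection in a CPU
# attention backend. Assumption: the AMX-tiled kernel needs head sizes
# divisible by 32, so sizes like 80 and 112 (used by Granite models)
# fall back to a generic 16-wide vectorized ("vec16") path.
def select_attention_kernel(head_size: int) -> str:
    if head_size % 32 == 0:
        return "amx"    # AMX-friendly head size
    if head_size % 16 == 0:
        return "vec16"  # fallback path reintroduced by this PR
    raise ValueError(f"unsupported head size: {head_size}")

for hs in (64, 80, 96, 112, 128):
    print(hs, "->", select_attention_kernel(hs))
```

Under this assumed rule, head sizes 64, 96, and 128 stay on the AMX path, while 80 and 112 take the vec16 fallback instead of being rejected outright.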

Test Plan
Build Docker image and test using ibm-granite/granite-3b-code-base-2k model which has head size of 80.

Upstream PR - vllm-project/vllm#31968
Main branch PR - #251

Meghagaur (Contributor, Author):

/build-konflux

Meghagaur (Contributor, Author) commented Jan 15, 2026

Hi @wznoinsk,
Could you please review this PR? The Konflux build has passed.
This is required for vLLM CPU to work correctly on RHOAI 3.2.
Thank you!

@Meghagaur Meghagaur requested a review from wznoinsk January 15, 2026 04:43
@rroshan-rh rroshan-rh merged commit 7552823 into rhoai-3.2 Jan 15, 2026
2 checks passed
Shafi-Hussain pushed a commit to odh-on-pz/vllm-cpu that referenced this pull request Feb 9, 2026
