[cpu][bench] Add CPU paged attention benchmarks#31720

Merged
bigPYJ1151 merged 2 commits into vllm-project:main from fadara01:cpu_attn_benchmark
Jan 6, 2026

fadara01 (Contributor) commented Jan 5, 2026

Purpose

Add CPU paged attention benchmarks
Fixes: #30374

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • [Y] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify bot added the "performance" (Performance-related issues) label Jan 5, 2026
mergify bot added the "cpu" (Related to CPU backends) label Jan 5, 2026
fadara01 (Contributor, Author) commented Jan 5, 2026

@bigPYJ1151 could you please review?

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a benchmark script for CPU paged attention, which is a valuable addition for performance testing and optimization. The script is well-structured and provides a good range of configurable parameters. My review focuses on improving the robustness of the main benchmark function to prevent potential runtime errors if it's used in different contexts.

Fixes: vllm-project#30374

Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
bigPYJ1151 (Member) left a comment

Thanks!

bigPYJ1151 enabled auto-merge (squash) January 6, 2026 08:22
github-actions bot added the "ready" (ONLY add when PR is ready to merge/full CI is needed) label Jan 6, 2026
bigPYJ1151 merged commit 799b572 into vllm-project:main Jan 6, 2026
17 checks passed
LucasWilkinson pushed a commit to neuralmagic/vllm that referenced this pull request Jan 6, 2026
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>

Labels

cpu (Related to CPU backends), performance (Performance-related issues), ready (ONLY add when PR is ready to merge/full CI is needed)

Development

Successfully merging this pull request may close these issues.

[Feature][CPU Backend]: Add Paged Attention Benchmarks for CPU backend

2 participants