
Add support for chunked attention (#597)#683

Closed
jkaniecki wants to merge 5 commits into
vllm-project:mainfrom
jkaniecki:main

Conversation

@jkaniecki
Contributor

Cherry-pick of vllm-project@6e1be4e

---------

Signed-off-by: Jan Kaniecki <jkaniecki@habana.ai>
Signed-off-by: Jan Kaniecki <jan.kaniecki@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR adds support for chunked attention to the vLLM-Gaudi implementation, cherry-picked from the upstream vllm-gaudi repository. Chunked attention divides attention computation into smaller chunks, which can help with memory efficiency and performance for long sequences.

Key changes:

  • Added chunked attention bias computation for both prefill and decode phases
  • Extended attention metadata structures to include chunked attention fields
  • Integrated chunked attention configuration detection and layer setup
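To make the idea concrete, here is a minimal NumPy sketch of a chunked causal attention bias of the kind described above. This is not the PR's HPU implementation; the function names and the dense-mask formulation are illustrative assumptions. Each query position attends causally, but only to key positions inside its own fixed-size chunk, which bounds the attention span for long sequences.

```python
import numpy as np

def chunked_causal_bias(seq_len: int, chunk_size: int) -> np.ndarray:
    """Additive attention bias: position i may attend to position j only if
    j <= i (causal) and both positions fall in the same fixed-size chunk.
    Allowed pairs get 0.0; disallowed pairs get -inf (masked out by softmax)."""
    pos = np.arange(seq_len)
    same_chunk = (pos[:, None] // chunk_size) == (pos[None, :] // chunk_size)
    causal = pos[None, :] <= pos[:, None]
    return np.where(same_chunk & causal, 0.0, -np.inf)

def chunked_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
                      chunk_size: int) -> np.ndarray:
    """Single-head softmax attention with the chunked causal bias applied."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d) + chunked_causal_bias(len(q), chunk_size)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v
```

Note that the first token of every chunk can attend only to itself, since all earlier positions lie in previous chunks; this is the bias-boundary behavior the prefill and decode paths both have to respect.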

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Reviewed files:

  • vllm_gaudi/v1/worker/hpu_model_runner.py — core implementation of chunked attention, including bias computation, block mapping, metadata updates, and model initialization logic
  • vllm_gaudi/v1/attention/backends/hpu_attn.py — updated the decode metadata factory method to accept chunked attention parameters
  • vllm_gaudi/attention/backends/hpu_attn.py — added chunked attention metadata fields and logic to select the appropriate attention blocks during the forward pass


@github-actions

github-actions Bot commented Dec 4, 2025

✅ CI Passed

All checks passed successfully against the following vllm commit:
1b7c7f5159484063af28cb47809d79e83d3301ec

@github-actions

github-actions Bot commented Dec 9, 2025

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
e2ed238885be6af358be1851cd43105b7d036c49

@github-actions
Copy link
Copy Markdown

✅ CI Passed

All checks passed successfully against the following vllm commit:
17fec3af0942da83bcebe2ca0cb4f6ae81c634d8

@PatrykWo
Collaborator

@kzawora-intel please review and approve after resolving conflicts

@Luca-Calabria
Contributor

This PR has already been merged via #821, so it can be closed.

@adobrzyn
Collaborator

adobrzyn commented Feb 4, 2026

#821 - wasn't this done here already?

@jkaniecki jkaniecki closed this Feb 4, 2026
