
[5/N][Attention] Finish eliminating vllm/attention folder #32064

Merged
ProExpertProg merged 26 commits into vllm-project:main from MatthewBonanni:attention_restructure_5
Jan 27, 2026

Conversation

@MatthewBonanni
Collaborator

@MatthewBonanni MatthewBonanni commented Jan 9, 2026

Merge #32060 before this.

Purpose

Step 5 of #31919: This PR finishes eliminating the vllm/attention folder by doing the following:

  • Split vllm/attention/layer.py into vllm/model_executor/layers/attention/mla_attention.py (MLAAttention, unified_mla_attention) and vllm/model_executor/layers/attention/attention.py (Attention, unified_attention)
  • Move vllm/attention/utils/kv_sharing_utils.py content into vllm/model_executor/layers/attention/attention.py
  • Move vllm/attention/utils/kv_transfer_utils.py to vllm/model_executor/layers/attention/kv_transfer_utils.py
  • Eliminate vllm/attention folder
  • Add imports to vllm/model_executor/layers/attention/__init__.py to enable module-level imports
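The last bullet refers to the common package-level re-export pattern: `__init__.py` imports the public classes from its submodules so callers can import them from the package root. The sketch below demonstrates the pattern with a throwaway package and made-up file contents; it is not the actual vLLM `__init__.py`, only an illustration of the mechanism.

```python
import os
import sys
import tempfile
import textwrap

# Build a tiny throwaway package whose layout mirrors (with stub classes)
# the vllm/model_executor/layers/attention package described above.
pkg_root = tempfile.mkdtemp()
pkg = os.path.join(pkg_root, "attention_pkg")
os.makedirs(pkg)

with open(os.path.join(pkg, "attention.py"), "w") as f:
    f.write("class Attention:\n    pass\n")
with open(os.path.join(pkg, "mla_attention.py"), "w") as f:
    f.write("class MLAAttention:\n    pass\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write(textwrap.dedent("""\
        # Re-export submodule classes at the package level so callers
        # can import them from the package root.
        from .attention import Attention
        from .mla_attention import MLAAttention

        __all__ = ["Attention", "MLAAttention"]
    """))

sys.path.insert(0, pkg_root)
# Thanks to the re-exports, both classes are importable from the root:
from attention_pkg import Attention, MLAAttention

print(Attention.__name__, MLAAttention.__name__)  # -> Attention MLAAttention
```

Without the `__init__.py` re-exports, callers would have to spell out the submodule (`from attention_pkg.attention import Attention`), which is exactly the verbosity the new `__init__.py` avoids.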

Test Plan

CI (should run all tests)

Test Result



Note

Completes migration away from vllm/attention to vllm/model_executor.

  • Splits vllm/attention/layer.py into .../attention/attention.py (Attention, unified_attention) and .../attention/mla_attention.py (MLAAttention, unified_mla_attention); moves MLA custom-ops
  • Inlines validate_kv_sharing_target and moves kv_transfer_utils into .../layers/attention; deletes vllm/attention and vllm/attention/utils/kv_sharing_utils.py
  • Mass import path updates across models, quantization, compilers, backends, workers, tests, and docs; minor typing tweaks (TYPE_CHECKING, annotations)
  • CI/config updates: Buildkite test dependencies reference new paths; CODEOWNERS updated for .../layers/attention; mypy config stops listing the removed package

Written by Cursor Bugbot for commit 8b56809.

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request completes the refactoring to eliminate the vllm/attention directory. The changes mostly involve moving files and splitting vllm/attention/layer.py. While the file moves are correct, a critical import path was missed during the refactoring, which will cause an ImportError. I've provided a fix for this.

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@MatthewBonanni MatthewBonanni force-pushed the attention_restructure_5 branch from a873e8f to 1cb4ce3 on January 9, 2026 at 23:12
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@mergify

mergify bot commented Jan 12, 2026

Documentation preview: https://vllm--32064.org.readthedocs.build/en/32064/

@mergify mergify bot added documentation Improvements or additions to documentation deepseek Related to DeepSeek models llama Related to Llama models qwen Related to Qwen models gpt-oss Related to GPT-OSS models rocm Related to AMD ROCm labels Jan 12, 2026
@mergify mergify bot added the v1 label Jan 12, 2026
@mergify mergify bot added the kv-connector label Jan 12, 2026
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@mergify

mergify bot commented Jan 15, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @MatthewBonanni.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 15, 2026
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@mergify mergify bot removed the needs-rebase label Jan 15, 2026
@mergify

mergify bot commented Jan 19, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @MatthewBonanni.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 19, 2026
@MatthewBonanni
Collaborator Author

Holding off to let #25954 land first

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@mergify mergify bot removed the needs-rebase label Jan 26, 2026
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@ProExpertProg ProExpertProg merged commit a608b4c into vllm-project:main Jan 27, 2026
150 checks passed
@github-project-automation github-project-automation bot moved this from Ready to Done in NVIDIA Jan 27, 2026
@MatthewBonanni MatthewBonanni deleted the attention_restructure_5 branch January 27, 2026 15:10
VedantMadane pushed a commit to VedantMadane/vllm that referenced this pull request Jan 28, 2026
…ject#32064)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Vedant Madane <6527493+VedantMadane@users.noreply.github.com>
vkuzo added a commit to vkuzo/vllm that referenced this pull request Jan 30, 2026
Summary:

vllm-project#32133 missed a rebase on vllm-project#32064; this change fixes the attention import path.

Test Plan:

```bash
# before this PR, the test runner failed because the old attention
# import path no longer exists
pytest tests/quantization/test_fp8.py -s -x
```
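Downstream code hit by this kind of path removal can bridge old and new vLLM versions with an import fallback. The sketch below is a hedged illustration, not code from this PR: the two import paths come from the PR description, and the local stub class exists only so the sketch runs without vLLM installed.

```python
# Prefer the new module location, fall back to the pre-#32064 path for
# older vLLM versions. In this standalone sketch neither package is
# importable, so a local stub keeps the example runnable.
try:
    from vllm.model_executor.layers.attention import Attention  # new path
except ImportError:
    try:
        from vllm.attention import Attention  # path removed by this PR
    except ImportError:
        class Attention:  # stub: stands in when vLLM is not installed
            """Local placeholder so the sketch runs anywhere."""

print(Attention.__name__)  # -> Attention
```

Whichever branch succeeds, callers see a single `Attention` name, which is the point of the pattern.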


Signed-off-by: vasiliy <vasiliy@fb.com>
@vkuzo vkuzo mentioned this pull request Jan 30, 2026
apd10 pushed a commit to apd10/vllm that referenced this pull request Jan 31, 2026
…ject#32064)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>