fixed mypy warnings for files vllm/v1/attention with TEMPORARY workaround #31465
DarkLight1337 merged 14 commits into vllm-project:main
Conversation
Code Review
This pull request is a great step towards improving the type safety and robustness of the attention modules. By enabling stricter mypy checking and fixing the resulting errors, you've eliminated several potential runtime crashes and silent correctness bugs. The changes, such as adding None checks, correcting keyword arguments, and improving type hints, are well-implemented. I've highlighted a few of the most critical fixes in my comments. Excellent work on enhancing the code quality!
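One of the patterns the review praises — adding None checks so mypy can narrow Optional attributes — can be sketched in miniature like this. The class and attribute names below are illustrative stand-ins, not vLLM's actual code:

```python
from typing import Optional

class AttnImplSketch:
    """Toy stand-in for an attention implementation with an Optional field."""

    def __init__(self, flash_attn_version: Optional[int] = None) -> None:
        # May stay None if the flash-attn extension was not detected.
        self.flash_attn_version = flash_attn_version

    def forward(self) -> int:
        # Without this assert, mypy flags the arithmetic below because
        # 'Optional[int]' is not 'int'; the assert narrows the type and
        # also turns a silent None-propagation bug into a clear error.
        assert self.flash_attn_version is not None, (
            "flash-attn version must be set before calling forward()"
        )
        return self.flash_attn_version + 1

print(AttnImplSketch(2).forward())  # prints 3
```

The same `assert x is not None` narrowing is what several of the fixes in this PR rely on.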
Hi @MrIceCreamMan, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
👋 Hi! Thank you for contributing to the vLLM project.
💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.
Just a reminder: PRs would not trigger a full CI run by default; only a limited set of checks runs automatically. You can ask your reviewers to trigger select CI tests on top of those. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
yewentao256
left a comment
Thanks for the work! Will take a look later
Here is why the CI is failing (according to Claude):
In CI:
On local machine:
The Challenge
yewentao256
left a comment
The pre-commit check fails for some reason; please take a look and fix it manually if the automatic fix doesn't work in your environment. Also, please take a look at what Gemini suggests.
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
… the time of pre-commit on CI Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
yewentao256
left a comment
Thanks for the work! Let's run CI as well
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
yewentao256
left a comment
LGTM, thanks for the work!
        We use torch's .expand() to avoid duplicating values
        """
        assert output is not None, "Output tensor must be provided."
        assert self.vllm_flash_attn_version is not None, (
Does mypy allow us to add this assertion after the line at vllm/v1/attention/backends/flash_attn.py, line 557 (commit 3212278)?
| ) | ||
| return output | ||
| else: | ||
| sliding_window_size = ( |
Is it possible for us to do this in the __init__ function to avoid redundant calls on every forward pass? For example, after vllm/v1/attention/backends/flash_attn.py, line 546 (commit 3212278).
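The suggestion above — hoisting per-forward work into `__init__` — can be sketched as follows. The class and attribute names are illustrative, not the real FlashAttentionImpl:

```python
from typing import Optional, Tuple

class FlashAttnInitSketch:
    """Toy sketch: compute the sliding-window tuple once, not per forward."""

    def __init__(self, sliding_window: Optional[Tuple[int, int]]) -> None:
        # Computed once here instead of on every forward() call;
        # (-1, -1) conventionally means "no sliding window".
        self.sliding_window_size = (
            sliding_window if sliding_window is not None else (-1, -1)
        )

    def forward(self) -> Tuple[int, int]:
        # forward() just reads the precomputed value.
        return self.sliding_window_size

print(FlashAttnInitSketch(None).forward())  # prints (-1, -1)
```

Since the value never changes after construction, moving it out of the hot path avoids a redundant branch on every forward pass.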
        We use torch's .expand() to avoid duplicating values
        """
        assert output is not None, "Output tensor must be provided."
        assert self.vllm_flash_attn_version is not None, (
These lines are not needed after you move the assertion into FlashAttentionImpl
| ) | ||
| return output | ||
| else: | ||
| sliding_window_size = ( |
These lines are not needed after you move them into FlashAttentionImpl
    k,
    v,
    softmax_scale=softmax_scale,
    return_lse=return_softmax_lse,
Please keep return_lse=return_softmax_lse; the Gemini suggestion was wrong. In the ROCm AITER backend the function is different and the argument names are different.
    k=k,
    v=v,
    softmax_scale=softmax_scale,
    return_lse=return_softmax_lse,
Please keep return_lse=return_softmax_lse; the Gemini suggestion was wrong. In the ROCm AITER backend the function is different and the argument names are different.
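The point about argument names can be seen with two toy call signatures (purely illustrative — not the real vLLM or AITER APIs): a keyword rename that is valid for one backend is a runtime TypeError for the other, which is why the blanket rename suggested by Gemini would break the ROCm path.

```python
def aiter_style_attn(q: str, *, return_lse: bool = False):
    # Mimics a backend whose keyword is named 'return_lse'.
    return (q, "lse") if return_lse else q

def fa_style_attn(q: str, *, return_softmax_lse: bool = False):
    # Mimics a backend whose keyword is named 'return_softmax_lse'.
    return (q, "lse") if return_softmax_lse else q

print(aiter_style_attn("out", return_lse=True))  # prints ('out', 'lse')
try:
    aiter_style_attn("out", return_softmax_lse=True)  # wrong keyword
except TypeError:
    print("unexpected keyword rejected")
```

Keyword-only parameters make the mismatch fail loudly at the call site instead of silently binding a positional argument.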
Signed-off-by: Naveenraj Kamalakannan <therealnaveenkamal@gmail.com>
…ound (vllm-project#31465) Signed-off-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
…ound (vllm-project#31465) Signed-off-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Purpose
Follow-up to #26448
This PR improves type checking coverage for the attention module by:
- Moving vllm/v1/attention from SEPARATE_GROUPS to FILES in the mypy configuration

This is part of the effort to enable comprehensive type checking across the vLLM codebase.
Implementation Notes
Dynamic Module Type Checking Workaround
This PR includes # type: ignore[attr-defined] annotations for imports from the vllm.vllm_flash_attn module. This is necessary because vllm_flash_attn is a dynamically built C extension module that behaves differently in CI and local environments:
Local environment: The module is compiled during build/installation, making all attributes available to mypy.
CI environment: Pre-commit hooks run on a fresh checkout before the build step, leaving vllm_flash_attn as an empty stub, causing mypy to report missing attribute errors.
Future improvement: The ideal solution would be to update the CI workflow to build extensions before running type checking:
This would eliminate the need for type ignore comments while maintaining comprehensive type checking coverage.
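The per-import suppression described above uses mypy's targeted error codes. A minimal runnable illustration of the mechanism follows; the module and attribute here are stand-ins for vllm.vllm_flash_attn, which cannot be imported outside a built vLLM tree:

```python
import types

# Simulate a dynamically built extension: to a static checker this module
# looks empty, but the attribute exists at runtime after the build step.
dyn_mod = types.ModuleType("fake_flash_attn")
dyn_mod.flash_attn_varlen_func = lambda: "ok"  # added at runtime only

# On a real import line the same suppression would read:
#   from vllm.vllm_flash_attn import flash_attn_varlen_func  # type: ignore[attr-defined]
# The bracketed code limits the ignore to attr-defined errors, so other
# mypy checks on the same line still apply.
result = dyn_mod.flash_attn_varlen_func()  # type: ignore[attr-defined]
print(result)  # prints ok
```

Scoping the ignore to a single error code keeps the temporary workaround narrow until the CI build-before-typecheck fix lands.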
Test Plan
pytest tests/v1/attention/ -v -k "not (mla and (decode or mixed or spec_decode)) and not sparse_mla"
Test Results
Type checking:
Unit tests:
Pre-commit hooks: All checks passed ✓
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.