fixed mypy warnings for files vllm/v1/attention with TEMPORARY workaround #31465

Merged
DarkLight1337 merged 14 commits into vllm-project:main from MrIceCreamMan:main on Jan 7, 2026

Conversation

@MrIceCreamMan
Contributor

@MrIceCreamMan MrIceCreamMan commented Dec 29, 2025

Purpose

Follow-up to #26448

This PR improves type checking coverage for the attention module by:

  • Moving vllm/v1/attention from SEPARATE_GROUPS to FILES in mypy configuration
  • Fixing type errors in the attention module to enable strict type checking
  • This ensures better code quality and catches type-related bugs earlier

This is part of the effort to enable comprehensive type checking across the vLLM codebase.
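
As a rough illustration of the configuration change listed above, the sketch below models the two groups as plain Python lists and moves one entry between them. The list contents and the promote helper are hypothetical; the real lists in vLLM's mypy tooling hold many more entries and may be structured differently.

```python
# Illustrative stand-ins; not the real contents of vLLM's mypy configuration.
FILES = ["vllm/*.py"]
SEPARATE_GROUPS = ["tests", "vllm/v1/attention"]


def promote(path: str, files: list[str], separate_groups: list[str]) -> None:
    """Move `path` from the separately checked groups into the main file list."""
    separate_groups.remove(path)  # raises ValueError if the path is not listed
    files.append(path)


promote("vllm/v1/attention", FILES, SEPARATE_GROUPS)
print(FILES)
print(SEPARATE_GROUPS)
```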

Implementation Notes

Dynamic Module Type Checking Workaround

This PR includes # type: ignore[attr-defined] annotations for imports from the vllm.vllm_flash_attn module. This is necessary because vllm_flash_attn is a dynamically built C extension module that behaves differently in CI and local environments:

Local environment: The module is compiled during build/installation, making all attributes available to mypy.

CI environment: Pre-commit hooks run on a fresh checkout before the build step, leaving vllm_flash_attn as an empty stub, causing mypy to report missing attribute errors.

Future improvement: The ideal solution would be to update the CI workflow to build extensions before running type checking:

- name: Build vLLM extensions
  run: python setup.py build_ext --inplace

- name: Run pre-commit
  run: pre-commit run --all-files

This would eliminate the need for type ignore comments while maintaining comprehensive type checking coverage.
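
The import pattern described in this section might look roughly like the sketch below. The imported name flash_attn_varlen_func is used only as an example, and the ImportError fallback is an addition so the snippet runs without vLLM installed; neither is taken from this PR's diff.

```python
# A sketch, not the PR's actual diff. The error code in brackets limits the
# suppression to "module has no attribute ..." and leaves other errors visible.
try:
    from vllm.vllm_flash_attn import flash_attn_varlen_func  # type: ignore[attr-defined]
except ImportError:
    # Fallback added for illustration so the snippet runs without vLLM.
    flash_attn_varlen_func = None

print("flash attention available:", flash_attn_varlen_func is not None)
```

Scoping the comment to attr-defined, rather than using a bare type: ignore, keeps unrelated errors on the same line visible.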

Test Plan

  1. Type checking:
     pre-commit run --hook-stage manual mypy-3.10 -a
  2. Unit tests for attention module:
     pytest tests/v1/attention/ -v -k "not (mla and (decode or mixed or spec_decode)) and not sparse_mla"
  3. Pre-commit checks:
     rm -rf .mypy_cache/
     pre-commit clean
     pre-commit run --all-files --hook-stage manual

Test Results

Type checking:

Run mypy for Python 3.10.................................................Passed

Unit tests:

======================= 109 passed, 22 skipped, 88 deselected, 493 warnings in 114.47s (0:01:54) =======================

Pre-commit hooks: All checks passed ✓


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the [Google Doc](https://docs.google.com/document/d/1YyVqrgX4gHTtrstbq8oWUImOyPCKSGnJ7xtTpmXzlRs/edit?tab=t.0).

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a great step towards improving the type safety and robustness of the attention modules. By enabling stricter mypy checking and fixing the resulting errors, you've eliminated several potential runtime crashes and silent correctness bugs. The changes, such as adding None checks, correcting keyword arguments, and improving type hints, are well-implemented. I've highlighted a few of the most critical fixes in my comments. Excellent work on enhancing the code quality!

@mergify

mergify bot commented Dec 29, 2025

Hi @MrIceCreamMan, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Member

@yewentao256 yewentao256 left a comment

Thanks for the work! Will take a look later

@yewentao256 yewentao256 self-assigned this Dec 29, 2025
@MrIceCreamMan
Contributor Author

Here is why the CI is failing (according to Claude):

In CI:

  1. vllm_flash_attn/*.py files are NOT in git (they're build artifacts)
  2. When pre-commit runs, these files don't exist yet
  3. mypy tries to analyze them but can't find the exports

On local machine:

  1. I ran uv pip install -e . to install vllm, which built and copied the vllm_flash_attn/*.py files
  2. So mypy can find them

The Challenge

  1. I could just force-add the files with git add -f vllm/vllm_flash_attn/flash_attn_interface.py and git add -f vllm/vllm_flash_attn/__init__.py
  2. But they are ignored in .gitignore for good reasons:
     # vllm-flash-attn built from source
     vllm/vllm_flash_attn/*
  3. So a proper way of fixing this would be to run mypy after vllm-flash-attn has been built
  4. Some guidance please. Thank you very much.
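
One way to make the difference between the two environments visible is to check for the generated files before invoking mypy. The sketch below does that with pathlib; the two paths come from the comment above, while the missing_artifacts helper and the idea of running it as a pre-check are assumptions, not part of this PR.

```python
from pathlib import Path

# Build artifacts named in the comment above; they are git-ignored, so a fresh
# checkout (as in CI pre-commit) will not contain them.
GENERATED_FILES = (
    "vllm/vllm_flash_attn/__init__.py",
    "vllm/vllm_flash_attn/flash_attn_interface.py",
)


def missing_artifacts(repo_root: Path) -> list[str]:
    """Return the generated files that do not exist under `repo_root`."""
    return [rel for rel in GENERATED_FILES if not (repo_root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts(Path.cwd())
    if missing:
        print("Not built yet; mypy cannot see:", ", ".join(missing))
    else:
        print("Generated files present; mypy can resolve the imports.")
```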

Member

@yewentao256 yewentao256 left a comment

The pre-commit check is failing for some reason. Please take a look and fix it manually if the automatic fix doesn't work in your environment. Also, please take a look at what Gemini suggests.

@MrIceCreamMan MrIceCreamMan changed the title from "fixed mypy warnings for files under vllm/v1/attention" to "fixed mypy warnings for files vllm/v1/attention with BAD PRACTICE" on Jan 1, 2026
@mergify

mergify bot commented Jan 1, 2026

Hi @MrIceCreamMan, the pre-commit checks have failed.

@MrIceCreamMan MrIceCreamMan changed the title from "fixed mypy warnings for files vllm/v1/attention with BAD PRACTICE" to "fixed mypy warnings for files vllm/v1/attention with TEMPORARY workaround" on Jan 1, 2026
Zhuohao Yang added 3 commits January 1, 2026 13:59
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
… the time of pre-commit on CI

Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
@mergify

mergify bot commented Jan 3, 2026

Hi @MrIceCreamMan, the pre-commit checks have failed.

Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Member

@yewentao256 yewentao256 left a comment

Thanks for the work! Let's run CI as well

@yewentao256 yewentao256 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jan 4, 2026
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Member

@yewentao256 yewentao256 left a comment

LGTM, thanks for the work!

@github-project-automation github-project-automation bot moved this to Ready in NVIDIA Jan 4, 2026
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) January 5, 2026 20:14
We use torch's .expand() to avoid duplicating values
"""
assert output is not None, "Output tensor must be provided."
assert self.vllm_flash_attn_version is not None, (
Collaborator

@tjtanaa tjtanaa Jan 6, 2026

Does mypy allow us to add this assertion right after the line

self.vllm_flash_attn_version = get_flash_attn_version()

since this backend requires vllm_flash_attn to be installed? That way, we don't need to run it on every forward pass.
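
On the mypy question: narrowing from an assert applies only within the function where it runs, so an assert in __init__ does not by itself make an Optional attribute non-Optional inside forward. A common way to check once and still satisfy mypy is to assert on a local variable and store it in an attribute annotated with the non-Optional type. The sketch below uses a hypothetical probe_version function and ExampleImpl class as stand-ins; it is not the code from this PR.

```python
from typing import Optional


def probe_version() -> Optional[int]:
    """Hypothetical stand-in for a lookup that may return None."""
    return 3


class ExampleImpl:
    def __init__(self) -> None:
        version = probe_version()
        # Checked once at construction time instead of on every forward pass.
        assert version is not None, "A supported version must be available."
        # Annotated as plain int, so later methods need no further narrowing.
        self.version: int = version

    def forward(self) -> int:
        # mypy sees `int` here; no per-call assert is required.
        return self.version + 1


print(ExampleImpl().forward())
```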

)
return output
else:
sliding_window_size = (
Collaborator

@tjtanaa tjtanaa Jan 6, 2026

Is it possible for us to do this in the __init__ function, to avoid redundant calls on every forward pass? For example, after this line:

self.sliding_window = (sliding_window - 1, 0)

We use torch's .expand() to avoid duplicating values
"""
assert output is not None, "Output tensor must be provided."
assert self.vllm_flash_attn_version is not None, (
Collaborator

These lines are not needed after you move the assertion into FlashAttentionImpl

)
return output
else:
sliding_window_size = (
Collaborator

These lines are not needed after you move them into FlashAttentionImpl

k,
v,
softmax_scale=softmax_scale,
return_lse=return_softmax_lse,
Contributor

Please keep return_lse=return_softmax_lse; the Gemini suggestion was wrong. In the ROCm AITER backend the function is different, and its argument names are different.
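
To illustrate why a keyword that is correct for one backend can be wrong for another, the sketch below defines two hypothetical functions whose only difference is the name of the flag that requests the log-sum-exp. Both functions and their signatures are invented stand-ins, not the real upstream or AITER APIs.

```python
def attn_style_a(q, k, v, *, softmax_scale, return_softmax_lse=False):
    """Hypothetical backend that names the flag `return_softmax_lse`."""
    return ("out", "lse") if return_softmax_lse else "out"


def attn_style_b(q, k, v, *, softmax_scale, return_lse=False):
    """Hypothetical backend that names the same flag `return_lse`."""
    return ("out", "lse") if return_lse else "out"


# The keyword must match the function actually being called.
print(attn_style_b(0, 0, 0, softmax_scale=1.0, return_lse=True))

try:
    # Using the other backend's keyword fails at call time.
    attn_style_b(0, 0, 0, softmax_scale=1.0, return_softmax_lse=True)
except TypeError as exc:
    print("TypeError:", exc)
```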

k=k,
v=v,
softmax_scale=softmax_scale,
return_lse=return_softmax_lse,
Contributor

Please keep return_lse=return_softmax_lse; the Gemini suggestion was wrong. In the ROCm AITER backend the function is different, and its argument names are different.

@DarkLight1337 DarkLight1337 merged commit 0a2c2dc into vllm-project:main Jan 7, 2026
55 checks passed
@github-project-automation github-project-automation bot moved this from Ready to Done in NVIDIA Jan 7, 2026
Labels

nvidia, ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm), v1

Projects

Status: Done

5 participants