fixed mypy warnings for files vllm/v1/attention with TEMPORARY workaround #31465
DarkLight1337 merged 14 commits into vllm-project:main
Conversation
Code Review
This pull request is a great step towards improving the type safety and robustness of the attention modules. By enabling stricter mypy checking and fixing the resulting errors, you've eliminated several potential runtime crashes and silent correctness bugs. The changes, such as adding None checks, correcting keyword arguments, and improving type hints, are well-implemented. I've highlighted a few of the most critical fixes in my comments. Excellent work on enhancing the code quality!
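One of the patterns the review praises — adding None checks so mypy can narrow Optional attributes — can be sketched in miniature like this. The class and attribute names below are illustrative stand-ins, not vLLM's actual code:

```python
from typing import Optional

class AttnImplSketch:
    """Toy stand-in for an attention implementation with an Optional field."""

    def __init__(self, flash_attn_version: Optional[int] = None) -> None:
        # May stay None if the flash-attn extension was not detected.
        self.flash_attn_version = flash_attn_version

    def forward(self) -> int:
        # Without this assert, mypy flags the arithmetic below because
        # 'Optional[int]' is not 'int'; the assert narrows the type and
        # also turns a silent None-propagation bug into a clear error.
        assert self.flash_attn_version is not None, (
            "flash-attn version must be set before calling forward()"
        )
        return self.flash_attn_version + 1

print(AttnImplSketch(2).forward())  # prints 3
```

The same `assert x is not None` narrowing is what several of the fixes in this PR rely on.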
Hi @MrIceCreamMan, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
👋 Hi! Thank you for contributing to the vLLM project.
💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.
Just a reminder: PRs would not trigger a full CI run by default; only a limited set of checks runs automatically. You can ask your reviewers to trigger select CI tests on top of those. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
yewentao256
left a comment
Thanks for the work! Will take a look later
Here is why the CI is failing (according to Claude):
In CI:
On local machine:
The Challenge
yewentao256
left a comment
The pre-commit check fails for some reason; please take a look and fix it manually if the automatic fix doesn't work in your environment. Also, please take a look at what Gemini suggests.
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
… the time of pre-commit on CI Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
yewentao256
left a comment
Thanks for the work! Let's run CI as well
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
yewentao256
left a comment
LGTM, thanks for the work!
        We use torch's .expand() to avoid duplicating values
        """
        assert output is not None, "Output tensor must be provided."
        assert self.vllm_flash_attn_version is not None, (
Does mypy allow us to add this assertion after the line at vllm/v1/attention/backends/flash_attn.py, line 557 (commit 3212278)?
| ) | ||
| return output | ||
| else: | ||
| sliding_window_size = ( |
Is it possible for us to do this in the __init__ function to avoid redundant calls on every forward pass? For example, after vllm/v1/attention/backends/flash_attn.py, line 546 (commit 3212278).
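The suggestion above — hoisting per-forward work into `__init__` — can be sketched as follows. The class and attribute names are illustrative, not the real FlashAttentionImpl:

```python
from typing import Optional, Tuple

class FlashAttnInitSketch:
    """Toy sketch: compute the sliding-window tuple once, not per forward."""

    def __init__(self, sliding_window: Optional[Tuple[int, int]]) -> None:
        # Computed once here instead of on every forward() call;
        # (-1, -1) conventionally means "no sliding window".
        self.sliding_window_size = (
            sliding_window if sliding_window is not None else (-1, -1)
        )

    def forward(self) -> Tuple[int, int]:
        # forward() just reads the precomputed value.
        return self.sliding_window_size

print(FlashAttnInitSketch(None).forward())  # prints (-1, -1)
```

Since the value never changes after construction, moving it out of the hot path avoids a redundant branch on every forward pass.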
        We use torch's .expand() to avoid duplicating values
        """
        assert output is not None, "Output tensor must be provided."
        assert self.vllm_flash_attn_version is not None, (
These lines are not needed after you move the assertion into FlashAttentionImpl
| ) | ||
| return output | ||
| else: | ||
| sliding_window_size = ( |
These lines are not needed after you move them into FlashAttentionImpl
    k,
    v,
    softmax_scale=softmax_scale,
    return_lse=return_softmax_lse,
Please keep return_lse=return_softmax_lse; the Gemini suggestion was wrong. In the ROCm AITER backend the function is different and the argument names are different.
    k=k,
    v=v,
    softmax_scale=softmax_scale,
    return_lse=return_softmax_lse,
Please keep return_lse=return_softmax_lse; the Gemini suggestion was wrong. In the ROCm AITER backend the function is different and the argument names are different.
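The point about argument names can be seen with two toy call signatures (purely illustrative — not the real vLLM or AITER APIs): a keyword rename that is valid for one backend is a runtime TypeError for the other, which is why the blanket rename suggested by Gemini would break the ROCm path.

```python
def aiter_style_attn(q: str, *, return_lse: bool = False):
    # Mimics a backend whose keyword is named 'return_lse'.
    return (q, "lse") if return_lse else q

def fa_style_attn(q: str, *, return_softmax_lse: bool = False):
    # Mimics a backend whose keyword is named 'return_softmax_lse'.
    return (q, "lse") if return_softmax_lse else q

print(aiter_style_attn("out", return_lse=True))  # prints ('out', 'lse')
try:
    aiter_style_attn("out", return_softmax_lse=True)  # wrong keyword
except TypeError:
    print("unexpected keyword rejected")
```

Keyword-only parameters make the mismatch fail loudly at the call site instead of silently binding a positional argument.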
Signed-off-by: Naveenraj Kamalakannan <therealnaveenkamal@gmail.com>
…ound (vllm-project#31465) Signed-off-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
…ound (vllm-project#31465) Signed-off-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Purpose
Follow-up to #26448
This PR improves type checking coverage for the attention module by:
- Moving vllm/v1/attention from SEPARATE_GROUPS to FILES in the mypy configuration

This is part of the effort to enable comprehensive type checking across the vLLM codebase.
Implementation Notes
Dynamic Module Type Checking Workaround
This PR includes # type: ignore[attr-defined] annotations for imports from the vllm.vllm_flash_attn module. This is necessary because vllm_flash_attn is a dynamically built C extension module that behaves differently in CI and local environments:
Local environment: The module is compiled during build/installation, making all attributes available to mypy.
CI environment: Pre-commit hooks run on a fresh checkout before the build step, leaving vllm_flash_attn as an empty stub, causing mypy to report missing attribute errors.
Future improvement: The ideal solution would be to update the CI workflow to build extensions before running type checking:
This would eliminate the need for type ignore comments while maintaining comprehensive type checking coverage.
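The per-import suppression described above uses mypy's targeted error codes. A minimal runnable illustration of the mechanism follows; the module and attribute here are stand-ins for vllm.vllm_flash_attn, which cannot be imported outside a built vLLM tree:

```python
import types

# Simulate a dynamically built extension: to a static checker this module
# looks empty, but the attribute exists at runtime after the build step.
dyn_mod = types.ModuleType("fake_flash_attn")
dyn_mod.flash_attn_varlen_func = lambda: "ok"  # added at runtime only

# On a real import line the same suppression would read:
#   from vllm.vllm_flash_attn import flash_attn_varlen_func  # type: ignore[attr-defined]
# The bracketed code limits the ignore to attr-defined errors, so other
# mypy checks on the same line still apply.
result = dyn_mod.flash_attn_varlen_func()  # type: ignore[attr-defined]
print(result)  # prints ok
```

Scoping the ignore to a single error code keeps the temporary workaround narrow until the CI build-before-typecheck fix lands.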
Test Plan
pytest tests/v1/attention/ -v -k "not (mla and (decode or mixed or spec_decode)) and not sparse_mla"
Test Results
Type checking:
Unit tests:
Pre-commit hooks: All checks passed ✓
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.