[Misc][BE] Type coverage for vllm/compilation [3/3] #31748

Merged
zou3519 merged 3 commits into vllm-project:main from Lucaskabela:lucaskabela/compilation_type_coverage_3
Jan 12, 2026

@Lucaskabela Lucaskabela commented Jan 5, 2026

Purpose

We want to provide better type hint coverage in vllm/compilation to improve maintainability and readability and to reduce silent errors.

This PR should be applied on top of #31744

Test Plan

mypy vllm/compilation

Test Result

Success: no issues found in 28 source files

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Note

Improves type coverage and API clarity across compilation passes with minimal logic adjustments.

  • Add explicit return/param types, ParamSpec, and helper get_inputs methods across fusion passes (activation, RMSNorm, attention, collective, sequence parallelism, ROCm/AIter)
  • Tighten function signatures (__init__, register, __call__, uuid, helpers like empty_*) and use precise tuple/list return types in pattern/replacement fns
  • Safer device capability handling and minor no-op cleanup hooks; unify tracing wrappers (wrap_trace_fn), reshape conversions, and first-return-only helpers with typing
  • Update decorators to import SourceInfo, mark dynamic/unbacked dims conditionally, and patch configs with typed contexts
  • Minor typing fixes in distributed parallel_state, rotary embedding __all__ and cache key types

mypy: Success on 28 files (no issues).

Written by Cursor Bugbot for commit 06e6343b18cf59113c80149a18f93b7ceeb988ac. This will update automatically on new commits. Configure here.


Note

Improves type coverage and API clarity across vllm/compilation with minimal logic changes.

  • Add explicit return/param types (incl. ParamSpec) and tighten signatures for __init__, register, __call__, uuid, and helper fns; pattern/replacement fns now return precise tuples
  • Introduce typed get_inputs() helpers for patterns and unify tracing via wrap_trace_fn, view→reshape conversions, and no-op permute cleanup
  • Safer FlashInfer handling (device capability may be None), workspace setup, and one-shot size checks; minor no-op cleanup hook in sequence parallelism
  • Small typing fixes in distributed/parallel_state and rotary_embedding (__all__, cache key types)

Written by Cursor Bugbot for commit 78350f9. This will update automatically on new commits. Configure here.

@mergify mergify bot added the nvidia label Jan 5, 2026
@Lucaskabela Lucaskabela force-pushed the lucaskabela/compilation_type_coverage_3 branch from 08aa1b1 to c102a4b Compare January 5, 2026 21:59
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request focuses on improving type hint coverage across the vllm/compilation module, which is a valuable contribution to code quality and maintainability. The changes are extensive and well-executed, adding type hints to many functions and methods. I've identified one critical issue in vllm/compilation/collective_fusion.py where a replacement function in a pattern matcher returns a list of tensors instead of a single tensor, which will likely lead to a runtime error. Please address this issue.

@Lucaskabela Lucaskabela changed the title [BE] Type coverage for vllm/compilation [3/3] [Misc][BE] Type coverage for vllm/compilation [3/3] Jan 5, 2026
mergify bot commented Jan 8, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Lucaskabela.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 8, 2026
@Lucaskabela Lucaskabela force-pushed the lucaskabela/compilation_type_coverage_3 branch from c102a4b to 06e6343 Compare January 10, 2026 00:57
@mergify mergify bot removed the needs-rebase label Jan 10, 2026
@Lucaskabela Lucaskabela marked this pull request as ready for review January 10, 2026 00:59
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
@Lucaskabela Lucaskabela force-pushed the lucaskabela/compilation_type_coverage_3 branch from 06e6343 to 78350f9 Compare January 12, 2026 16:01
@github-project-automation github-project-automation bot moved this to Ready in NVIDIA Jan 12, 2026
@zou3519 zou3519 added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 12, 2026
@zou3519 zou3519 enabled auto-merge (squash) January 12, 2026 17:41
zou3519 commented Jan 12, 2026

If someone asks: this PR does refactor get_inputs to be a method on some of these classes, but doesn't require it in the base class.

Comment on lines 33 to 46
 if find_spec("flashinfer"):
     try:
         import flashinfer.comm as flashinfer_comm

-        flashinfer_comm = (
+        flashinfer_comm: ModuleType | None = (  # type: ignore[no-redef]
             flashinfer_comm
             if hasattr(flashinfer_comm, "trtllm_allreduce_fusion")
             else None
         )
     except ImportError:
-        flashinfer_comm = None
+        flashinfer_comm = None  # type: ignore[assignment]
 else:
-    flashinfer_comm = None
+    flashinfer_comm = None  # type: ignore[assignment]

Collaborator

Can we just set flashinfer_comm=None at the start and avoid the type: ignores?

Contributor Author

If we set flashinfer_comm = None above this logic, we still need to add ignores for redef on L35 and L37, so I'm not sure it would make the code much cleaner.

Collaborator

I thought that would just be a reassignment, not a definition? Also, I see the import has the same name as the variable; let's rename the import to _flashinfer_comm?
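
The suggestion in this thread can be sketched roughly as follows. This is an illustration of the reviewer's idea, not the code that was merged: the name is declared once with its optional type, and the import uses the private alias `_flashinfer_comm` so that it does not redefine the annotated variable. Whether mypy accepts it without any `type: ignore` comments is an assumption that has not been checked against vLLM's mypy configuration.

```python
from importlib.util import find_spec
from types import ModuleType
from typing import Optional

# Declare the name once, up front, with its full optional type.
# Optional[ModuleType] is equivalent to ModuleType | None.
flashinfer_comm: Optional[ModuleType] = None

if find_spec("flashinfer"):
    try:
        # Import under a private alias so the import statement does not
        # redefine the annotated module-level name above.
        import flashinfer.comm as _flashinfer_comm

        # Only expose the module if it provides the fused all-reduce entry point.
        if hasattr(_flashinfer_comm, "trtllm_allreduce_fusion"):
            flashinfer_comm = _flashinfer_comm
    except ImportError:
        # Leave flashinfer_comm as None when the submodule cannot be imported.
        pass
```

Callers can then guard on `flashinfer_comm is not None` before using it, as they would with the original pattern.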

@zou3519 zou3519 merged commit ad8818b into vllm-project:main Jan 12, 2026
61 checks passed
@github-project-automation github-project-automation bot moved this from Ready to Done in NVIDIA Jan 12, 2026
TomerBN-Nvidia pushed a commit to TomerBN-Nvidia/vllm that referenced this pull request Jan 13, 2026
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Signed-off-by: Tomer Natan <tbarnatan@computelab-frontend-8.nvidia.com>
sammysun0711 pushed a commit to sammysun0711/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
-_ROPE_DICT: dict[tuple, RotaryEmbedding] = {}
+_ROPE_DICT: dict[tuple[Any, ...], RotaryEmbedding] = {}

__all__ = ["RotaryEmbedding"]
Member

Why do we need this?

@Lucaskabela Lucaskabela deleted the lucaskabela/compilation_type_coverage_3 branch February 19, 2026 16:40
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Labels

nvidia ready ONLY add when PR is ready to merge/full CI is needed

Projects

Status: Done

4 participants