nvmbreughe (Contributor) commented Nov 3, 2025

📌 Description

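Updated the backend_requirement decorator to handle a backend argument that is left unspecified and therefore falls back to its default value. Previously, calling mm_fp4 without specifying backend caused issues.
Also added SM 110 as a supported compute capability for the cutlass backend of mm_fp4.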
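
The failure mode can be illustrated with a minimal sketch. The names below (`naive_backend_check`, `toy_mm`) and the default value are hypothetical stand-ins, not FlashInfer code; the sketch only shows that a default declared in a function signature never appears in a wrapper's `kwargs` unless the caller passes it explicitly.

```python
import functools


def naive_backend_check(func):
    """Hypothetical wrapper that only inspects explicitly passed kwargs."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # A default declared in func's signature is NOT visible here,
        # so this lookup yields None when the caller omits backend.
        wrapper.seen_backend = kwargs.get("backend")
        return func(*args, **kwargs)

    wrapper.seen_backend = None
    return wrapper


@naive_backend_check
def toy_mm(a, b, backend="cutlass"):  # illustrative default value
    return f"{a}*{b} via {backend}"


result = toy_mm(2, 3)  # the function itself still receives its default
```

Here the wrapped function runs with its default backend, while the wrapper's own view of `backend` is `None`, which is the kind of mismatch a validation step can trip over.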

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • New Features

    • FP4 Cutlass GEMM now supports the SM110 GPU compute capability.
  • Bug Fixes

    • Kernels called without an explicit backend now consistently use the default backend.
  • Tests

    • Added a unit test to verify default backend selection and correct results when backend is omitted.

coderabbitai bot (Contributor) commented Nov 3, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Allowed SM110 for the FP4 Cutlass GEMM requirement; refactored backend_requirement in flashinfer/utils.py to use inspect.signature, bind call-time args to kwargs with defaults, introduce _is_problem_size_supported, and change how backend/tensor are resolved; added a unit test for default-backend behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| FP4 Cutlass GEMM capability (flashinfer/gemm.py) | Extended the allowed SM capability list for the FP4 Cutlass GEMM requirement to include SM110. |
| Decorator refactor: argument binding & support checks (flashinfer/utils.py) | Added import inspect. In backend_requirement: capture inspect.signature(func) as sig; bind call-time args/kwargs to the signature and apply defaults (collapsing positional args into kwargs for checks); added helper _is_problem_size_supported(**kwargs) that expects a bound backend in kwargs and reuses existing per-backend checks; locate tensor from the bound kwargs only; perform support validation using _is_problem_size_supported(**kwargs_with_defaults); preserve the public call signature by invoking the original function with the original *args, **kwargs. Comments added explaining the approach. |
| Tests: backend default behavior (tests/utils/test_decorators.py) | Added test_backend_default_parameter(), which verifies that the decorator uses the module/default backend when backend is omitted and that explicitly passing backend overrides it; the test determines device compute capability at runtime. |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller
    participant Decorator as backend_requirement(wrapper)
    participant Binder as signature.bind + apply_defaults
    participant Support as _is_problem_size_supported
    participant Target as original function

    Caller->>Decorator: call decorated_func(...args, **kwargs)
    Decorator->>Binder: bind args/kwargs to signature (apply defaults)
    Binder-->>Decorator: bound_kwargs (positional args collapsed)
    Decorator->>Support: _is_problem_size_supported(**bound_kwargs)
    alt supported
        Decorator->>Target: call original func with original *args, **kwargs
        Target-->>Caller: return result
    else not supported
        Decorator-->>Caller: raise error (unsupported config)
    end
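
The flow in the diagram can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code in flashinfer/utils.py: `backend_requirement_sketch`, `toy_gemm`, the shape of `backend_checks`, and the use of `ValueError` are all hypothetical. Only `inspect.signature`, `Signature.bind`, and `BoundArguments.apply_defaults` are real standard-library calls.

```python
import functools
import inspect


def backend_requirement_sketch(backend_checks):
    """Hypothetical stand-in for the real decorator.

    backend_checks maps a backend name to a predicate that receives the
    fully bound keyword arguments and returns True if the call is supported.
    """

    def decorator(func):
        sig = inspect.signature(func)  # captured once, at decoration time

        def is_problem_size_supported(**bound):
            # Expects "backend" to be present after binding and defaults.
            check = backend_checks.get(bound.get("backend"))
            return check is not None and check(**bound)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()  # fills in backend when it is omitted
            if not is_problem_size_supported(**bound.arguments):
                raise ValueError(
                    f"unsupported configuration: {dict(bound.arguments)}"
                )
            # Call through with the caller's original arguments.
            return func(*args, **kwargs)

        return wrapper

    return decorator


@backend_requirement_sketch({
    "cutlass": lambda **kw: kw["n"] % 2 == 0,  # illustrative constraint
    "cudnn": lambda **kw: True,
})
def toy_gemm(n, backend="cutlass"):
    return (n, backend)
```

With this sketch, `toy_gemm(4)` is validated against the default backend even though the caller never passed it, and `toy_gemm(3)` is rejected before the wrapped function runs.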

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas to focus:
    • flashinfer/utils.py — correctness of signature binding, defaults application, tensor selection from bound kwargs, and that the wrapped function is invoked with original *args, **kwargs.
    • flashinfer/gemm.py — verify SM capability list consistency and any related capability checks.
    • tests/utils/test_decorators.py — ensure the new test is robust across different CUDA devices/environments.

Poem

🐰 I bound the args and tidied the pack,

Defaults set snug, no slips in the stack.
SM110 hopped into the GEMM brigade,
A test proclaims the backend choice made,
Tiny hops, big smiles — refactor well played!

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: updating a decorator to support an unspecified default parameter, which directly addresses the core issue resolved in this PR. |
| Description check | ✅ Passed | The description covers the main changes (decorator update and SM110 support), indicates test status, and includes required checklist sections, though pre-commit checks are only partially completed. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56d4c13 and 7401a6b.

📒 Files selected for processing (1)
  • flashinfer/utils.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
flashinfer/utils.py

1033-1035: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Deploy Docs
🔇 Additional comments (4)
flashinfer/utils.py (4)

26-26: LGTM: Import addition supports signature binding.

The inspect import is necessary for the signature-based argument binding and default resolution implemented in the backend_requirement decorator.


954-955: LGTM: Efficient signature capture.

Capturing the function signature once and reusing it in the wrapper is a performance optimization that avoids redundant signature inspection on each function call.


978-990: LGTM: Helper function correctly validates backend and problem size.

The _is_problem_size_supported helper properly separates validation logic and expects the caller to have already bound arguments and applied defaults. The implementation correctly checks backend presence and applies both common and backend-specific checks.


1001-1037: Excellent implementation of signature-based argument resolution.

This implementation correctly addresses the issue with unspecified default values by:

  1. Binding arguments to signature (lines 1004-1007): Uses inspect.signature().bind() to resolve both positional and keyword arguments, then applies defaults. This ensures default backend values are available during validation.

  2. Unified validation (lines 1009-1032): All arguments are now in kwargs_with_defaults (including those originally passed as positional), enabling consistent backend detection and tensor argument discovery.

  3. Preserving call signature (line 1037): The original func(*args, **kwargs) call is maintained, ensuring the decorated function behaves identically to the original from the caller's perspective.

This approach resolves the issue where mm_fp4() would fail when called without an explicit backend argument, even though the function signature specified a default. Based on past review comments, this implementation correctly adopts the inspect.signature().bind() pattern suggested by gemini-code-assist[bot].



gemini-code-assist bot (Contributor)

Summary of Changes

Hello @nvmbreughe, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the backend_requirement decorator to gracefully manage functions with default backend arguments, preventing errors when the backend is implicitly used. Additionally, it expands the supported compute capabilities for the mm_fp4 operation under the Cutlass backend, enabling broader hardware compatibility.

Highlights

  • Decorator Enhancement: The backend_requirement decorator has been updated to correctly identify and utilize default values for the backend argument in decorated functions, resolving issues where the backend was not explicitly specified.
  • Compute Capability Expansion: Support for SM 110 compute capability has been added to the _cutlass_gemm_fp4_requirement for the mm_fp4 operation, increasing hardware compatibility.
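
A capability allow-list of this kind can be sketched as follows. The helper name and the list contents are placeholders chosen for illustration; the actual list lives in flashinfer/gemm.py and is not shown in this thread. The only assumption about naming is that SM110 corresponds to compute capability (11, 0).

```python
def is_capability_supported(capability, allowed):
    """Return True if a (major, minor) compute capability is in the allow-list.

    capability has the same shape as the tuple returned by
    torch.cuda.get_device_capability(), e.g. (11, 0) for SM110.
    """
    major, minor = capability
    return major * 10 + minor in allowed


# Placeholder allow-lists, NOT the real values from flashinfer/gemm.py.
ALLOWED_BEFORE = [100, 120]
ALLOWED_AFTER = ALLOWED_BEFORE + [110]
```

Under these placeholder lists, a device reporting (11, 0) is rejected by `ALLOWED_BEFORE` and accepted by `ALLOWED_AFTER`.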

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request updates the backend_requirement decorator to correctly handle default values for the backend parameter and adds support for SM 110 compute capability to the cutlass backend for mm_fp4. The change to the decorator, however, does not correctly handle positional arguments for the backend parameter. I've suggested a more robust implementation using inspect.signature().bind() to correctly resolve the backend argument regardless of whether it's passed as a positional or keyword argument.

Comment on lines 954 to 965
    def get_backend(args, kwargs):
        # backend may not be specified, but could have a default value
        sig = inspect.signature(func)
        backend_parameter = sig.parameters.get("backend")
        if (
            backend_parameter
            and backend_parameter.default != inspect.Parameter.empty
        ):
            backend = kwargs.get("backend", backend_parameter.default)
        else:
            backend = kwargs.get("backend")
        return backend

high

The new get_backend function is a good improvement to handle default values for the backend parameter. However, it only considers keyword arguments (kwargs) and ignores positional arguments (args), which are passed to it but not used. This can lead to incorrect behavior when backend is provided as a positional argument.

A more robust approach is to use inspect.signature().bind() to correctly resolve the backend argument from both positional and keyword arguments, while also handling default values gracefully. This ensures the correct backend is always identified.

Suggested change
    - def get_backend(args, kwargs):
    -     # backend may not be specified, but could have a default value
    -     sig = inspect.signature(func)
    -     backend_parameter = sig.parameters.get("backend")
    -     if (
    -         backend_parameter
    -         and backend_parameter.default != inspect.Parameter.empty
    -     ):
    -         backend = kwargs.get("backend", backend_parameter.default)
    -     else:
    -         backend = kwargs.get("backend")
    -     return backend
    + def get_backend(args, kwargs):
    +     # backend may not be specified, but could have a default value
    +     sig = inspect.signature(func)
    +     try:
    +         bound_args = sig.bind(*args, **kwargs)
    +         bound_args.apply_defaults()
    +         return bound_args.arguments.get("backend")
    +     except TypeError:
    +         # Fallback for safety, though it's unlikely to be needed if the call is valid.
    +         return kwargs.get("backend")
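
The difference between the two versions can be checked with a small stand-alone comparison. `toy_op` and the two helper names are hypothetical, and the signature is captured at module level here rather than inside a decorator; the helpers mirror the keyword-only lookup and the `bind()`-based lookup respectively.

```python
import inspect


def toy_op(x, backend="cutlass"):  # hypothetical function and default
    return backend


SIG = inspect.signature(toy_op)


def get_backend_kwargs_only(args, kwargs):
    # Mirrors the original approach: positional arguments are ignored.
    param = SIG.parameters.get("backend")
    if param is not None and param.default is not inspect.Parameter.empty:
        return kwargs.get("backend", param.default)
    return kwargs.get("backend")


def get_backend_bound(args, kwargs):
    # Mirrors the suggested approach: bind everything, then apply defaults.
    bound = SIG.bind(*args, **kwargs)
    bound.apply_defaults()
    return bound.arguments.get("backend")


# Three ways of calling toy_op: default, keyword, positional.
calls = [((1,), {}), ((1,), {"backend": "cudnn"}), ((1, "cudnn"), {})]
kwargs_only = [get_backend_kwargs_only(a, k) for a, k in calls]
bound = [get_backend_bound(a, k) for a, k in calls]
```

The two helpers agree on the first two calls. For the positional call, the keyword-only lookup reports the default while the function would actually run with `"cudnn"`; the bound lookup reports `"cudnn"`.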

Contributor

If the function signature does not indicate a default backend, get_backend will return None. We should handle this edge case in the rest of the decorators, right?

Contributor Author

Should be fixed with apply_defaults
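
For reference, the standard-library behavior behind this exchange can be sketched as below. The three toy functions and `resolve_backend` are hypothetical; how the merged decorator treats each case is not shown in this thread, so this only demonstrates what `bind()` and `apply_defaults()` do on their own.

```python
import inspect


def with_default(x, backend="cutlass"): ...
def required_backend(x, backend): ...
def no_backend(x): ...


def resolve_backend(func, *args, **kwargs):
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return bound.arguments.get("backend")


# Default declared and backend omitted: apply_defaults fills it in.
from_default = resolve_backend(with_default, 1)

# No backend parameter at all: the lookup yields None.
from_missing_param = resolve_backend(no_backend, 1)

# Required backend omitted: bind() itself raises TypeError,
# just as calling the function directly would.
try:
    resolve_backend(required_backend, 1)
    bind_failed = False
except TypeError:
    bind_failed = True
```

So after `apply_defaults()`, a `None` backend arises only when the function has no `backend` parameter (or its default is literally `None`); a missing required `backend` surfaces as a `TypeError` from `bind()`.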

bkryu (Collaborator) left a comment

@nvmbreughe, I think the changes are straightforward, but we may want to add unit tests to check the unspecified backend case. Can we add these?

wenscarl (Collaborator) commented Nov 3, 2025

Verified by not providing "backend" to mm_fp4. LGTM. Thanks for the quick fix!

bkryu (Collaborator) left a comment

Thanks for adding the unit tests. LGTM!

jimmyzho (Contributor) left a comment

LGTM

@nvmbreughe nvmbreughe merged commit 1e75bff into flashinfer-ai:main Nov 3, 2025
4 checks passed
4 participants