
[sgl-kernel] chore: clean pytorch_extension_utils.h #13343

Closed
jinyouzhi wants to merge 3 commits into sgl-project:main from jinyouzhi:clean

Conversation

@jinyouzhi
Contributor

@jinyouzhi jinyouzhi commented Nov 15, 2025

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@gemini-code-assist
Contributor

Summary of Changes

Hello @jinyouzhi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a significant cleanup and refactoring of header dependencies within the sgl-kernel/csrc directory. The primary objective is to consolidate utility functions under a more generic utils.h header, moving away from the pytorch_extension_utils.h specific header. This change also necessitated adding direct includes for core CUDA and ATen functionalities in several files, ensuring that all required declarations are explicitly present. Furthermore, a minor but important improvement was made to utils.h by converting a preprocessor macro into a constexpr variable, enhancing code robustness and clarity.

Highlights

  • Header Refactoring: Replaced pytorch_extension_utils.h with utils.h across ten CUDA source files, streamlining header dependencies and promoting a more generic utility structure.
  • Direct CUDA Includes: Introduced explicit includes for <ATen/cuda/CUDAContext.h>, <cuda_runtime.h>, and <ATen/cuda/CUDADataType.h> in several files where pytorch_extension_utils.h was removed, ensuring necessary CUDA functionalities are properly declared.
  • Constant Modernization: Updated the FULL_MASK macro in utils.h to a constexpr uint32_t FULL_32BIT_MASK, improving type safety and enabling compile-time evaluation.
  • Usage Update: Modified the warpReduceMax function in utils.h to utilize the newly defined FULL_32BIT_MASK constant, aligning with the constant modernization.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request is a nice cleanup, refactoring multiple files to use the new utils.h header instead of the old pytorch_extension_utils.h. This consolidation improves code organization. The change from a #define macro to a constexpr variable for FULL_32BIT_MASK is also a good improvement for type safety and modern C++ practices.

I've added one comment to improve portability by using an existing abstraction macro, which will make the code compatible with ROCm.
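The portability concern likely refers to the common pattern of hiding the warp-sync intrinsic behind an abstraction macro, since HIP/ROCm's `__shfl_xor` takes no lane-mask argument while CUDA's `__shfl_xor_sync` requires one. A hypothetical sketch (the macro name and guard are illustrative, not the actual sgl-kernel API):

```cuda
// Hypothetical portability shim: route all warp shuffles through one
// macro so the same reduction compiles for both CUDA and ROCm.
#ifdef USE_ROCM
#define SHFL_XOR_SYNC(mask, val, offset) __shfl_xor((val), (offset))
#else
#define SHFL_XOR_SYNC(mask, val, offset) __shfl_xor_sync((mask), (val), (offset))
#endif

// Butterfly max-reduction over a 32-lane warp, assuming FULL_32BIT_MASK
// as defined in utils.h and a device-side max() for T.
template <typename T>
__device__ T warpReduceMax(T val) {
  for (int offset = 16; offset > 0; offset >>= 1) {
    val = max(val, SHFL_XOR_SYNC(FULL_32BIT_MASK, val, offset));
  }
  return val;
}
```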

@FlamingoPg
Collaborator

That's right

@FlamingoPg
Collaborator

Btw, if we keep pytorch_extension_utils, will we encounter any errors? Could you explain in detail why we need to modify it?

@jinyouzhi
Contributor Author

jinyouzhi commented Nov 18, 2025

Btw, if we keep pytorch_extension_utils, will we encounter any errors? Could you explain in detail why we need to modify it?

Good point. The direct motivation is that I found DISPATCH_DLPACK_DTYPE_TO_CTYPE_FLOAT_FP16 missing from flashinfer's pytorch_extension_utils.h while working on #13181 (the two headers have too many conflicts).

@jinyouzhi
Contributor Author

jinyouzhi commented Nov 18, 2025

Just checked again: sgl-kernel still depends on https://github.com/flashinfer-ai/flashinfer/blob/bc29697ba20b7e6bdb728ded98f04788e16ee021/csrc/pytorch_extension_utils.h via

GIT_TAG bc29697ba20b7e6bdb728ded98f04788e16ee021

so compilation is OK for now. But that header has been refactored in flashinfer-ai/flashinfer#1795.

@jinyouzhi jinyouzhi changed the title from "clean pytorch_extension_utils.h" to "[sgl-kernel] chore: clean pytorch_extension_utils.h" Nov 18, 2025
@jinyouzhi
Contributor Author

@FlamingoPg @Fridge003 Could you give further comments? I think we need to clean these up if the flashinfer dependency moves forward.

@FlamingoPg
Collaborator

@jinyouzhi When using flashinfer in JIT mode, we install it directly rather than compiling through CMakeLists.txt. In sgl-kernel, we have pinned a specific flashinfer version at https://github.com/yzh119/flashinfer-dev/blob/ce7c58e4e98ed4fb3aac6626c30c792317f23a5f/csrc/pytorch_extension_utils.h. Currently, we have no plans to upgrade flashinfer in sgl-kernel for the time being.

@FlamingoPg
Collaborator

We pin the flashinfer version with:

# flashinfer
FetchContent_Declare(
    repo-flashinfer
    GIT_REPOSITORY https://github.com/flashinfer-ai/flashinfer.git
    GIT_TAG        bc29697ba20b7e6bdb728ded98f04788e16ee021
    GIT_SHALLOW    OFF
)
FetchContent_Populate(repo-flashinfer)

@FlamingoPg
Collaborator

Finally, if you think an upgrade is necessary, please tell me the purpose of doing so, and you need to ensure the compilation passes.

@jinyouzhi
Contributor Author

jinyouzhi commented Nov 25, 2025

@jinyouzhi When using flashinfer in JIT mode, we install it directly rather than compiling through CMakeLists.txt. In sgl-kernel, we have pinned a specific flashinfer version at https://github.com/yzh119/flashinfer-dev/blob/ce7c58e4e98ed4fb3aac6626c30c792317f23a5f/csrc/pytorch_extension_utils.h. Currently, we have no plans to upgrade flashinfer in sgl-kernel for the time being.

Got it, thank you for the patient explanation. I agree there is little motivation to modify this if it isn't necessary.

@jinyouzhi jinyouzhi closed this Nov 25, 2025


Development

Successfully merging this pull request may close these issues.

Shall we clean or modify #include "pytorch_extension_utils.h" in kernels?
