
[sgl-kernel] chore: clean pytorch_extension_utils.h #13343

Closed
jinyouzhi wants to merge 3 commits into sgl-project:main from jinyouzhi:clean

Conversation

@jinyouzhi
Contributor

@jinyouzhi jinyouzhi commented Nov 15, 2025

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@gemini-code-assist
Contributor

Summary of Changes

Hello @jinyouzhi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a significant cleanup and refactoring of header dependencies within the sgl-kernel/csrc directory. The primary objective is to consolidate utility functions under a more generic utils.h header, moving away from the pytorch_extension_utils.h specific header. This change also necessitated adding direct includes for core CUDA and ATen functionalities in several files, ensuring that all required declarations are explicitly present. Furthermore, a minor but important improvement was made to utils.h by converting a preprocessor macro into a constexpr variable, enhancing code robustness and clarity.

Highlights

  • Header Refactoring: Replaced pytorch_extension_utils.h with utils.h across ten CUDA source files, streamlining header dependencies and promoting a more generic utility structure.
  • Direct CUDA Includes: Introduced explicit includes for <ATen/cuda/CUDAContext.h>, <cuda_runtime.h>, and <ATen/cuda/CUDADataType.h> in several files where pytorch_extension_utils.h was removed, ensuring necessary CUDA functionalities are properly declared.
  • Constant Modernization: Updated the FULL_MASK macro in utils.h to a constexpr uint32_t FULL_32BIT_MASK, improving type safety and enabling compile-time evaluation.
  • Usage Update: Modified the warpReduceMax function in utils.h to utilize the newly defined FULL_32BIT_MASK constant, aligning with the constant modernization.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request is a nice cleanup, refactoring multiple files to use the new utils.h header instead of the old pytorch_extension_utils.h. This consolidation improves code organization. The change from a #define macro to a constexpr variable for FULL_32BIT_MASK is also a good improvement for type safety and modern C++ practices.

I've added one comment to improve portability by using an existing abstraction macro, which will make the code compatible with ROCm.
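The portability concern likely refers to the common pattern of hiding the warp-sync intrinsic behind an abstraction macro, since HIP/ROCm's `__shfl_xor` takes no lane-mask argument while CUDA's `__shfl_xor_sync` requires one. A hypothetical sketch (the macro name and guard are illustrative, not the actual sgl-kernel API):

```cuda
// Hypothetical portability shim: route all warp shuffles through one
// macro so the same reduction compiles for both CUDA and ROCm.
#ifdef USE_ROCM
#define SHFL_XOR_SYNC(mask, val, offset) __shfl_xor((val), (offset))
#else
#define SHFL_XOR_SYNC(mask, val, offset) __shfl_xor_sync((mask), (val), (offset))
#endif

// Butterfly max-reduction over a 32-lane warp, assuming FULL_32BIT_MASK
// as defined in utils.h and a device-side max() for T.
template <typename T>
__device__ T warpReduceMax(T val) {
  for (int offset = 16; offset > 0; offset >>= 1) {
    val = max(val, SHFL_XOR_SYNC(FULL_32BIT_MASK, val, offset));
  }
  return val;
}
```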

@FlamingoPg
Collaborator

That's right

@FlamingoPg
Collaborator

Btw, if we keep pytorch_extension_utils, will we encounter any errors? Could you explain in detail why we need to modify it?

@jinyouzhi
Contributor Author

jinyouzhi commented Nov 18, 2025

Btw, if we keep pytorch_extension_utils, will we encounter any errors? Could you explain in detail why we need to modify it?

Good point. The direct motivation is that I found DISPATCH_DLPACK_DTYPE_TO_CTYPE_FLOAT_FP16 missing from flashinfer's pytorch_extension_utils.h while working on #13181 (the two headers have too many conflicts).

@jinyouzhi
Contributor Author

jinyouzhi commented Nov 18, 2025

Just checked again: sgl-kernel still depends on https://github.com/flashinfer-ai/flashinfer/blob/bc29697ba20b7e6bdb728ded98f04788e16ee021/csrc/pytorch_extension_utils.h via

GIT_TAG bc29697ba20b7e6bdb728ded98f04788e16ee021

so compilation is OK for now. But that header has been refactored in flashinfer-ai/flashinfer#1795.

@jinyouzhi jinyouzhi changed the title from "clean pytorch_extension_utils.h" to "[sgl-kernel] chore: clean pytorch_extension_utils.h" Nov 18, 2025
@jinyouzhi
Contributor Author

@FlamingoPg @Fridge003 Could you give further comments? I think we need to clean these up if the flashinfer dependency moves forward.

@FlamingoPg
Collaborator

@jinyouzhi When using flashinfer in JIT mode, we install it directly rather than compiling through CMakeLists.txt. In sgl-kernel, we have pinned a specific flashinfer version at https://github.com/yzh119/flashinfer-dev/blob/ce7c58e4e98ed4fb3aac6626c30c792317f23a5f/csrc/pytorch_extension_utils.h. Currently, we have no plans to upgrade flashinfer in sgl-kernel for the time being.

@FlamingoPg
Collaborator

We pin the flashinfer version with:

# flashinfer
FetchContent_Declare(
    repo-flashinfer
    GIT_REPOSITORY https://github.com/flashinfer-ai/flashinfer.git
    GIT_TAG        bc29697ba20b7e6bdb728ded98f04788e16ee021
    GIT_SHALLOW    OFF
)
FetchContent_Populate(repo-flashinfer)

@FlamingoPg
Collaborator

Finally, if you think an upgrade is necessary, please tell me the purpose of doing so, and you need to ensure the compilation passes.

@jinyouzhi
Contributor Author

jinyouzhi commented Nov 25, 2025

@jinyouzhi When using flashinfer in JIT mode, we install it directly rather than compiling through CMakeLists.txt. In sgl-kernel, we have pinned a specific flashinfer version at https://github.com/yzh119/flashinfer-dev/blob/ce7c58e4e98ed4fb3aac6626c30c792317f23a5f/csrc/pytorch_extension_utils.h. Currently, we have no plans to upgrade flashinfer in sgl-kernel for the time being.

Got it, thank you for the patient explanation. I agree there is little motivation to modify this if it isn't necessary.

@jinyouzhi jinyouzhi closed this Nov 25, 2025


Development

Successfully merging this pull request may close these issues.

Shall we clean or modify #include "pytorch_extension_utils.h" in kernels?
