[Kernel] Add JIT activation by weimin023 · Pull Request #18401 · sgl-project/sglang

weimin023 · 2026-02-07T07:38:16Z

Motivation

Add JIT-compiled CUDA kernels for the activation functions.

Modifications

Migrate the activation kernels from ahead-of-time (AOT) compilation in sgl_kernel to the JIT compilation framework under python/sglang/jit_kernel/.
Port the CUDA source file (activation.cu) to python/sglang/jit_kernel/csrc/elementwise/activation.cuh with minimal modifications.
Add a Python wrapper (python/sglang/jit_kernel/activation.py).
Add correctness tests (python/sglang/jit_kernel/tests/test_activation.py) that check the JIT kernel outputs against PyTorch reference results.
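
The correctness tests listed above compare each JIT kernel against a reference result. As a rough illustration of what such a reference computes, the sketch below implements a gated SiLU in plain Python. It assumes the common convention that the last dimension holds 2*d values and the output is `silu(x[:d]) * x[d:]`; this PR does not spell out which activations are covered or how the tests are written, so the function names and the tolerance here are hypothetical. The actual tests operate on CUDA tensors through PyTorch.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish): x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def silu_and_mul_reference(row: list[float]) -> list[float]:
    """Gated SiLU over one row (assumed convention, not confirmed by the PR).

    The row holds 2*d values: the first d are passed through SiLU and then
    multiplied element-wise by the last d. The output has d elements.
    """
    if len(row) % 2 != 0:
        raise ValueError("row length must be even")
    d = len(row) // 2
    return [silu(row[i]) * row[d + i] for i in range(d)]


def allclose(a: list[float], b: list[float], atol: float = 1e-6) -> bool:
    """Element-wise comparison with an absolute tolerance (illustrative value)."""
    return len(a) == len(b) and all(abs(x - y) <= atol for x, y in zip(a, b))
```

A test in this style would call the kernel under test on the same input and assert `allclose(kernel_output, silu_and_mul_reference(row))`, with a looser tolerance for half-precision dtypes.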

Accuracy Tests

pytest /sgl-workspace/sglang/python/sglang/jit_kernel/tests/test_activation.py

platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0
rootdir: /sgl-workspace/sglang/python
configfile: pyproject.toml
plugins: anyio-4.12.1, typeguard-4.4.4
collected 48 items

python/sglang/jit_kernel/tests/test_activation.py ................................................ [100%]

============================== 48 passed in 29.37s ==============================

Benchmarking and Profiling

Test the accuracy:
python3 -m sglang.test.few_shot_gsm8k --num-questions 200

Accuracy: 0.820
Invalid: 0.000
Latency: 10.294 s
Output throughput: 2829.049 token/s

Benchmark the speed:

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

gemini-code-assist · 2026-02-07T07:38:20Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

weimin023 · 2026-02-11T15:44:55Z

Hi @BBuf, could you please take a look and let me know which benchmark I should run?
Thanks!

python/sglang/jit_kernel/csrc/elementwise/activation.cuh

sgl-kernel/setup_rocm.py

DarkSharpness · 2026-02-15T06:54:04Z

3rdparty/amd/sgl-kernel/rocm_hipify.py

    "csrc/allreduce/custom_all_reduce.hip",
    "csrc/allreduce/deterministic_all_reduce.hip",
    "csrc/allreduce/quick_all_reduce.cu",
    "csrc/common_extension_rocm.cc",


The JIT kernel does not support ROCm yet. Maybe just keep the original HIP code for now?

Hi @DarkSharpness, I've kept the HIP code and added a _IS_ROCM guard to allow non-NVIDIA GPUs to use the original AOT kernel. Please let me know if this meets the requirements.
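
The guard described in this reply can be pictured as a small dispatch step: on ROCm builds the existing AOT kernel is used, and elsewhere the JIT kernel is. The sketch below is a minimal illustration under that assumption. Only the `_IS_ROCM` name comes from the discussion; `pick_backend` and the stub kernels are hypothetical stand-ins so the example runs without a GPU.

```python
from typing import Callable, Sequence

Kernel = Callable[[Sequence[float]], list]


def pick_backend(is_rocm: bool, aot_kernel: Kernel, jit_kernel: Kernel) -> Kernel:
    """Return the AOT kernel on ROCm and the JIT kernel everywhere else."""
    return aot_kernel if is_rocm else jit_kernel


# Hypothetical stand-ins for the real kernels; both just double their input.
def aot_stub(xs: Sequence[float]) -> list:
    return [x * 2.0 for x in xs]


def jit_stub(xs: Sequence[float]) -> list:
    return [x * 2.0 for x in xs]


# The flag would normally be computed once at import time; it is hard-coded here.
_IS_ROCM = False
activation = pick_backend(_IS_ROCM, aot_stub, jit_stub)
```

In a PyTorch environment such a flag is commonly derived from `torch.version.hip` being set, though how this PR computes `_IS_ROCM` is not shown in the conversation.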

DarkSharpness · 2026-04-03T15:30:36Z

@weimin023 Thanks for the contribution! I adapted this PR in #21766 (which should be a superset of this PR) and added you as co-author. Feel free to reopen this PR if anything is still missing.

github-actions bot added amd sgl-kernel labels Feb 7, 2026

weimin023 force-pushed the jit-activation branch from fc6c62a to 5809b33 Compare February 11, 2026 15:41

weimin023 marked this pull request as ready for review February 11, 2026 15:43

weimin023 requested review from BBuf, DarkSharpness, FlamingoPg, HaiShaw, ispobock, merrymercy, yizhang2077 and zhyncs as code owners February 11, 2026 15:43

DarkSharpness reviewed Feb 11, 2026

python/sglang/jit_kernel/csrc/elementwise/activation.cuh (outdated, resolved)

DarkSharpness reviewed Feb 11, 2026

sgl-kernel/setup_rocm.py (resolved)

weimin023 force-pushed the jit-activation branch from a5c2bae to 056d8d1 Compare February 14, 2026 14:23

weimin023 requested a review from DarkSharpness February 15, 2026 06:25

DarkSharpness reviewed Feb 15, 2026

weimin023 added 6 commits February 22, 2026 17:16

init

0dfc3c7

basic impl

6ac0d09

update

0eb823c

pre-commit

f492226

update

7b6589b

keep HIP code

ddb5860

weimin023 force-pushed the jit-activation branch from 3aeb599 to ddb5860 Compare February 22, 2026 09:16

add _IS_ROCM guard

d39d176

weimin023 requested a review from DarkSharpness February 24, 2026 03:39

DarkSharpness closed this Apr 3, 2026

weimin023 wants to merge 7 commits into sgl-project:main from weimin023:jit-activation
