[Feature] JIT activation and update skills (by codex) #21766
DarkSharpness merged 4 commits into sgl-project:main from
Conversation
/tag-and-rerun-ci
Code Review
This pull request introduces JIT-compiled activation kernels (SiLU, GELU, and GELU-Tanh, each fused with elementwise multiplication) to replace the AOT implementations on CUDA platforms. The changes include the core CUDA kernel implementation, Python wrappers, unit tests, and benchmarks. Documentation is also updated with guidance on JIT kernel development and the new PDL (Programmatic Dependent Launch) feature. Review feedback identifies a missing import for the HIP platform in the multimodal runtime and an incorrect output shape registered in the custom-operator metadata for the activations.
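For reference, the fused kernels described above compute an activation over the first half of the last dimension and multiply it by the second half. A minimal, hedged Python sketch of that semantics (function names here are illustrative, not the PR's actual API; the real kernels operate on CUDA tensors):

```python
import math

def silu_and_mul(x, d):
    # SiLU(x[:d]) * x[d:2d], where silu(v) = v * sigmoid(v)
    return [(a / (1.0 + math.exp(-a))) * b for a, b in zip(x[:d], x[d:2 * d])]

def gelu_tanh_and_mul(x, d):
    # tanh-approximation GELU on the first half, multiplied by the second half
    c = math.sqrt(2.0 / math.pi)
    gelu = lambda v: 0.5 * v * (1.0 + math.tanh(c * (v + 0.044715 * v ** 3)))
    return [gelu(a) * b for a, b in zip(x[:d], x[d:2 * d])]
```

The CUDA versions gain their speedup from vectorized loads and PDL rather than from the math itself, which is identical to this scalar form.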
Co-authored-by: weiminc <tnwilly@gmail.com>
Hi @DarkSharpness, this PR was reverted due to this failure https://github.com/sgl-project/sglang/actions/runs/23958698449/job/69895069178?pr=21913
Motivation
Modifications
Accuracy Tests
Speed Tests and Profiling
(TL;DR: Performance gain is mostly from PDL and vectorization)
H200:
B200:
Checklist
Review and Merge Process
/tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci