[NSA] Fall back to fast_hadamard_transform when sgl_kernel lacks the symbol #23699
Merged
…ks the symbol

Older sgl_kernel builds (e.g. 0.3.21) don't export hadamard_transform. On non-HIP / non-SM103 hardware the import then raises ImportError at forward time and crashes the scheduler. fast_hadamard_transform is already a dependency on those paths, so use it as a fallback when sgl_kernel is missing the symbol.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
`nsa_indexer.rotate_activation` does `from sgl_kernel import hadamard_transform`. Older `sgl_kernel` builds (e.g. `0.3.21`) don't export this symbol, which causes `ImportError` at forward time and crashes the scheduler. Wrap the import in `try/except` and fall back to `fast_hadamard_transform` (already used on HIP / SM103), so older / partial `sgl_kernel` builds keep working.

Why
The crash shows up with the `lmsysorg/sglang:deepseek-v4-grace-blackwell` image only when running on non-GB300 CUDA hardware where `_is_sm103=False`. On GB300 it's masked because `_is_sm103=True` already routes to `fast_hadamard_transform`. But any non-Blackwell / non-GB300 CUDA platform hits the `ImportError`.

`fast_hadamard_transform` is already an installed dependency on those code paths, so the fallback adds no new requirement.

Diff
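A minimal sketch of the fallback described in the summary, not the literal patch. The helper name `load_hadamard_transform` and its `prefer_fast` flag are hypothetical; `prefer_fast` stands in for the HIP / SM103 check. The sketch assumes both packages expose a callable named `hadamard_transform`.

```python
def load_hadamard_transform(prefer_fast: bool = False):
    """Return a hadamard_transform callable.

    Hypothetical helper for illustration only. `prefer_fast` stands in for
    the HIP / SM103 condition that already routes to fast_hadamard_transform.
    """
    if not prefer_fast:
        try:
            # Newer sgl_kernel builds export the symbol.
            from sgl_kernel import hadamard_transform
            return hadamard_transform
        except ImportError:
            # Older builds (e.g. 0.3.21) lack the symbol, or sgl_kernel is
            # not installed at all; fall through to the other package.
            pass
    # Assumed to be installed on these code paths, per the PR description.
    from fast_hadamard_transform import hadamard_transform
    return hadamard_transform
```

Catching only `ImportError` keeps the fallback narrow: a missing symbol or missing package triggers it, while any other failure inside `sgl_kernel` still surfaces.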
Test plan
- `lmsysorg/sglang:deepseek-v4-grace-blackwell` (sgl-kernel 0.3.21, no `hadamard_transform`): `_is_sm103=True` path still chosen; gsm8k 20-shot sanity acc 0.949 (Flash low-latency), unchanged.
- `fridge003/sglang:final-gb300` (where `_is_sm103=False` because the source predates SM103 detection): without this patch, the scheduler crashed at first forward; with the equivalent patch applied locally, gsm8k 20-shot acc 0.950, fix confirmed.
- `deepseek_v4` (let upstream run it).

🤖 Generated with Claude Code