[Fix] Fix extra uninstall of cutlass packages #25756

Merged
Fridge003 merged 5 commits into main from fix-fa4
May 19, 2026
Conversation

@Fridge003

@Fridge003 Fridge003 commented May 19, 2026

Collaborator

Motivation

This change might fix the error mentioned in #25743.

Modifications

Accuracy Tests

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

CI States

Latest PR Test (Base): ✅ Run #26087085502
Latest PR Test (Extra): ⚠️ Not enabled -- add run-ci-extra label to opt in.

@github-actions github-actions Bot added the dependencies Pull requests that update a dependency file label May 19, 2026
@Fridge003

Collaborator Author

/rerun-test test/registered/lora/test_lora_qwen3_8b_logprob_diff.py

@Fridge003

Collaborator Author

/rerun-test test/registered/attention/test_flash_attention_4.py

@github-actions

github-actions Bot commented May 19, 2026

Contributor

🚀 1-gpu-h100 (1 test): ❌ View workflow run

cd test/ && python3 registered/lora/test_lora_qwen3_8b_logprob_diff.py

@github-actions

github-actions Bot commented May 19, 2026

Contributor

🚀 4-gpu-b200 (1 test): ❌ View workflow run

cd test/ && python3 registered/attention/test_flash_attention_4.py

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the flash-attn-4 dependency in pyproject.toml by adding the [cu13] extra. The reviewer recommended pinning this dependency to a specific version rather than leaving it unconstrained to ensure build stability and reproducibility.
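To make the pinning recommendation concrete, here is a minimal sketch that tells an unconstrained requirement string from an exactly pinned one. The helper name `is_exactly_pinned` and the version `0.0.0` are hypothetical placeholders for illustration only; they do not come from this PR, and the parsing is a simplified reading of requirement strings, not a full PEP 508 parser.

```python
import re

# Simplified split of a requirement into name, optional extras, and the
# remaining version specifier text. Illustration only, not a PEP 508 parser.
_REQ = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?\s*(.*)$")


def is_exactly_pinned(requirement: str) -> bool:
    """Return True if the requirement has a single '==' clause with no wildcard."""
    # Drop any environment marker that follows ';'.
    spec = requirement.split(";", 1)[0]
    match = _REQ.match(spec)
    if match is None:
        return False
    # Version clauses are comma separated, e.g. ">=1.0,<2".
    clauses = [c.strip() for c in match.group(3).split(",") if c.strip()]
    return (
        len(clauses) == 1
        and clauses[0].startswith("==")
        and "*" not in clauses[0]
    )


if __name__ == "__main__":
    # "0.0.0" is a placeholder version, not the version used by the project.
    for req in ("flash-attn-4[cu13]", "flash-attn-4[cu13]==0.0.0"):
        print(req, "pinned" if is_exactly_pinned(req) else "unconstrained")
```

A check like this could run in CI to flag unconstrained entries, at the cost of requiring a manual bump for every upstream release.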

Comment thread python/pyproject.toml Outdated
@Fridge003 Fridge003 changed the title [Fix] Upstream FA4 package to latest version [Fix] Fix extra uninstall of cutlass packages May 19, 2026
@Fridge003

Collaborator Author

/rerun-test test/registered/lora/test_lora_qwen3_8b_logprob_diff.py

@github-actions

github-actions Bot commented May 19, 2026

Contributor

🚀 1-gpu-h100 (1 test): ❌ View workflow run

cd test/ && python3 registered/lora/test_lora_qwen3_8b_logprob_diff.py

@Fridge003

Collaborator Author

/rerun-test test/registered/lora/test_lora_qwen3_8b_logprob_diff.py

@github-actions

github-actions Bot commented May 19, 2026

Contributor

🚀 1-gpu-h100 (1 test): ✅ View workflow run

cd test/ && python3 registered/lora/test_lora_qwen3_8b_logprob_diff.py

@Fridge003

Collaborator Author

/tag-and-rerun-ci

@Fridge003 Fridge003 merged commit cd012ad into main May 19, 2026
176 of 180 checks passed
@Fridge003 Fridge003 deleted the fix-fa4 branch May 19, 2026 17:01
Kangyan-Zhou added a commit to Kangyan-Zhou/sglang that referenced this pull request May 21, 2026
PR sgl-project#25576 bumped nvidia-cutlass-dsl[cu13] from 4.5.0 to 4.5.1. The bump
exposed a latent file-level conflict between -libs-base and -libs-cu13
(both written by the additive [cu13] extra) as a hard GPUModuleOp
TypeError on H100: -libs-cu13's pybind11 binding changed to the new
MLIR-style ((operation: object)) without a matching bump to the Python
wrapper in nvidia-cutlass-dsl, so loading -libs-cu13's .so makes the
wrapper's old-style super().__init__() call fail.

Two changes:

1. Revert the version bump (4.5.1 -> 4.5.0). At 4.5.0 both .so files
   expose a compatible binding, so the same coexistence no longer crashes.
   This removes the active TypeError on H100 and on the CUDA-13 Docker
   image for non-Blackwell users.

2. Add fix_cutlass_dsl_libs() to ci_install_dependency.sh, called from
   main() after download_flashinfer_cache. The function picks the right
   libs package per GPU family even at 4.5.0 to avoid two independent
   regressions that the silent conflict could still hit:

     Blackwell (IS_BLACKWELL=1, CU13):
       Purge -libs-base, force-reinstall -libs-cu13 so its files take
       precedence. -libs-base is CUDA-12.9-built and lacks the sm_110
       arch alias that GB300/B200 need at cutlass import time.

     Non-Blackwell CU13 (H100, H200):
       Purge -libs-cu13, force-reinstall -libs-base. -libs-cu13 carries
       a CUDBG_EXCEPTION_WARP_ILLEGAL_ADDRESS regression in LoRA CUDA-
       graph capture on sm_90 (sgl-project#25743 / reverted by sgl-project#25756).

     Non-CU13: no-op (only -libs-base ever installed).
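The per-GPU-family selection described in the commit message above can be sketched as a small decision function. This is only an illustration of the stated rules, not the actual `fix_cutlass_dsl_libs()` shell function: the names `LibsPlan` and `plan_cutlass_dsl_libs` are hypothetical, and the package names are kept as the abbreviated suffixes used in the commit message.

```python
from typing import NamedTuple, Optional


class LibsPlan(NamedTuple):
    """Which libs package to purge and which to force-reinstall (None = nothing)."""
    purge: Optional[str]
    reinstall: Optional[str]


def plan_cutlass_dsl_libs(is_blackwell: bool, is_cu13: bool) -> LibsPlan:
    """Mirror the three cases listed in the commit message.

    Package names are the abbreviated suffixes from the commit message;
    the real script works on the full package names.
    """
    if not is_cu13:
        # Non-CU13: only -libs-base is ever installed, so nothing to do.
        return LibsPlan(purge=None, reinstall=None)
    if is_blackwell:
        # Blackwell on CU13: -libs-cu13 must take precedence.
        return LibsPlan(purge="-libs-base", reinstall="-libs-cu13")
    # Non-Blackwell CU13 (H100, H200): fall back to -libs-base.
    return LibsPlan(purge="-libs-cu13", reinstall="-libs-base")


if __name__ == "__main__":
    for blackwell, cu13 in ((True, True), (False, True), (False, False)):
        print(blackwell, cu13, plan_cutlass_dsl_libs(blackwell, cu13))
```

Separating the decision from the install commands keeps the three cases easy to unit test without touching a real environment.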
Kangyan-Zhou added a commit to Kangyan-Zhou/sglang that referenced this pull request May 21, 2026
Revert the version bump from PR sgl-project#25576. At 4.5.1, -libs-cu13's pybind11
binding changed to new MLIR-style ((operation: object)) without a
matching bump to the Python wrapper in nvidia-cutlass-dsl, exposing the
latent file-level conflict between -libs-base and -libs-cu13 (both
written by the additive [cu13] extra) as a hard GPUModuleOp TypeError
at kernel-compile time on CU13 runners.

At 4.5.0 both .so files expose a compatible binding, so the same
coexistence is silent and CI was empirically green on H100 and Blackwell
during the post-sgl-project#25756, pre-sgl-project#25576 window. Going back to 4.5.0 restores
that state.

Supersedes sgl-project#25935 (which proposed the same revert but was closed).
Shunkangz pushed a commit to Shunkangz/sglang that referenced this pull request May 27, 2026
alphabetc1 pushed a commit to alphabetc1/sglang that referenced this pull request Jun 4, 2026
Labels

dependencies Pull requests that update a dependency file run-ci

1 participant