chore: bump flashinfer version to 0.6.7#21422

Merged
Fridge003 merged 17 commits into main from bot/bump-flashinfer-version-0.6.7-d69d
Apr 1, 2026

Conversation

@sglang-bot
Member

@sglang-bot sglang-bot commented Mar 25, 2026

Summary

This PR bumps the flashinfer version to 0.6.7 across all relevant files.

This version fixes the following bugs:

Fix #19081
Fix #18989
Fix #18980

It includes this commit: flashinfer-ai/flashinfer#2726

Files Updated

  • docker/Dockerfile
  • python/pyproject.toml
  • python/sglang/srt/entrypoints/engine.py
  • python/sglang/srt/utils/common.py
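
Since the bump touches both packaging files and Python sources, it is plausible that the Python files carry a runtime minimum-version check. This PR does not show the diff, so the following is only a minimal sketch of such a check using the standard library. The function names and the distribution name `flashinfer_python` are assumptions for illustration, not sglang's actual helpers.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '0.6.7' into (0, 6, 7).

    Simplification: local suffixes ('+cu128') and pre/post tags ('rc1',
    'post1') are ignored, so '0.6.7rc1' compares equal to '0.6.7'.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def check_installed(dist_name, required):
    """Look up an installed distribution and compare it to the minimum.

    Returns False when the distribution is not installed at all.
    """
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, required)


if __name__ == "__main__":
    # "flashinfer_python" is an assumed distribution name.
    print(check_installed("flashinfer_python", "0.6.7"))
```

Comparing integer tuples rather than raw strings keeps `0.6.10` ordered after `0.6.7`, which a plain string comparison would get wrong.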

🤖 Generated with GitHub Actions

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@github-actions github-actions bot added the dependencies label (pull requests that update a dependency file) Mar 25, 2026
@Fridge003
Collaborator

/tag-and-rerun-ci

@b8zhong
Collaborator

b8zhong commented Mar 25, 2026

/rerun-failed-ci again??

@Fridge003
Collaborator

Piecewise CUDA graph failure: #21452

@b8zhong
Collaborator

b8zhong commented Mar 26, 2026

RMSNorm failure (?) Seems related: https://github.com/sgl-project/sglang/actions/runs/23566614089/job/68672358288?pr=21422

This commit updates the flashinfer version across all relevant files:
          - docker/Dockerfile
          - python/pyproject.toml
          - python/sglang/srt/entrypoints/engine.py
          - python/sglang/srt/utils/common.py

🤖 Generated with GitHub Actions
@b8zhong b8zhong force-pushed the bot/bump-flashinfer-version-0.6.7-d69d branch from a428d43 to d804ed9 March 27, 2026 18:30
@zianglih
Contributor

#21625 will fix the flaky test_fp8_blockwise_gemm.py on v0.6.7

…6.7 compatibility

Flashinfer 0.6.7 switched to CuTe-based kernels with stricter dtype
validation for rmsnorm. When weight (fp32) and input (bf16/fp16) dtypes
mismatch, the new kernels raise ValueError. Cast weight to input dtype
before calling flashinfer norm functions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
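
The cast described in the commit message above can be sketched without any GPU dependencies. This is only an illustration of the pattern: `FakeTensor`, `strict_rmsnorm`, and `rmsnorm_with_cast` are hypothetical stand-ins, not flashinfer's or sglang's API, and real code would operate on torch tensors.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a tensor: values plus a dtype tag.
@dataclass
class FakeTensor:
    data: list
    dtype: str  # e.g. "float32", "bfloat16"

    def to(self, dtype):
        # Mimic a dtype cast; return self when nothing changes.
        if self.dtype == dtype:
            return self
        return FakeTensor(list(self.data), dtype)


def strict_rmsnorm(x, weight, eps=1e-6):
    """Stand-in for a kernel that rejects mismatched dtypes."""
    if x.dtype != weight.dtype:
        raise ValueError(
            f"dtype mismatch: input {x.dtype}, weight {weight.dtype}"
        )
    mean_sq = sum(v * v for v in x.data) / len(x.data)
    scale = (mean_sq + eps) ** -0.5
    out = [v * scale * w for v, w in zip(x.data, weight.data)]
    return FakeTensor(out, x.dtype)


def rmsnorm_with_cast(x, weight, eps=1e-6):
    # Cast the weight to the input dtype before calling the strict kernel.
    return strict_rmsnorm(x, weight.to(x.dtype), eps)


if __name__ == "__main__":
    x = FakeTensor([3.0, 4.0], "bfloat16")
    w = FakeTensor([1.0, 1.0], "float32")
    print(rmsnorm_with_cast(x, w).dtype)
```

Calling `strict_rmsnorm(x, w)` directly with these inputs raises `ValueError`, which mirrors the failure mode the commit message describes. Casting at the call site leaves the stored weight unchanged.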
@github-actions github-actions bot added the lora label Mar 30, 2026
@Fridge003 Fridge003 merged commit ca3ba05 into main Apr 1, 2026
566 of 635 checks passed
@Fridge003 Fridge003 deleted the bot/bump-flashinfer-version-0.6.7-d69d branch April 1, 2026 04:18