[wip] Support Transformers v5.x #14584
Conversation
/tag-and-rerun-ci
Summary of Changes

Hello @byjiang1996, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request updates the project's dependency stack to support the upcoming Transformers v5.x, which is required for integrating newer models such as GLM 4.6v.
Code Review
This pull request updates the project to support transformers v5.x by bumping the version to 5.0.0rc0 and adding flash_attn as a dependency. The changes also include necessary adaptations in the codebase to accommodate breaking changes in the new transformers version, such as the removal of rope_config_validation. Additionally, the build process for sgl-kernel is modified to ensure flash_attn is built from source to include support for fa4.
My feedback focuses on improving dependency management for better reproducibility. While using a release candidate for transformers is a conscious choice for this upgrade, it's important to be aware of potential stability risks. The changes to the build system are well-justified workarounds.
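One of the breaking changes mentioned above is the removal of `rope_config_validation` in Transformers v5.x. A minimal compatibility shim (a sketch, not the PR's actual adaptation) could guard the import so the same code runs against both v4.x and v5.x:

```python
# Sketch: tolerate the removal of rope_config_validation in transformers v5.x.
# The import path below is the v4.x location; in v5.x (or when transformers
# is not installed) the fallback is used instead.
try:
    from transformers.modeling_rope_utils import rope_config_validation
except ImportError:
    rope_config_validation = None  # removed in transformers v5.x


def validate_rope(config):
    """Validate RoPE settings when the helper is available; no-op otherwise."""
    if rope_config_validation is not None:
        rope_config_validation(config)
```

Callers can then invoke `validate_rope(config)` unconditionally, keeping version-specific logic in one place.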
python/pyproject.toml (Outdated)

```diff
   "fastapi",
   "flashinfer_python==0.5.3", # keep it aligned with jit-cache version in Dockerfile
   "flashinfer_cubin==0.5.3",
+  "flash_attn", # required by 5.x transformer
```
It's a good practice to pin dependency versions to ensure reproducible builds. The flash_attn dependency is added without a version specifier, which could lead to unexpected issues if a new, incompatible version is released. Please consider pinning it to a specific version (e.g., flash_attn=="x.y.z") or a minimum compatible version (e.g., flash_attn>="x.y.z").
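For illustration, a pinned entry in `pyproject.toml` could look like the following sketch (the version numbers here are placeholders, not a recommendation of specific releases):

```toml
# Hypothetical pinned alternatives for the unversioned "flash_attn" entry:
dependencies = [
    "flash_attn==2.7.3",   # exact pin: fully reproducible builds
    # or, more permissively:
    # "flash_attn>=2.7,<3", # allow patch/minor updates within a known-good range
]
```

An exact pin maximizes reproducibility, while a bounded range trades some of that for easier security and bug-fix updates.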
Motivation
As noted in #14356, this upgrade is required to support GLM 4.6v.
Co-authored-by: @yhyang201
Modifications
Accuracy Tests
Benchmarking and Profiling
Checklist