
[wip] Support Transformers v5.x#14584

Closed
byjiang1996 wants to merge 13 commits into sgl-project:main from byjiang1996:transformers-v5

Conversation

@byjiang1996
Collaborator

@byjiang1996 byjiang1996 commented Dec 7, 2025

Motivation

As noted in #14356, this change is required to support GLM 4.6v.

Co-authored-by: @yhyang201

Modifications

  • pip install required packages and bump package versions
  • Always rebuild sgl-kernel's flash_attn even if flash_attn is already installed via pip: rebuilding from the flash_attn main branch is required for FA4, since FA4 has not yet been released to PyPI.
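The forced-rebuild logic described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual build code; the repository URL is the assumed upstream for flash-attention, and `flash_attn_install_cmd` is a hypothetical helper name:

```python
import importlib.metadata

# Assumed upstream repo; the PR builds flash_attn from its main branch.
FLASH_ATTN_GIT = "git+https://github.com/Dao-AILab/flash-attention.git"

def flash_attn_install_cmd(force: bool = True) -> list[str]:
    """Return a pip command that (re)installs flash_attn from source.

    force=True mirrors the PR's behavior: rebuild from the main branch even
    when a wheel is already installed, because FA4 is not yet on PyPI.
    """
    try:
        installed = importlib.metadata.version("flash_attn")
        print(f"flash_attn {installed} already installed; rebuilding from source anyway")
    except importlib.metadata.PackageNotFoundError:
        print("flash_attn not installed; building from source")
    cmd = ["pip", "install", "--no-build-isolation", FLASH_ATTN_GIT]
    if force:
        cmd.insert(2, "--force-reinstall")
    return cmd

print(flash_attn_install_cmd())
```

`--force-reinstall` makes pip ignore an already-satisfied requirement, which is what "always rebuild even if flash_attn exists in the pip" amounts to in practice.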

Accuracy Tests

Benchmarking and Profiling

Checklist

@github-actions github-actions bot added dependencies Pull requests that update a dependency file sgl-kernel labels Dec 7, 2025
@byjiang1996
Collaborator Author

/tag-and-rerun-ci

@gemini-code-assist
Contributor

Summary of Changes

Hello @byjiang1996, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request updates the project's dependency stack to support the upcoming Transformers v5.x, which is crucial for integrating with newer models like GLM 4.6v. The changes involve upgrading the transformers package to its release candidate version and introducing flash_attn as a direct dependency. To ensure compatibility with the latest Flash Attention features, specifically FA4, the build process for sgl-kernel has been adjusted to force a fresh installation of flash_attn from its main branch. Additionally, some internal configuration validation logic related to rope_config_validation has been removed to align with the updated Transformers API.

Highlights

  • Dependency Updates: Upgraded the transformers library to 5.0.0rc0 across all pyproject.toml files and added flash_attn as a new dependency.
  • Flash Attention Build Process: Modified the sgl-kernel build system to ensure a forced rebuild and proper installation of flash_attn from its main branch, specifically to support FA4 which is not yet released via pip.
  • Transformers API Adaptation: Removed references and calls to rope_config_validation in qwen3_omni.py, adapting to potential API changes in Transformers v5.x.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions github-actions bot added the run-ci label Dec 7, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the project to support transformers v5.x by bumping the version to 5.0.0rc0 and adding flash_attn as a dependency. The changes also include necessary adaptations in the codebase to accommodate breaking changes in the new transformers version, such as the removal of rope_config_validation. Additionally, the build process for sgl-kernel is modified to ensure flash_attn is built from source to include support for fa4.

My feedback focuses on improving dependency management for better reproducibility. While using a release candidate for transformers is a conscious choice for this upgrade, it's important to be aware of potential stability risks. The changes to the build system are well-justified workarounds.

"fastapi",
"flashinfer_python==0.5.3", # keep it aligned with jit-cache version in Dockerfile
"flashinfer_cubin==0.5.3",
"flash_attn", # required by 5.x transformer

medium

It's a good practice to pin dependency versions to ensure reproducible builds. The flash_attn dependency is added without a version specifier, which could lead to unexpected issues if a new, incompatible version is released. Please consider pinning it to a specific version (e.g., flash_attn=="x.y.z") or a minimum compatible version (e.g., flash_attn>="x.y.z").
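Concretely, the reviewer's suggestion would amount to an entry like the following in the pyproject.toml dependency list. The x.y.z placeholder is kept deliberately; a real pin should name a version actually validated against Transformers v5.x:

```toml
dependencies = [
    "fastapi",
    "flashinfer_python==0.5.3",  # keep it aligned with jit-cache version in Dockerfile
    "flashinfer_cubin==0.5.3",
    "flash_attn==x.y.z",  # pin once an FA4-capable release lands on PyPI
]
```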

@byjiang1996 byjiang1996 changed the title Support Transformers v5.x [wip] Support Transformers v5.x Dec 8, 2025

Labels

dependencies Pull requests that update a dependency file run-ci sgl-kernel


2 participants