
fix: use get_rope_config() to support models without rope_parameters#21135

Merged
Fridge003 merged 10 commits into sgl-project:main from alphabetc1:fix/missing_repo_parameters
Mar 26, 2026
Conversation

Collaborator

@alphabetc1 alphabetc1 commented Mar 22, 2026

Motivation

PR #17784 ("Upgrade transformers==5.3.0") batch-replaced `getattr(config, "rope_theta", ...)` / `getattr(config, "rope_scaling", ...)` with direct `config.rope_parameters["rope_theta"]` access.
This works for built-in HuggingFace config classes (e.g. LlamaConfig), where `rope_parameters` is defined in `__init__`, but raises `AttributeError` for trust-remote-code models like GLM-4-MoE, whose custom config classes never define it.

Affected trust-remote-code models:
glm4, glm4_moe, ernie4, exaone, hunyuan, minicpm, minicpm3, xverse, xverse_moe, baichuan, orion, deepseek, qwen, grok, solar, iquest_loopcoder, llada2, step3_vl
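The failure mode, and the kind of fallback a `get_rope_config()` helper provides, can be sketched as follows. This is a minimal illustration, not the actual sglang implementation:

```python
# Minimal sketch of the breakage and a defensive helper. This
# get_rope_config is illustrative only, not the exact sglang code.

def get_rope_config(config):
    """Return (rope_theta, rope_scaling) regardless of whether the
    config defines the transformers>=5 ``rope_parameters`` dict."""
    rope_parameters = getattr(config, "rope_parameters", None)
    if rope_parameters is not None:
        # transformers >= 5 built-in configs expose a unified dict
        return rope_parameters.get("rope_theta", 10000), rope_parameters
    # trust-remote-code configs (e.g. GLM-4-MoE) may only define the
    # legacy flat attributes, or neither attribute at all
    return (
        getattr(config, "rope_theta", 10000),
        getattr(config, "rope_scaling", None),
    )


class RemoteCodeConfig:
    """Stands in for a custom config without rope_parameters."""
    rope_theta = 500000.0


theta, scaling = get_rope_config(RemoteCodeConfig())
# Direct config.rope_parameters["rope_theta"] would raise AttributeError here.
print(theta, scaling)  # 500000.0 None
```

The same call also handles new-style configs (the dict is returned as-is) and configs that define neither attribute (the standard defaults apply).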

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a compatibility issue that arose after a recent transformers library upgrade, specifically affecting trust-remote-code models like GLM-4-MoE. The core problem was an AttributeError when accessing rope_parameters directly from custom model configurations. The solution involves refactoring the parameter retrieval mechanism to be more resilient, ensuring that rope_theta and rope_scaling are obtained safely, thus preventing model loading failures.

Highlights

  • Fix Glm4MoeConfig compatibility: Resolved an AttributeError in Glm4MoeConfig when loading trust-remote-code models, which occurred because a previous update (PR Upgrade transformers==5.3.0 #17784) assumed rope_parameters would always be directly available on the config object.
  • Robust RoPE parameter retrieval: Implemented a more robust method for retrieving rope_theta and rope_scaling by utilizing the get_rope_config utility function, ensuring compatibility with custom PretrainedConfig subclasses that may not define rope_parameters directly.
  • Updated partial_rotary_factor logic: Adjusted the logic for determining partial_rotary_factor to correctly use the rope_scaling dictionary returned by the new utility function, falling back to the config's direct attribute if necessary.



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly addresses an AttributeError for trust-remote-code models by introducing a helper function, get_rope_config, to robustly access RoPE parameters. The changes improve compatibility with different Hugging Face transformers versions. I have one suggestion to make the logic for retrieving partial_rotary_factor more robust against falsy values like 0.
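The reviewer's point about falsy values can be illustrated with a short sketch (the key name follows the PR description; the real code differs):

```python
# A rope_scaling dict whose partial_rotary_factor is legitimately 0.
rope_scaling = {"partial_rotary_factor": 0}

# Fragile: `or` treats 0 as "missing" and silently substitutes the default.
factor_fragile = rope_scaling.get("partial_rotary_factor") or 1.0

# Robust: only fall back when the key is genuinely absent.
factor_robust = rope_scaling.get("partial_rotary_factor", 1.0)

print(factor_fragile, factor_robust)  # 1.0 0
```

In short, `x or default` conflates "absent" with "present but falsy", while `dict.get(key, default)` only substitutes the default when the key is truly missing.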

@alphabetc1 force-pushed the fix/missing_repo_parameters branch from ae0f0cf to b584dd1 on March 22, 2026 at 15:41
@alphabetc1 changed the title from "fix: Glm4MoeConfig missing rope_parameters for trust-remote-code models" to "fix: use get_rope_config() to support models without rope_parameters" on Mar 22, 2026
@alphabetc1 force-pushed the fix/missing_repo_parameters branch from b584dd1 to 0a1d7a2 on March 22, 2026 at 16:12
@alphabetc1
Collaborator Author

/tag-and-rerun-ci

@JustinTong0323
Collaborator

/rerun-failed-ci

1 similar comment
@alphabetc1
Collaborator Author

/rerun-failed-ci

@alphabetc1
Collaborator Author

/rerun-failed-ci

@Fridge003
Collaborator

@alphabetc1 Let's track this run https://github.com/sgl-project/sglang/pull/21135/checks?sha=f1c0883c37bbc50bd26725e6355160c5e748714f

@Fridge003 Fridge003 merged commit 646573e into sgl-project:main Mar 26, 2026
62 of 78 checks passed
@alphabetc1 alphabetc1 deleted the fix/missing_repo_parameters branch March 27, 2026 03:12
michaelzhang-ai added a commit that referenced this pull request Mar 27, 2026
The `get_rope_config()` helper crashes with `AttributeError:
'Grok1Config' object has no attribute 'rope_theta'` because the
Grok-1 HuggingFace config does not define `rope_theta` (it relies
on the standard default of 10000).

This was introduced in #21135 which migrated grok.py from a safe
`getattr(config, "rope_theta", 10000)` to `get_rope_config(config)`,
but the v4 fallback path in `get_rope_config()` accesses
`config.rope_theta` directly without `getattr`.

Fix:
- Use `getattr(config, "rope_theta", None)` in `get_rope_config()`
  so it never crashes on configs missing the attribute
- Restore the Grok-1 default of 10000 in `grok.py` when rope_theta
  is None

Fixes AMD nightly failures: nightly-8-gpu-grok1-int4 and
nightly-8-gpu-mi35x-grok1-int4 (both exit code 255).
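The two-part fix described in the commit message can be sketched as follows (hypothetical minimal versions of the touched code paths, not the actual sglang sources):

```python
def get_rope_config(config):
    """v4 fallback path, patched to tolerate configs without rope_theta."""
    # Before the fix this read `config.rope_theta` directly, which
    # raises AttributeError on Grok1Config.
    rope_theta = getattr(config, "rope_theta", None)
    rope_scaling = getattr(config, "rope_scaling", None)
    return rope_theta, rope_scaling


class Grok1Config:
    """Grok-1's HF config defines no rope_theta attribute at all."""


theta, _ = get_rope_config(Grok1Config())
# grok.py then restores the standard RoPE default when the attribute is absent:
rope_theta = theta if theta is not None else 10000
print(rope_theta)  # 10000
```

Returning `None` from the helper (rather than baking in a default) lets each caller, such as grok.py here, pick the default appropriate for its model family.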
michaelzhang-ai added a commit that referenced this pull request Mar 27, 2026
The Grok-1 HuggingFace config does not define `rope_theta` (it relies
on the standard default of 10000). After #21135 migrated grok.py from
`getattr(config, "rope_theta", 10000)` to `get_rope_config(config)`,
loading Grok-1 crashes with:

  AttributeError: 'Grok1Config' object has no attribute 'rope_theta'

Fix by replacing the `get_rope_config()` call in grok.py with local
rope_theta extraction that safely falls back to 10000, matching the
original behavior before #21135.

Fixes AMD nightly failures: nightly-8-gpu-grok1-int4 and
nightly-8-gpu-mi35x-grok1-int4 (both exit code 255).