[Bugifx] qwen3 rope parameter compatibility#20931
[Bugifx] qwen3 rope parameter compatibility#20931Fridge003 merged 3 commits intosgl-project:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the compatibility of SGLang with Qwen3 MoE models by addressing inconsistencies in how RoPE (Rotary Positional Embeddings) parameters are defined in model configurations. The changes ensure that the system can correctly parse and apply RoPE settings from various Qwen3 checkpoint formats, thereby preventing loading errors and improving the overall robustness of model integration. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a fix for RoPE parameter compatibility in Qwen3 models, allowing them to load from different checkpoint formats. The logic correctly handles both newer rope_parameters and older rope_theta/rope_scaling configurations. My review identified a critical issue where the calculated rope_theta is not stored as an instance variable, which would lead to a runtime error in apply_qk_norm_rope. I have provided a code suggestion to fix this.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
Use sglang/python/sglang/srt/utils/hf_transformers_utils.py Lines 133 to 144 in 4c52b7f |
|
/tag-run-ci-label |
lviy
left a comment
There was a problem hiding this comment.
Thanks for your update, it looks good!
|
/tag-and-rerun-ci |
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Motivation
To fix #20932
SGLang fails to load some Qwen3 MoE checkpoints whose HF config uses top-level
rope_theta(v4-style) but does not definerope_parameters(v5-style).Modifications
This PR updated qwen3moe.py to make RoPE config parsing compatible across checkpoint formats: it now checks whether RoPE is provided as config.rope_parameters (new style) or as top-level config.rope_theta/config.rope_scaling (old style), and uses the available field accordingly. This prevents load failures caused by different RoPE parameter naming in Qwen3 configs.
Accuracy Tests
Benchmarking and Profiling
Checklist
Review Process
/tag-run-ci-label,/rerun-failed-ci,/tag-and-rerun-ci