
[Fix] Add default rope theta for qwen1 model#30369

Closed
iwzbi wants to merge 1 commit into vllm-project:main from iwzbi:fix/qwen_rope

Conversation

@iwzbi (Contributor) commented Dec 10, 2025

We need to set a default rope theta for the Qwen1 model, since its config does not define one.

Error:

[screenshot of the error]

Qwen1 config:

[screenshot of the Qwen1 config]
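The failure can be illustrated in miniature: Qwen1 checkpoints such as Qwen-1_8B-Chat ship a config without rope_theta, so code that reads config.rope_parameters["rope_theta"] unguarded raises a KeyError. Below is a hedged sketch of the kind of defaulting helper this PR adds; the helper name and the default of 10000 mirror the diff, but the config object is a plain stand-in, not vLLM's actual config class:

```python
from types import SimpleNamespace

def set_default_rope_theta(config, default_theta: float = 10000) -> None:
    """Ensure config.rope_parameters carries a rope_theta entry.

    Qwen1-style configs may lack rope_theta entirely; fall back to the
    conventional RoPE base of 10000.
    """
    if not hasattr(config, "rope_parameters"):
        config.rope_parameters = {"rope_type": "default"}
    config.rope_parameters.setdefault("rope_theta", default_theta)

# A Qwen1-like config with no rope_theta anywhere:
config = SimpleNamespace()
set_default_rope_theta(config)
print(config.rope_parameters["rope_theta"])  # 10000
```

With the default in place, later lookups of rope_parameters["rope_theta"] no longer raise.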


@iwzbi iwzbi requested a review from sighingnow as a code owner December 10, 2025 02:36
@chatgpt-codex-connector bot commented

Codex usage limits have been reached for code reviews.

@mergify mergify bot added the qwen Related to Qwen models label Dec 10, 2025
@gemini-code-assist (Contributor) gemini-code-assist bot left a comment

Code Review

This pull request aims to fix a KeyError for Qwen1 models that are missing the rope_theta parameter in their configuration. The fix involves adding a default value for rope_theta. While the change correctly addresses the issue, the implementation can be improved for efficiency and code quality by moving the function call to a more appropriate location where it is executed only once during model initialization, instead of for every layer.

@@ -149,6 +150,7 @@ def __init__(
         prefix: str = "",
     ):
         super().__init__()
+        set_default_rope_theta(config, default_theta=10000)
gemini-code-assist bot review comment (severity: high)

Calling set_default_rope_theta in QWenBlock.__init__ is inefficient as it's executed for every model layer. This function should be called only once per model initialization.

Please move this call to the beginning of QWenModel.__init__ (e.g., after line 201) to ensure it runs only once. This will improve model loading efficiency and align with best practices seen in other models in this repository.
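The reviewer's point can be illustrated with stand-in classes (the class names and layer count below are hypothetical, not vLLM's real modules): placing the defaulting call in the model constructor runs it exactly once, whereas placing it in the block constructor would run it once per layer.

```python
from types import SimpleNamespace

calls = 0  # count how often the helper runs

def set_default_rope_theta(config, default_theta=10000):
    global calls
    calls += 1
    if not hasattr(config, "rope_parameters"):
        config.rope_parameters = {"rope_type": "default"}
    config.rope_parameters.setdefault("rope_theta", default_theta)

class Block:
    def __init__(self, config):
        pass  # per-layer work only; no config mutation here

class Model:
    def __init__(self, config, num_layers=32):
        set_default_rope_theta(config)  # once per model, not per layer
        self.layers = [Block(config) for _ in range(num_layers)]

Model(SimpleNamespace(), num_layers=32)
print(calls)  # 1
```

The mutation is idempotent either way, but hoisting it out of the per-layer constructor avoids 31 redundant calls in a 32-layer model.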

Signed-off-by: root <iwzbi@zju.edu.cn>
@DarkLight1337 (Member) commented

cc @hmellor

@hmellor (Member) commented Dec 10, 2025

Could you please share the checkpoint you are using so I can try to reproduce the error?

The latest main has:

rope_theta_names = ("rope_theta", "rotary_emb_base")
rope_theta = getattr_iter(config, rope_theta_names, None)
if Version(version("transformers")) < Version("5.0.0.dev0"):
    # Transformers v4 installed, legacy config fields may be present
    if (rope_scaling := getattr(config, "rope_scaling", None)) is not None:
        config.rope_parameters = rope_scaling
    if rope_theta is not None:
        if not hasattr(config, "rope_parameters"):
            config.rope_parameters = {"rope_type": "default"}
        config.rope_parameters["rope_theta"] = rope_theta

Which means that rope_parameters should be present for the config in the screenshot.
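For reference, the v4 branch above can be exercised in isolation. With getattr_iter replaced by a trivial stand-in (vLLM's real helper lives elsewhere; this mimics its "first attribute that exists" behaviour), a config carrying only a legacy rope_theta field ends up with a populated rope_parameters dict:

```python
from types import SimpleNamespace

def getattr_iter(obj, names, default):
    # Stand-in for vLLM's helper: return the first attribute that exists.
    for name in names:
        if hasattr(obj, name):
            return getattr(obj, name)
    return default

config = SimpleNamespace(rope_theta=1000000.0)  # legacy v4-style field only

rope_theta = getattr_iter(config, ("rope_theta", "rotary_emb_base"), None)
if (rope_scaling := getattr(config, "rope_scaling", None)) is not None:
    config.rope_parameters = rope_scaling
if rope_theta is not None:
    if not hasattr(config, "rope_parameters"):
        config.rope_parameters = {"rope_type": "default"}
    config.rope_parameters["rope_theta"] = rope_theta

print(config.rope_parameters)  # {'rope_type': 'default', 'rope_theta': 1000000.0}
```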

@iwzbi (Contributor, Author) commented Dec 11, 2025

> Could you please share the checkpoint you are using so I can try to reproduce the error?

Please try this model: https://huggingface.co/Qwen/Qwen-1_8B-Chat
Thanks!

@hmellor (Member) commented Dec 11, 2025

I've confirmed that this issue is not present when using vLLM with Transformers 4.57.3. It is only present when using the Transformers main branch, which vLLM does not currently support.

The reason for the difference is that for Transformers v4 we hand-roll our own standardisation, but for v5 we use the new built-in standardisation:

rope_theta_names = ("rope_theta", "rotary_emb_base")
rope_theta = getattr_iter(config, rope_theta_names, None)
if Version(version("transformers")) < Version("5.0.0.dev0"):
    # Transformers v4 installed, legacy config fields may be present
    if (rope_scaling := getattr(config, "rope_scaling", None)) is not None:
        config.rope_parameters = rope_scaling
    if rope_theta is not None:
        if not hasattr(config, "rope_parameters"):
            config.rope_parameters = {"rope_type": "default"}
        config.rope_parameters["rope_theta"] = rope_theta
    partial_rotary_factor_names = ("partial_rotary_factor", "rotary_pct")
    partial_rotary_factor = getattr_iter(config, partial_rotary_factor_names, None)
    if partial_rotary_factor is not None:
        if not hasattr(config, "rope_parameters"):
            config.rope_parameters = {"rope_type": "default"}
        config.rope_parameters["partial_rotary_factor"] = partial_rotary_factor
elif rope_theta is not None or hasattr(config, "rope_parameters"):
    # Transformers v5 installed
    config.standardize_rope_params()
    config.validate_rope()

The v5 standardisation doesn't check the non-standard names (rotary_emb_base, rotary_pct, rotary_emb_fraction).

Since these non-standard names only appear in custom models, it doesn't make sense to check them in Transformers itself. I'll make a PR which ensures that these old custom models are forward compatible.
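One way such a forward-compatibility shim could look (a sketch under stated assumptions, not the actual fix in #30389; the function and constant names below are hypothetical): copy the non-standard attribute names onto their standard counterparts before handing the config to the v5 standardisation.

```python
from types import SimpleNamespace

# Hypothetical mapping from legacy custom-model field names to standard ones.
LEGACY_ALIASES = {
    "rotary_emb_base": "rope_theta",
    "rotary_pct": "partial_rotary_factor",
    "rotary_emb_fraction": "partial_rotary_factor",
}

def promote_legacy_rope_fields(config) -> None:
    """Copy legacy rope field names to their standard names, if unset."""
    for legacy, standard in LEGACY_ALIASES.items():
        if hasattr(config, legacy) and not hasattr(config, standard):
            setattr(config, standard, getattr(config, legacy))

# A config using only the non-standard names:
config = SimpleNamespace(rotary_emb_base=10000.0, rotary_pct=0.5)
promote_legacy_rope_fields(config)
print(config.rope_theta, config.partial_rotary_factor)  # 10000.0 0.5
```

After this pass, the standard-name lookups in both the v4 and v5 code paths succeed without the per-model workaround proposed in this PR.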

@hmellor hmellor closed this Dec 11, 2025
@hmellor (Member) commented Dec 11, 2025

This should be fixed in 69bfda0 of #30389


Labels

qwen Related to Qwen models
