[Core] Consolidate RoPE-related parsing into ModelArchitectureConfig #32989
charlotte12l wants to merge 8 commits into vllm-project:main
Conversation
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Code Review
This pull request is a well-executed refactoring that consolidates RoPE-related configuration parsing into the ModelArchitectureConfig. By moving this logic from vllm/config/model.py to vllm/transformers_utils/model_arch_config_convertor.py, it centralizes all RoPE-related decisions, improving maintainability and reducing direct access to HuggingFace configs in model.py. A key improvement is the correction of the application order for RoPE scaling and context length caps, which resolves a potential bug where hard architectural limits like sliding_window were incorrectly scaled. The changes are clean, logical, and enhance the robustness of the configuration handling. The introduction of the MaxModelLenInfo named tuple provides a clear and well-documented interface for maximum length information. Overall, this is a high-quality contribution.
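The review above mentions a `MaxModelLenInfo` named tuple as the new interface for maximum-length information, but its actual fields aren't shown in this thread. The following is only an illustrative sketch; every field name is an assumption:

```python
from typing import NamedTuple, Optional

class MaxModelLenInfo(NamedTuple):
    """Hypothetical shape of the max-length info returned by the convertor.

    Field names are illustrative assumptions, not the PR's actual API.
    """
    derived_max_model_len: int            # length derived from the HF config
    rope_scaling_factor: float = 1.0      # applied before hard caps
    sliding_window: Optional[int] = None  # hard architectural cap, never scaled

info = MaxModelLenInfo(derived_max_model_len=4096,
                       rope_scaling_factor=8.0,
                       sliding_window=2048)
print(info.sliding_window)  # 2048
```

A named tuple keeps the interface immutable and self-documenting, which matches the review's point about a "clear and well-documented interface".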
This pull request has merge conflicts that must be resolved before it can be merged.
Purpose
We are introducing model_arch_config (#28454), which defines explicitly what information the vLLM engine needs from the HuggingFace config / user-defined config, so we can avoid passing hf_config / getattr(hf_config, xxx) around in the engine.
Before this PR, _get_and_verify_max_len in model.py directly accessed multiple HuggingFace config fields. By moving this logic into the convertor, we centralize those accesses in one place.
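A minimal sketch of the before/after pattern described here. Only `ModelArchitectureConfig` and the `getattr(hf_config, ...)` idiom come from this PR; the function names and field set below are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Any, Optional

def read_fields_scattered(hf_config: Any):
    # "Before" pattern: each call site reads raw HF config fields itself
    rope_scaling = getattr(hf_config, "rope_scaling", None)
    max_pos = getattr(hf_config, "max_position_embeddings", None)
    return rope_scaling, max_pos

@dataclass
class ModelArchitectureConfig:
    # Simplified stand-in; the real class lives in vLLM and has more fields
    rope_scaling: Optional[dict]
    max_position_embeddings: Optional[int]

def convert(hf_config: Any) -> ModelArchitectureConfig:
    # "After" pattern: the convertor parses the HF config once,
    # and the rest of the engine only sees the typed config object
    return ModelArchitectureConfig(
        rope_scaling=getattr(hf_config, "rope_scaling", None),
        max_position_embeddings=getattr(
            hf_config, "max_position_embeddings", None),
    )
```

With the second pattern, a missing or renamed HF field only has to be handled in the convertor, not at every call site.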
Behavior Differences/ Potential Bug Fix
Order of RoPE Scaling vs Caps
Before: RoPE scaling was applied AFTER sliding_window and tokenizer caps
After: RoPE scaling is applied BEFORE caps
Rationale: The new order is more correct. sliding_window is a hard architectural limit that shouldn't be scaled up. The old behavior could result in incorrect values like sliding_window(2048) * factor(8) = 16384.
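The ordering difference can be sketched with the toy numbers above (the helper functions are hypothetical illustrations, not vLLM code):

```python
def old_order(max_pos: int, sliding_window: int, factor: int) -> int:
    # Old behavior: caps applied first, then RoPE scaling,
    # so the hard architectural cap gets scaled up too
    capped = min(max_pos, sliding_window)
    return capped * factor

def new_order(max_pos: int, sliding_window: int, factor: int) -> int:
    # New behavior: RoPE scaling first, then hard caps,
    # so sliding_window remains a hard limit
    scaled = max_pos * factor
    return min(scaled, sliding_window)

# sliding_window=2048, rope factor=8, base max_position_embeddings=2048
print(old_order(2048, 2048, 8))  # 16384 (exceeds the sliding window)
print(new_order(2048, 2048, 8))  # 2048  (window cap is respected)
```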
Test Plan
Select several models, record the ground-truth max_model_len before this PR, and compare whether the values after this PR are the same.

Test Result
passed