[CI Failure] Fix NomicBert max_model_len validation#31662

Merged
noooop merged 7 commits into vllm-project:main from noooop:fix_nomic
Jan 5, 2026

Conversation

@noooop (Collaborator) commented Jan 4, 2026

Purpose

  • Fix NomicBert max_model_len validation.

  • Remove VllmConfig.recalculate_max_model_len.

It was introduced in #18755 and is only used in the NomicBert max_model_len validation. Deleting it prevents potential misuse.

Adapted from #31632
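The shape of the fix can be sketched as follows. This is an illustrative, hypothetical sketch, not vLLM's actual API: names such as derived_max_model_len, recompute_derived_max_model_len, and verify_max_model_len are invented for the example. The point is that the derived maximum is recomputed immediately before validation, so a stale cached value cannot reject a max_model_len that context extension actually supports.

```python
# Hypothetical sketch of the validation pattern; names are illustrative,
# not vLLM's real config API.
from dataclasses import dataclass


@dataclass
class ModelConfig:
    max_position_embeddings: int
    rope_scaling_factor: float = 1.0  # > 1.0 models context extension
    derived_max_model_len: int = 0    # cached value; may be stale

    def recompute_derived_max_model_len(self) -> None:
        # Derive the effective maximum from the current config state.
        self.derived_max_model_len = int(
            self.max_position_embeddings * self.rope_scaling_factor
        )

    def verify_max_model_len(self, requested: int) -> int:
        # Refresh the cached value right before validating, instead of
        # relying on callers to invoke a separate recalculation helper.
        self.recompute_derived_max_model_len()
        if requested > self.derived_max_model_len:
            raise ValueError(
                f"max_model_len={requested} exceeds the derived maximum "
                f"{self.derived_max_model_len}"
            )
        return requested


cfg = ModelConfig(max_position_embeddings=2048, rope_scaling_factor=4.0)
print(cfg.verify_max_model_len(8192))  # 8192: fits after context extension
```

With validation refreshing the derived length itself, a standalone recalculation method (like the removed recalculate_max_model_len) has no remaining caller and can be deleted.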

Test Plan

tests/models/language/pooling_mteb_test/test_nomic.py

Test Result

pass


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly fixes a bug in the max_model_len validation for NomicBert models, particularly when context extension is used. The core of the fix is updating the cached derived_max_model_len before validation, which is a crucial and correct change. The PR also includes a nice refactoring: removing the VllmConfig.recalculate_max_model_len method and adjusting the NomicBertModelConfig method signature to reduce dependencies. The changes are sound and improve code clarity; I have no further comments.

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@noooop (Collaborator, Author) commented Jan 4, 2026

Hi @charlotte12l,

The functionality of model_arch_config_convertor and verify_and_update_model_config is quite similar. Using verify_and_update_model_config helps avoid placing the NomicBertModel logic in two separate locations.

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@noooop (Collaborator, Author) commented Jan 4, 2026

Hi @charlotte12l,

Now model_arch_config_convertor and verify_and_update_model_config have the same functionality.

Do you have a good way to combine them?

verify_and_update_model_config was introduced in #31131.
VerifyAndUpdateConfig was introduced in #20012.

@noooop noooop added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 4, 2026
@noooop noooop requested a review from DarkLight1337 January 4, 2026 04:40
@charlotte12l (Contributor) commented Jan 4, 2026

Moving the discussion from #31632 (comment) to here.

For context, model_arch_config was primarily for explicitly defining which fields will be used by the vLLM engine, so users can work with an arbitrary hf_config easily.
If I understand correctly, verify_and_update_config is for vllm_config and includes cache_config, parallel_config, etc.

My initial thoughts are:

  • We could mainly focus on consolidating the update/verify of hf_config, while leaving the other parts unchanged.

For the logic that updates hf_config:

  • move the updates for fields that are used by the vLLM engine (e.g. hidden_size, num_hidden_layers, max_len, rope-related fields; note that I haven't included those in model_arch_config yet, but that is planned) into model_arch_config_convertor
  • for field updates that only affect the model definition (e.g. layer_norm_eps, via config.layer_norm_eps = config.layer_norm_epsilon), we could just let the corresponding model definition read hf_config.layer_norm_epsilon. Noob question: why do we need to update hf_config.layer_norm_eps in the first place? Is something in the vLLM engine checking layer_norm_eps?
  • For some tricky fields like method (model_config.hf_config.method = "from_2_way_softmax") and is_causal (hf_config.is_causal = False), I need to take a deeper look at the design.

Appreciate any suggestions!
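The per-architecture hook pattern under discussion can be sketched roughly as follows. All names here are hypothetical illustrations, not vLLM's actual classes: the sketch just shows a hook that normalizes hf_config fields in place, aliasing layer_norm_epsilon to the layer_norm_eps name the model code reads, and pinning pooling-model behavior like is_causal.

```python
# Hypothetical sketch of a verify-and-update hook; class and field names
# are illustrative, not vLLM's real implementation.
from types import SimpleNamespace


class NomicBertVerifyAndUpdate:
    @staticmethod
    def verify_and_update_config(hf_config) -> None:
        # Alias fields so downstream model code sees one canonical name.
        if not hasattr(hf_config, "layer_norm_eps"):
            hf_config.layer_norm_eps = hf_config.layer_norm_epsilon
        # Pooling models run as bidirectional encoders.
        hf_config.is_causal = False


# An arbitrary hf_config that only provides the HF-style spelling.
hf_config = SimpleNamespace(layer_norm_epsilon=1e-12)
NomicBertVerifyAndUpdate.verify_and_update_config(hf_config)
print(hf_config.layer_norm_eps, hf_config.is_causal)  # 1e-12 False
```

One design consequence worth noting: if the model definition read hf_config.layer_norm_epsilon directly (as suggested above), the aliasing line would become unnecessary and the hook would shrink to only the engine-relevant updates.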

@noooop (Collaborator, Author) commented Jan 4, 2026

> Moving the discussion from #31632 (comment) to here.
>
> For context, model_arch_config was primarily for explicitly defining which fields will be used by the vLLM engine, so users can work with an arbitrary hf_config easily. If I understand correctly, verify_and_update_config is for vllm_config and includes cache_config, parallel_config, etc.
>
> My initial thoughts are:
>
>   • We could mainly focus on consolidating the update/verify of hf_config, while leaving the other parts unchanged.
>
> For the logic that updates hf_config:
>
>   • move the updates for fields that are used by the vLLM engine (e.g. hidden_size, num_hidden_layers, max_len, rope-related fields; note that I haven't included those in model_arch_config yet, but that is planned) into model_arch_config_convertor
>   • for field updates that only affect the model definition (e.g. layer_norm_eps, via config.layer_norm_eps = config.layer_norm_epsilon), we could just let the corresponding model definition read hf_config.layer_norm_epsilon. Noob question: why do we need to update hf_config.layer_norm_eps in the first place? Is something in the vLLM engine checking layer_norm_eps?
>   • For some tricky fields like method (model_config.hf_config.method = "from_2_way_softmax") and is_causal (hf_config.is_causal = False), I need to take a deeper look at the design.
>
> Appreciate any suggestions!

Unfortunately, is_causal, hf_config.method, and the others were added by me...

I will gradually move these fields into model_arch_config_convertor.py.


I'm not sure whether to directly add these fields to ModelArchitectureConfig or to create ModelArchitectureConfigPooling.

@charlotte12l (Contributor) commented
Thank you! Let's still use ModelArchitectureConfig, but maybe put all pooling-related fields in a class. Which parts of hf_config are important for the vLLM engine if it's a pooling model: is_causal, hidden_act, pooling? I didn't find any usage of method, though.
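One possible shape for this suggestion is sketched below. This is a hypothetical illustration, not the actual vLLM definition: the class and field names (PoolingArchConfig, max_len, etc.) are invented for the example. The idea is to keep a single ModelArchitectureConfig and group the pooling-only fields in a nested dataclass that stays None for generative models.

```python
# Hypothetical sketch: one ModelArchitectureConfig with pooling-only
# fields grouped in a nested dataclass. Names are illustrative.
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoolingArchConfig:
    is_causal: bool = False
    hidden_act: str = "gelu"
    pooling: str = "mean"


@dataclass
class ModelArchitectureConfig:
    hidden_size: int
    num_hidden_layers: int
    max_len: int
    # Set only for pooling models; None means a generative model.
    pooling: Optional[PoolingArchConfig] = None


arch = ModelArchitectureConfig(
    hidden_size=768,
    num_hidden_layers=12,
    max_len=8192,
    pooling=PoolingArchConfig(is_causal=False),
)
print(arch.pooling.is_causal)  # False
```

Grouping the fields this way keeps the generative-model path untouched while giving pooling fields like is_causal a single, typed home.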

noooop added 2 commits January 4, 2026 15:00
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@noooop noooop removed the ready ONLY add when PR is ready to merge/full CI is needed label Jan 4, 2026
@mergify bot commented Jan 4, 2026

Hi @noooop, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@AndreasKaratzas (Collaborator) commented

cc @DarkLight1337

@noooop noooop added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 5, 2026
@noooop noooop merged commit 8be6432 into vllm-project:main Jan 5, 2026
57 checks passed
@noooop noooop deleted the fix_nomic branch January 5, 2026 03:06
LucasWilkinson pushed a commit to neuralmagic/vllm that referenced this pull request Jan 6, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

Labels

ready ONLY add when PR is ready to merge/full CI is needed


4 participants