[CI Failure] Fix NomicBert max_model_len validation#31662

Merged
noooop merged 7 commits into vllm-project:main from noooop:fix_nomic
Jan 5, 2026

Conversation

@noooop (Collaborator) commented Jan 4, 2026

Purpose

  • Fix NomicBert max_model_len validation.

  • Remove VllmConfig.recalculate_max_model_len.

It was introduced in #18755 and is only used in the NomicBert max_model_len validation. Deleting it prevents potential misuse.

Adapted from #31632
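The shape of the fix can be sketched as follows. This is an illustrative, hypothetical sketch, not vLLM's actual API: names such as derived_max_model_len, recompute_derived_max_model_len, and verify_max_model_len are invented for the example. The point is that the derived maximum is recomputed immediately before validation, so a stale cached value cannot reject a max_model_len that context extension actually supports.

```python
# Hypothetical sketch of the validation pattern; names are illustrative,
# not vLLM's real config API.
from dataclasses import dataclass


@dataclass
class ModelConfig:
    max_position_embeddings: int
    rope_scaling_factor: float = 1.0  # > 1.0 models context extension
    derived_max_model_len: int = 0    # cached value; may be stale

    def recompute_derived_max_model_len(self) -> None:
        # Derive the effective maximum from the current config state.
        self.derived_max_model_len = int(
            self.max_position_embeddings * self.rope_scaling_factor
        )

    def verify_max_model_len(self, requested: int) -> int:
        # Refresh the cached value right before validating, instead of
        # relying on callers to invoke a separate recalculation helper.
        self.recompute_derived_max_model_len()
        if requested > self.derived_max_model_len:
            raise ValueError(
                f"max_model_len={requested} exceeds the derived maximum "
                f"{self.derived_max_model_len}"
            )
        return requested


cfg = ModelConfig(max_position_embeddings=2048, rope_scaling_factor=4.0)
print(cfg.verify_max_model_len(8192))  # 8192: fits after context extension
```

With validation refreshing the derived length itself, a standalone recalculation method (like the removed recalculate_max_model_len) has no remaining caller and can be deleted.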

Test Plan

tests/models/language/pooling_mteb_test/test_nomic.py

Test Result

pass


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly fixes a bug in the max_model_len validation for NomicBert models, particularly when context extension is used. The core of the fix is updating the cached derived_max_model_len before validation, which is a crucial and correct change. The PR also includes a nice refactoring: removing the VllmConfig.recalculate_max_model_len method and adjusting the NomicBertModelConfig method signature to reduce dependencies. The changes are sound and improve code clarity; I have no further comments.

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@noooop (Collaborator, Author) commented Jan 4, 2026

Hi @charlotte12l,

The functionality of model_arch_config_convertor and verify_and_update_model_config is quite similar. Using verify_and_update_model_config helps avoid placing the NomicBertModel logic in two separate locations.

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@noooop (Collaborator, Author) commented Jan 4, 2026

Hi @charlotte12l,

Now model_arch_config_convertor and verify_and_update_model_config have the same functionality.

Do you have a good way to combine them?

verify_and_update_model_config was introduced in #31131.
VerifyAndUpdateConfig was introduced in #20012.

@noooop noooop added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 4, 2026
@noooop noooop requested a review from DarkLight1337 January 4, 2026 04:40
@charlotte12l (Contributor) commented Jan 4, 2026

Moving the discussion from #31632 (comment) to here.

For context, model_arch_config was primarily for explicitly defining which fields will be used by the vLLM engine, so users can work with an arbitrary hf_config easily.
If I understand correctly, verify_and_update_config is for vllm_config and includes cache_config, parallel_config, etc.

My initial thoughts are:

  • We could mainly focus on consolidating the update/verify of hf_config, while leaving the other parts unchanged.

For the logic that updates hf_config:

  • move the updates for fields that are used by the vLLM engine (e.g. hidden_size, num_hidden_layers, max_len, rope-related fields; note that I haven't included those in model_arch_config yet, but that is planned) into model_arch_config_convertor
  • for field updates that only affect the model definition (e.g. layer_norm_eps, via config.layer_norm_eps = config.layer_norm_epsilon), we could just let the corresponding model definition read hf_config.layer_norm_epsilon. Noob question: why do we need to update hf_config.layer_norm_eps in the first place? Is something in the vLLM engine checking layer_norm_eps?
  • For some tricky fields like method (model_config.hf_config.method = "from_2_way_softmax") and is_causal (hf_config.is_causal = False), I need to take a deeper look at the design.

Appreciate any suggestions!
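The per-architecture hook pattern under discussion can be sketched roughly as follows. All names here are hypothetical illustrations, not vLLM's actual classes: the sketch just shows a hook that normalizes hf_config fields in place, aliasing layer_norm_epsilon to the layer_norm_eps name the model code reads, and pinning pooling-model behavior like is_causal.

```python
# Hypothetical sketch of a verify-and-update hook; class and field names
# are illustrative, not vLLM's real implementation.
from types import SimpleNamespace


class NomicBertVerifyAndUpdate:
    @staticmethod
    def verify_and_update_config(hf_config) -> None:
        # Alias fields so downstream model code sees one canonical name.
        if not hasattr(hf_config, "layer_norm_eps"):
            hf_config.layer_norm_eps = hf_config.layer_norm_epsilon
        # Pooling models run as bidirectional encoders.
        hf_config.is_causal = False


# An arbitrary hf_config that only provides the HF-style spelling.
hf_config = SimpleNamespace(layer_norm_epsilon=1e-12)
NomicBertVerifyAndUpdate.verify_and_update_config(hf_config)
print(hf_config.layer_norm_eps, hf_config.is_causal)  # 1e-12 False
```

One design consequence worth noting: if the model definition read hf_config.layer_norm_epsilon directly (as suggested above), the aliasing line would become unnecessary and the hook would shrink to only the engine-relevant updates.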

@noooop (Collaborator, Author) commented Jan 4, 2026

> Moving the discussion from #31632 (comment) to here.
>
> For context, model_arch_config was primarily for explicitly defining which fields will be used by the vLLM engine, so users can work with an arbitrary hf_config easily. If I understand correctly, verify_and_update_config is for vllm_config and includes cache_config, parallel_config, etc.
>
> My initial thoughts are:
>
>   • We could mainly focus on consolidating the update/verify of hf_config, while leaving the other parts unchanged.
>
> For the logic that updates hf_config:
>
>   • move the updates for fields that are used by the vLLM engine (e.g. hidden_size, num_hidden_layers, max_len, rope-related fields; note that I haven't included those in model_arch_config yet, but that is planned) into model_arch_config_convertor
>   • for field updates that only affect the model definition (e.g. layer_norm_eps, via config.layer_norm_eps = config.layer_norm_epsilon), we could just let the corresponding model definition read hf_config.layer_norm_epsilon. Noob question: why do we need to update hf_config.layer_norm_eps in the first place? Is something in the vLLM engine checking layer_norm_eps?
>   • For some tricky fields like method (model_config.hf_config.method = "from_2_way_softmax") and is_causal (hf_config.is_causal = False), I need to take a deeper look at the design.
>
> Appreciate any suggestions!

Unfortunately, is_causal, hf_config.method, and the others were added by me...

I will gradually move these fields into model_arch_config_convertor.py.


I'm not sure whether to directly add these fields to ModelArchitectureConfig or to create ModelArchitectureConfigPooling.

@charlotte12l (Contributor) commented
Thank you! Let's still use ModelArchitectureConfig, but maybe put all pooling-related fields in a class. Which parts of hf_config are important for the vLLM engine if it's a pooling model: is_causal, hidden_act, pooling? I didn't find any usage of method, though.
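One possible shape for this suggestion is sketched below. This is a hypothetical illustration, not the actual vLLM definition: the class and field names (PoolingArchConfig, max_len, etc.) are invented for the example. The idea is to keep a single ModelArchitectureConfig and group the pooling-only fields in a nested dataclass that stays None for generative models.

```python
# Hypothetical sketch: one ModelArchitectureConfig with pooling-only
# fields grouped in a nested dataclass. Names are illustrative.
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoolingArchConfig:
    is_causal: bool = False
    hidden_act: str = "gelu"
    pooling: str = "mean"


@dataclass
class ModelArchitectureConfig:
    hidden_size: int
    num_hidden_layers: int
    max_len: int
    # Set only for pooling models; None means a generative model.
    pooling: Optional[PoolingArchConfig] = None


arch = ModelArchitectureConfig(
    hidden_size=768,
    num_hidden_layers=12,
    max_len=8192,
    pooling=PoolingArchConfig(is_causal=False),
)
print(arch.pooling.is_causal)  # False
```

Grouping the fields this way keeps the generative-model path untouched while giving pooling fields like is_causal a single, typed home.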

noooop added 2 commits January 4, 2026 15:00
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@noooop noooop removed the ready ONLY add when PR is ready to merge/full CI is needed label Jan 4, 2026
@mergify bot commented Jan 4, 2026

Hi @noooop, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
@AndreasKaratzas (Collaborator) commented

cc @DarkLight1337

@noooop noooop added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 5, 2026
@noooop noooop merged commit 8be6432 into vllm-project:main Jan 5, 2026
57 checks passed
@noooop noooop deleted the fix_nomic branch January 5, 2026 03:06
LucasWilkinson pushed a commit to neuralmagic/vllm that referenced this pull request Jan 6, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

Labels

ready ONLY add when PR is ready to merge/full CI is needed


4 participants