[CI Failure] Fix NomicBert max_model_len validation #31662
noooop merged 7 commits into vllm-project:main
Code Review
This pull request correctly fixes a bug in the max_model_len validation for NomicBert models, particularly when context extension is used. The core of the fix involves updating the cached derived_max_model_len before validation, which is a crucial and correct change. Additionally, the PR includes a nice refactoring by removing the VllmConfig.recalculate_max_model_len method and adjusting the NomicBertModelConfig method signature to reduce dependencies. The changes are sound, improve code clarity, and I have no further comments.
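To make the failure mode described in the review concrete, here is a minimal sketch of a stale cached limit rejecting a valid request. It is not vLLM's implementation: `ToyModelConfig`, `extended_max_len`, `validate`, and `enable_context_extension` are hypothetical names chosen for illustration, and only `max_model_len` and `derived_max_model_len` echo names from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModelConfig:
    """Hypothetical stand-in for a model config; not vLLM's ModelConfig."""

    max_position_embeddings: int  # length supported without context extension
    extended_max_len: int         # length reachable with context extension (assumed field)
    max_model_len: int            # length requested by the user
    derived_max_model_len: int = field(init=False)  # cached upper bound

    def __post_init__(self) -> None:
        # The cache is initialised from the non-extended length.
        self.derived_max_model_len = self.max_position_embeddings


def validate(config: ToyModelConfig) -> None:
    """Reject a requested length that exceeds the cached upper bound."""
    if config.max_model_len > config.derived_max_model_len:
        raise ValueError(
            f"max_model_len={config.max_model_len} exceeds "
            f"derived_max_model_len={config.derived_max_model_len}"
        )


def enable_context_extension(config: ToyModelConfig, refresh_cache: bool) -> None:
    """Turn on context extension, optionally refreshing the cache first."""
    if refresh_cache:
        # The ordering the review describes: update the cached value, then validate.
        config.derived_max_model_len = config.extended_max_len
    validate(config)
```

With `refresh_cache=False`, a request for 8192 tokens against a 2048-token base length raises `ValueError` even though context extension would support it; with `refresh_cache=True`, the same request passes.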
The functionality of model_arch_config_convertor and verify_and_update_model_config is quite similar. Using verify_and_update_model_config can help avoid placing the logic for NomicBertModel in two separate locations.
Moving the discussion from #31632 (comment) to here. For context, model_arch_config was primarily for explicitly defining which fields will be used by vllm engine, so users can work with an arbitrary hf_config easily. My initial thoughts are:
For those logics that update
Appreciate any suggestions!
Unfortunately, is_causal, hf_config.method, and others were added by me. I will gradually move these fields into model_arch_config_convertor.py. I'm not sure whether to add these fields directly to ModelArchitectureConfig or to create ModelArchitectureConfigPooling.
Thank you! Let's still use
Hi @noooop, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch. For future commits, Tip Is
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Purpose
Fix NomicBert max_model_len validation.
Remove VllmConfig.recalculate_max_model_len.
It was introduced in #18755 and is only used in the NomicBert max_model_len validation. Deleting it prevents potential misuse.
Adapted from #31632
Test Plan
tests/models/language/pooling_mteb_test/test_nomic.py
Test Result
pass
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.