[BugFix] Fix eagle draft_model_config and add tests #31753
heheda12345 merged 7 commits into vllm-project:main from
Conversation
Code Review
This pull request fixes a bug where updating the hf_config for an EAGLE draft model did not correctly propagate to other architecture-related fields in the draft_model_config. The changes correctly update hf_text_config, model_arch_config, _model_info, and _architecture to be consistent. A new test case is also added to verify this fix. My review includes one suggestion to refactor the implementation for better maintainability by encapsulating the re-initialization logic within the ModelConfig class, which would make the code less fragile to future changes.
```python
# EAGLEConfig primarily updates architectures, so update
# all architectures-related fields in draft_model_config
self.draft_model_config.hf_config = eagle_config
self.draft_model_config.hf_text_config = get_hf_text_config(
    self.draft_model_config.hf_config
)
self.draft_model_config.model_arch_config = (
    self.draft_model_config.get_model_arch_config()
)
model_info, arch = (
    self.draft_model_config.registry.inspect_model_cls(
        self.draft_model_config.architectures,
        self.draft_model_config,
    )
)
self.draft_model_config._model_info = model_info
self.draft_model_config._architecture = arch
```
While this correctly fixes the issue, manually re-running parts of ModelConfig.__post_init__ here makes the code fragile. If the initialization logic in ModelConfig changes in the future, this code might become outdated and introduce subtle bugs.
To improve maintainability and encapsulation, I suggest adding a private helper method to ModelConfig to reset these architecture-dependent fields. This would also avoid accessing private members like _model_info and _architecture from outside the class.
You could add the following method to vllm/config/model.py (this file is not in the PR, so it would be an expansion of scope, but would improve the codebase):
```python
# In vllm/config/model.py, class ModelConfig:
def _reset_architecture_fields(self):
    """
    Resets fields that are derived from the model architecture.
    This is useful when hf_config.architectures is modified after
    initialization.
    """
    from vllm.transformers_utils.config import get_hf_text_config

    self.hf_text_config = get_hf_text_config(self.hf_config)
    self.model_arch_config = self.get_model_arch_config()
    model_info, arch = self.registry.inspect_model_cls(
        self.architectures,
        self,
    )
    self._model_info = model_info
    self._architecture = arch
```

Then, you can simplify the code here as follows:

```python
# EAGLEConfig primarily updates architectures, so update
# all architectures-related fields in draft_model_config
self.draft_model_config.hf_config = eagle_config
self.draft_model_config._reset_architecture_fields()
```
Since `_reset_architecture_fields` would only be used by eagle draft models, I currently keep that logic inside speculative.py. Happy to change if the reviewer also agrees with gemini.
I don't think gemini's solution can solve the problem that people may forget to update this code. So I'm OK with current implementation.
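The stale-derived-field hazard discussed above can be illustrated with a small self-contained toy (not vLLM code; the class and field names here are illustrative stand-ins for `ModelConfig` and its architecture-derived fields):

```python
# Toy illustration of the bug class this PR fixes: fields derived from
# hf_config in __init__ go stale when hf_config is replaced afterwards,
# unless the derived fields are explicitly re-computed.
class ModelConfigSketch:
    def __init__(self, hf_config):
        self.hf_config = hf_config
        self._reset_architecture_fields()

    def _reset_architecture_fields(self):
        # Re-derive every field that depends on hf_config["architectures"].
        self._architecture = self.hf_config["architectures"][0]

cfg = ModelConfigSketch({"architectures": ["LlamaForCausalLM"]})
# Swapping hf_config after __init__ leaves the derived field stale...
cfg.hf_config = {"architectures": ["EagleLlamaForCausalLM"]}
assert cfg._architecture == "LlamaForCausalLM"  # stale!
# ...until the derived fields are reset:
cfg._reset_architecture_fields()
assert cfg._architecture == "EagleLlamaForCausalLM"
```

Whether the reset lives in a `ModelConfig` helper (Gemini's suggestion) or inline in speculative.py (the merged implementation), the same re-derivation must happen after `hf_config` is swapped.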
```python
# Replace hf_config for EAGLE draft_model
if self.method in ("eagle", "eagle3"):
```
does MTP have the same problem?
I only saw eagle models updating hf_config, while MTP models didn't. But if it's MTP+EAGLE, like eagle618/eagle-deepseek-v3-random, then it will have the problem.
@luccafong why don't we update the config for MTP?
@heheda12345 AFAIK, MTP reuses the same checkpoint and hf config as the target model, so there's no need for a separate hf config (which most eagle models have).
Thanks! Then I think this implementation is good.
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Head branch was pushed to by a user without write access
After updating
Another weird PretrainedConfig issue I encountered during debugging the above:
Can you help to take a look? I merged this PR and hope @charlotte12l can help to fix this in a follow-up PR.
@heheda12345 @hmellor I might have found the root cause, but I'm not sure if it's intended by huggingface: this line explicitly uses `self.__class__.model_type` (the class attribute) when serializing, so it won't use the instance attribute `model_type` (which speculative decoding often overrides).

vllm/vllm/config/speculative.py Lines 171 to 179 in 7508243
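A minimal self-contained sketch of the behavior described above (a toy stand-in, not the real `transformers.PretrainedConfig`; only the relevant line of `to_dict()` is mimicked):

```python
import copy

class PretrainedConfigSketch:
    """Toy stand-in for the relevant part of HF's PretrainedConfig.to_dict()."""
    model_type = "base"

    def to_dict(self):
        output = copy.deepcopy(self.__dict__)
        # As in the real method, the *class* attribute is written explicitly,
        # clobbering any instance-level override:
        output["model_type"] = self.__class__.model_type
        return output

cfg = PretrainedConfigSketch()
cfg.model_type = "eagle"  # instance-level override, as spec decoding does
print(cfg.model_type)               # eagle
print(cfg.to_dict()["model_type"])  # base
```

So an instance-level `model_type` override survives attribute access but is lost on serialization, which matches the weird behavior observed while debugging.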
Purpose
EAGLE models use `EAGLEConfig` to update `hf_config.architectures` in `draft_model_config`. However, `draft_model_config`'s `model_info` and resolved `architecture` are not updated. Therefore, for eagle draft models, `hf_config.architectures` is `EagleLlamaForCausalLM` but `draft_model_config.architecture` is `LlamaForCausalLM`.

vllm/vllm/config/speculative.py
Lines 398 to 406 in 0d4044e
This PR updates all architectures-related fields in `draft_model_config`.

Some fields, like `runner_type` and `convert_type`, are not updated: although they depend on `architectures`, that dependency only holds when `runner_type=auto`, and draft models use `runner_type=draft`.

Test Plan
python -m pytest tests/test_config.py::test_eagle_draft_model_config
Test Result
passed