
[BugFix] Fix eagle draft_model_config and add tests #31753

Merged
heheda12345 merged 7 commits into vllm-project:main from charlotte12l:fix_eagle
Jan 13, 2026
Conversation

@charlotte12l (Contributor) commented Jan 5, 2026

Purpose

EAGLE models use EAGLEConfig to update the hf_config architectures in draft_model_config. However, draft_model_config's model_info and resolved architecture are not updated. Therefore, for eagle draft models, hf_config.architectures is EagleLlamaForCausalLM but draft_model_config.architecture is LlamaForCausalLM.

eagle_config = EAGLEConfig(
    self.draft_model_config.hf_config,
    method=self.method,
    model_type="eagle",
)
self.draft_model_config.hf_config = eagle_config
self.draft_model_config.model_arch_config = (
    self.draft_model_config.get_model_arch_config()
)

This PR updates all architectures-related fields in draft_model_config.
Some fields, such as runner_type and convert_type, are deliberately not updated: although they depend on architectures, that dependency only holds when runner_type=auto, and draft models use runner_type=draft.
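The inconsistency described above can be sketched with a minimal, self-contained example (the classes below are hypothetical stand-ins for vLLM's config objects, not the real implementation): a field derived from hf_config.architectures is resolved once at init time, so swapping in a new hf_config afterwards leaves the derived field stale.

```python
class HFConfig:
    """Hypothetical stand-in for a HuggingFace config."""
    def __init__(self, architectures):
        self.architectures = architectures

class DraftModelConfig:
    """Hypothetical stand-in for vLLM's ModelConfig."""
    def __init__(self, hf_config):
        self.hf_config = hf_config
        # Derived field resolved once during initialization.
        self.architecture = hf_config.architectures[0]

cfg = DraftModelConfig(HFConfig(["LlamaForCausalLM"]))
# The EAGLE path later replaces hf_config with one that renames the arch...
cfg.hf_config = HFConfig(["EagleLlamaForCausalLM"])
# ...but the derived field was not refreshed:
print(cfg.hf_config.architectures[0])  # EagleLlamaForCausalLM
print(cfg.architecture)                # LlamaForCausalLM (stale)
```

The fix in this PR is, in effect, to re-derive every such cached field whenever hf_config is replaced.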

Test Plan

python -m pytest tests/test_config.py::test_eagle_draft_model_config

Test Result

passed


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
@charlotte12l changed the title from [BugFix] Fix eagle draft_model_config to [BugFix] Fix eagle draft_model_config and add tests on Jan 5, 2026
@charlotte12l charlotte12l marked this pull request as ready for review January 5, 2026 23:10
@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request fixes a bug where updating the hf_config for an EAGLE draft model did not correctly propagate to other architecture-related fields in the draft_model_config. The changes correctly update hf_text_config, model_arch_config, _model_info, and _architecture to be consistent. A new test case is also added to verify this fix. My review includes one suggestion to refactor the implementation for better maintainability by encapsulating the re-initialization logic within the ModelConfig class, which would make the code less fragile to future changes.

Comment on lines +404 to +420
# EAGLEConfig primarily updates architectures, so update
# all architectures-related fields in draft_model_config
self.draft_model_config.hf_config = eagle_config
self.draft_model_config.hf_text_config = get_hf_text_config(
    self.draft_model_config.hf_config
)
self.draft_model_config.model_arch_config = (
    self.draft_model_config.get_model_arch_config()
)
model_info, arch = (
    self.draft_model_config.registry.inspect_model_cls(
        self.draft_model_config.architectures,
        self.draft_model_config,
    )
)
self.draft_model_config._model_info = model_info
self.draft_model_config._architecture = arch
gemini-code-assist bot (Contributor):

high

While this correctly fixes the issue, manually re-running parts of ModelConfig.__post_init__ here makes the code fragile. If the initialization logic in ModelConfig changes in the future, this code might become outdated and introduce subtle bugs.

To improve maintainability and encapsulation, I suggest adding a private helper method to ModelConfig to reset these architecture-dependent fields. This would also avoid accessing private members like _model_info and _architecture from outside the class.

You could add the following method to vllm/config/model.py (this file is not in the PR, so it would be an expansion of scope, but would improve the codebase):

# In vllm/config/model.py, class ModelConfig:
def _reset_architecture_fields(self):
    """
    Resets fields that are derived from the model architecture.
    This is useful when hf_config.architectures is modified after
    initialization.
    """
    from vllm.transformers_utils.config import get_hf_text_config
    self.hf_text_config = get_hf_text_config(self.hf_config)
    self.model_arch_config = self.get_model_arch_config()
    model_info, arch = self.registry.inspect_model_cls(
        self.architectures,
        self,
    )
    self._model_info = model_info
    self._architecture = arch

Then, you can simplify the code here as follows:

# EAGLEConfig primarily updates architectures, so update
# all architectures-related fields in draft_model_config
self.draft_model_config.hf_config = eagle_config
self.draft_model_config._reset_architecture_fields()

charlotte12l (Contributor Author):

Since _reset_architecture_fields would only be used by eagle draft models, I currently keep this logic inside speculative.py. Happy to change if reviewers also agree with gemini.

Collaborator:

I don't think gemini's solution solves the problem that people may forget to update this code, so I'm OK with the current implementation.

)

# Replace hf_config for EAGLE draft_model
if self.method in ("eagle", "eagle3"):
Collaborator:

does MTP have the same problem?

charlotte12l (Contributor Author):

I only saw eagle models updating hf_config; MTP models didn't. But if it's MTP+EAGLE, like eagle618/eagle-deepseek-v3-random, then it will have the problem.

Collaborator:

@luccafong why don't we update the config for MTP?

@luccafong (Collaborator) commented Jan 7, 2026:

@heheda12345 AFAIK, since MTP reuses the same checkpoint and hf config as the target model, there's no need for a separate hf config (which most eagle models have).

Collaborator:

Thanks! Then I think this implementation is good.

heheda12345 approved these changes Jan 8, 2026
@heheda12345 heheda12345 enabled auto-merge (squash) January 8, 2026 05:35
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 8, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
auto-merge was automatically disabled January 8, 2026 19:23

Head branch was pushed to by a user without write access

charlotte12l and others added 2 commits January 8, 2026 11:25

charlotte12l commented Jan 8, 2026

After updating draft_model_config.hf_text_config accordingly in the current PR, it triggers another piece of buggy code:

  • Previously (before this PR), the code below would never be called, as hf_text_config would never be eagle; for eagle618/eagle-deepseek-v3-random, is_deepseek_mla is True.

        elif self.hf_text_config.model_type == "eagle":
            # if the model is an EAGLE module, check for the
            # underlying architecture
            return (
                self.hf_text_config.model.model_type
                in ("deepseek_v2", "deepseek_v3", "deepseek_v32")
                and self.hf_text_config.kv_lora_rank is not None
            )
        return False
  • After this PR, for eagle618/eagle-deepseek-v3-random, is_deepseek_mla will be False as self.hf_text_config.model.model_type is deepseek_mtp.
  • My solution is updating ("deepseek_v2", "deepseek_v3", "deepseek_v32") to ("deepseek_v2", "deepseek_v3", "deepseek_v32", "deepseek_mtp"); let me know if this makes sense to you @luccafong @heheda12345

Another weird PretrainedConfig issue I encountered while debugging the above:

  • In the code above, I added print(f"{self.hf_text_config.model=}") and print(f"{self.hf_text_config.model.model_type=}").
  • The model_type shown by print(f"{self.hf_text_config.model=}") is deepseek_v3, while print(f"{self.hf_text_config.model.model_type=}") gives deepseek_mtp.
  • I suspect __repr__ is not picking up model_type correctly; is this expected? @hmellor

@heheda12345 heheda12345 merged commit 80221e1 into vllm-project:main Jan 13, 2026
48 of 49 checks passed
@heheda12345 (Collaborator):

@hmellor

I'm suspecting maybe __repr__ is not updating model_type correctly, is this expected?

Can you help to take a look? I merged this PR and hope @charlotte12l can help to fix this in a follow-up PR.

TomerBN-Nvidia pushed a commit to TomerBN-Nvidia/vllm that referenced this pull request Jan 13, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Signed-off-by: Tomer Natan <tbarnatan@computelab-frontend-8.nvidia.com>

charlotte12l commented Jan 13, 2026

@heheda12345 @hmellor I might find the root cause but not sure if it's intended by huggingface:

This line explicitly uses self.__class__.model_type (the class attribute) when serializing, so it won't use the instance attribute model_type, which speculative-decoding code often overrides:

if hf_config.model_type in ("deepseek_v3", "deepseek_v32"):
    hf_config.model_type = "deepseek_mtp"
if hf_config.model_type == "deepseek_mtp":
    n_predict = getattr(hf_config, "num_nextn_predict_layers", None)
    hf_config.update(
        {"n_predict": n_predict, "architectures": ["DeepSeekMTPModel"]}
    )
if hf_config.model_type in ("pangu_ultra_moe"):
    hf_config.model_type = "pangu_ultra_moe_mtp"

https://github.com/huggingface/transformers/blob/v5.0.0rc2/src/transformers/configuration_utils.py#L972
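The class-attribute vs. instance-attribute mismatch can be reproduced without transformers. The Config class below is a hypothetical stand-in for PretrainedConfig whose serialization mirrors the linked line by reading self.__class__.model_type, while the MTP code path overrides the instance attribute:

```python
class Config:
    """Hypothetical stand-in for PretrainedConfig."""
    model_type = "deepseek_v3"  # class attribute, set per config class

    def to_dict(self):
        # Mirrors the linked transformers line, which explicitly reads
        # the class attribute rather than the instance attribute.
        return {"model_type": self.__class__.model_type}

    def __repr__(self):
        return f"Config {self.to_dict()}"

cfg = Config()
cfg.model_type = "deepseek_mtp"  # instance-level override, as the MTP path does

print(cfg.model_type)  # deepseek_mtp -- attribute lookup finds the instance value
print(cfg)             # Config {'model_type': 'deepseek_v3'} -- repr uses the class value
```

This would explain why the printed repr shows deepseek_v3 while direct attribute access shows deepseek_mtp.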

sammysun0711 pushed a commit to sammysun0711/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>