Skip to content

MLA Based Eagle3#30574

Closed
IzzyPutterman wants to merge 2 commits intovllm-project:mainfrom
IzzyPutterman:iputterman/ds-eagle3-main
Closed

MLA Based Eagle3#30574
IzzyPutterman wants to merge 2 commits intovllm-project:mainfrom
IzzyPutterman:iputterman/ds-eagle3-main

Conversation

@IzzyPutterman
Copy link
Copy Markdown
Contributor

@IzzyPutterman IzzyPutterman commented Dec 12, 2025

Purpose

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@mergify mergify Bot added deepseek Related to DeepSeek models new-model Requests to new models speculative-decoding v1 labels Dec 12, 2025
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for MLA-based Eagle3 speculative decoding for DeepseekV2/V3 models. The changes include adding a new deepseek_eagle3.py model implementation, updating configurations, and modifying the DeepseekV2 model to support auxiliary hidden state extraction for Eagle3.

I've found a critical issue in vllm/model_executor/models/deepseek_v2.py which includes a syntax error in an enumerate call and a likely logic error where an argument to a layer call was omitted. I've provided a code suggestion to fix this. The rest of the changes look good and correctly integrate the new model.

Comment on lines +1321 to +1327
for idx, layer in enumerate(
islice(self.layers, self.start_layer, self.end_layer)
start=self.start_layer,
):
if idx in self.aux_hidden_state_layers:
aux_hidden_states.append(hidden_states + residual)
hidden_states, residual = layer(positions, hidden_states, residual)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There are a couple of issues in this loop:

  1. There's a syntax error in the enumerate call. The start argument is misplaced and there's a missing comma. It should be enumerate(islice(...), start=self.start_layer).
  2. The llama_4_scaling argument is no longer passed to the layer call, but the logic to compute it is still present. This seems like an accidental omission and could lead to incorrect behavior.

I've provided a suggestion to fix both issues.

Suggested change
for idx, layer in enumerate(
islice(self.layers, self.start_layer, self.end_layer)
start=self.start_layer,
):
if idx in self.aux_hidden_state_layers:
aux_hidden_states.append(hidden_states + residual)
hidden_states, residual = layer(positions, hidden_states, residual)
for idx, layer in enumerate(
islice(self.layers, self.start_layer, self.end_layer),
start=self.start_layer,
):
if idx in self.aux_hidden_state_layers:
aux_hidden_states.append(hidden_states + residual)
hidden_states, residual = layer(positions, hidden_states, residual,
llama_4_scaling)

@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Dec 12, 2025

Hi @IzzyPutterman, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@IzzyPutterman IzzyPutterman force-pushed the iputterman/ds-eagle3-main branch from 1c61f04 to f39721e Compare December 12, 2025 22:50
@mgoin
Copy link
Copy Markdown
Member

mgoin commented Dec 14, 2025

Hi @IzzyPutterman do you have a model or usecase where this is useful? It would be helpful for review if you filled out the description and testing you did

@IzzyPutterman
Copy link
Copy Markdown
Contributor Author

Hi @IzzyPutterman do you have a model or usecase where this is useful? It would be helpful for review if you filled out the description and testing you did

This allows for Eagles that share MLA instead of GQA for attention, so one can train Eagle3s for Kimi and Deepseek and use them across TRTLLM, SGL, and vLLM. We have a Kimi K2 thinking Eagle3 internally, which might get released (not sure here).

@IzzyPutterman IzzyPutterman force-pushed the iputterman/ds-eagle3-main branch from f39721e to b704ffb Compare December 20, 2025 00:19
@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Dec 26, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @IzzyPutterman.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Dec 26, 2025
Izzy Putterman added 2 commits December 31, 2025 11:39
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
@IzzyPutterman IzzyPutterman force-pushed the iputterman/ds-eagle3-main branch from b704ffb to 0c838b5 Compare December 31, 2025 19:40
@mergify mergify Bot removed the needs-rebase label Dec 31, 2025
@IzzyPutterman IzzyPutterman mentioned this pull request Mar 10, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deepseek Related to DeepSeek models new-model Requests to new models speculative-decoding v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants