MLA Based Eagle3 #30574
Conversation
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Code Review
This pull request adds support for MLA-based Eagle3 speculative decoding for DeepseekV2/V3 models. The changes include adding a new deepseek_eagle3.py model implementation, updating configurations, and modifying the DeepseekV2 model to support auxiliary hidden state extraction for Eagle3.
I've found a critical issue in vllm/model_executor/models/deepseek_v2.py: a syntax error in an enumerate call, and a likely logic error where an argument to a layer call was dropped. I've provided a code suggestion to fix both. The rest of the changes look good and correctly integrate the new model.
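For context, here is a minimal sketch of how a drafter like this would be launched through vLLM's `speculative_config`; the model paths and token count below are placeholders for illustration, not values from this PR:

```python
# Hedged sketch: model names/paths are hypothetical placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3",            # target model (example)
    speculative_config={
        "method": "eagle3",                      # Eagle3 speculative decoding
        "model": "path/to/mla-eagle3-drafter",   # hypothetical drafter checkpoint
        "num_speculative_tokens": 3,
    },
)
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
```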
```python
for idx, layer in enumerate(
    islice(self.layers, self.start_layer, self.end_layer)
    start=self.start_layer,
):
    if idx in self.aux_hidden_state_layers:
        aux_hidden_states.append(hidden_states + residual)
    hidden_states, residual = layer(positions, hidden_states, residual)
```
There are a couple of issues in this loop:

- There's a syntax error in the `enumerate` call: the `start` argument is misplaced and a comma is missing. It should be `enumerate(islice(...), start=self.start_layer)`.
- The `llama_4_scaling` argument is no longer passed to the `layer` call, but the logic to compute it is still present. This seems like an accidental omission and could lead to incorrect behavior.

I've provided a suggestion to fix both issues.
Suggested change:

```diff
 for idx, layer in enumerate(
-    islice(self.layers, self.start_layer, self.end_layer)
+    islice(self.layers, self.start_layer, self.end_layer),
     start=self.start_layer,
 ):
     if idx in self.aux_hidden_state_layers:
         aux_hidden_states.append(hidden_states + residual)
-    hidden_states, residual = layer(positions, hidden_states, residual)
+    hidden_states, residual = layer(positions, hidden_states, residual,
+                                    llama_4_scaling)
```
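As a standalone illustration of the corrected iteration pattern, here is a short runnable sketch (toy layer names, not vLLM code) showing why `start=` must be a keyword argument to `enumerate`, not to `islice`, when you need absolute layer indices:

```python
# Toy demo of enumerate-over-islice with absolute indexing.
from itertools import islice

layers = ["layer0", "layer1", "layer2", "layer3", "layer4"]
start_layer, end_layer = 1, 4

for idx, layer in enumerate(
    islice(layers, start_layer, end_layer),  # yields layer1..layer3
    start=start_layer,                       # idx matches absolute position
):
    print(idx, layer)  # -> 1 layer1, 2 layer2, 3 layer3
```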
Hi @IzzyPutterman, the pre-commit checks have failed. Please run:

```
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
```

Then, commit the changes and push to your branch.
Force-pushed from 1c61f04 to f39721e (Compare)
Hi @IzzyPutterman, do you have a model or use case where this is useful? It would be helpful for review if you filled out the description and the testing you did.
This allows for Eagle drafters that share MLA instead of GQA for attention, so one can train Eagle3s for Kimi and Deepseek and use them across TRTLLM, SGL, and vLLM. We have a Kimi K2 Thinking Eagle3 internally, which might get released (not sure here).
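For intuition, here is a toy sketch of the Eagle3-style auxiliary-hidden-state fusion this enables: hidden states captured from a few target-model layers are concatenated and projected before the drafter consumes them. The shapes, layer count, and projection below are illustrative assumptions, not this PR's implementation:

```python
# Toy illustration only; dimensions and the fc projection are assumptions.
import torch
import torch.nn as nn

hidden_size, num_aux_layers = 1024, 3
# One hidden-state tensor per captured target layer: (batch, seq, hidden)
aux_hidden_states = [torch.randn(2, 8, hidden_size) for _ in range(num_aux_layers)]

# Project the concatenated aux states back down to the drafter's hidden size.
fc = nn.Linear(num_aux_layers * hidden_size, hidden_size)
fused = fc(torch.cat(aux_hidden_states, dim=-1))  # (2, 8, hidden_size)
```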
Force-pushed from f39721e to b704ffb (Compare)
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
Force-pushed from b704ffb to 0c838b5 (Compare)
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.