feat: enable EPLB for NVFP4 compressed-tensors ML3 checkpoint #35187
hypdeb wants to merge into vllm-project:main
Conversation
Code Review
This pull request enables EPLB for NVFP4 compressed tensors and fixes an issue where Mistral Large 3 was not recognized as a Mixture-of-Experts model.
The changes in vllm/model_executor/layers/fused_moe/router/base_router.py to relax EPLB state validation during initialization and in vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe.py to enable EPLB support for NVFP4 are correct and well-implemented.
In vllm/model_executor/models/pixtral.py, the changes correctly proxy the MoE interface from the underlying language model. However, the current implementation of copying attributes is fragile and could lead to stale state. I've suggested a more robust approach using @property decorators to ensure the wrapper always reflects the true state of the language model.
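The suggested `@property` approach could look like the sketch below. The class and attribute names (`LanguageModel`, `MultimodalWrapper`, `num_logical_experts`) are illustrative stand-ins, not the actual vLLM/pixtral identifiers:

```python
class LanguageModel:
    """Stand-in for the wrapped MoE language model (illustrative)."""
    def __init__(self) -> None:
        self.num_logical_experts = 8


class MultimodalWrapper:
    """Proxies the MoE interface via properties instead of copying
    attributes once at construction, so values never go stale."""
    def __init__(self, language_model: LanguageModel) -> None:
        self.language_model = language_model

    @property
    def num_logical_experts(self) -> int:
        # Read through to the underlying model on every access.
        return self.language_model.num_logical_experts
```

With this pattern, any later change to the language model's MoE state is immediately visible through the wrapper, which avoids the stale-state hazard of attribute copying.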
Force-pushed 098d2e7 to ecca0b6
```diff
 def _apply_eplb_mapping(self, topk_ids: torch.Tensor) -> torch.Tensor:
     """Apply EPLB mapping to convert logical expert IDs to physical expert IDs."""
-    if self.enable_eplb:
+    if self.enable_eplb and self._is_eplb_state_ready():
```
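A minimal sketch of the readiness gate in the diff above; `_is_eplb_state_ready` and the `expert_map` buffer name are assumptions for illustration, not the actual vLLM internals:

```python
class BaseRouter:
    """Stand-in router; uses a plain list in place of a tensor buffer."""
    def __init__(self, enable_eplb: bool) -> None:
        self.enable_eplb = enable_eplb
        self.expert_map = None  # populated later by EPLB initialization

    def _is_eplb_state_ready(self) -> bool:
        # The mapping can only be applied once the buffer has been set.
        return self.expert_map is not None

    def _apply_eplb_mapping(self, topk_ids):
        """Remap logical expert IDs to physical ones once EPLB is ready."""
        if self.enable_eplb and self._is_eplb_state_ready():
            return [self.expert_map[i] for i in topk_ids]
        return topk_ids
```

Until initialization completes, the router simply passes the logical IDs through unchanged rather than asserting.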
Don't we check the same thing in _validate_eplb_state? Do the asserts still make sense if we do the same readiness check? Maybe we should return a readiness flag from validate and actually skip the EPLB mapping until the EPLB state is ready.
I've adjusted the modelling of the EPLB state in the router to be more explicit, which removes the need for these asserts. There are now three possible states: None, uninitialized, and initialized.
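The explicit three-state model could be sketched as below; the enum and method names are hypothetical and only illustrate the idea of replacing asserts with an explicit state machine:

```python
from enum import Enum, auto


class EplbState(Enum):
    DISABLED = auto()       # "None": EPLB not enabled at all
    UNINITIALIZED = auto()  # enabled, mapping buffers not yet set
    INITIALIZED = auto()    # mapping is ready to apply


class Router:
    def __init__(self, enable_eplb: bool) -> None:
        self.state = (
            EplbState.UNINITIALIZED if enable_eplb else EplbState.DISABLED
        )
        self.logical_to_physical = None

    def initialize_eplb(self, mapping) -> None:
        self.logical_to_physical = mapping
        self.state = EplbState.INITIALIZED

    def apply_mapping(self, expert_id: int) -> int:
        # No asserts: pass the logical ID through until INITIALIZED.
        if self.state is not EplbState.INITIALIZED:
            return expert_id
        return self.logical_to_physical[expert_id]
```

Making the states explicit lets callers distinguish "EPLB is off" from "EPLB is on but not yet set up" without relying on runtime assertions.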
The scope of changes is larger now. I will test and mark the PR as ready when it's done.
Force-pushed 979778a to 33a4089
While testing, I discovered that the PR is effectively blocked by #32564. Without it, I would have to add some hacks.

This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed 58c9344 to 0f64eea
Force-pushed 7150b15 to cc303ce
Force-pushed 0ed8338 to 48e376a
Purpose
Allow enabling `eplb` for the NVFP4 MoE compressed-tensors path. Also, fix Mistral Large 3 not being recognized as an MoE model.
In the process, the following additional changes were made:
Test Plan
Testing end-to-end with EPLB enabled.
Test Result
TODO