
feat: enable EPLB for NVFP4 compressed-tensors ML3 checkpoint#35187

Closed
hypdeb wants to merge 0 commits into vllm-project:main from hypdeb:eplb_support_nvfp4_moe

Conversation

@hypdeb
Contributor

@hypdeb hypdeb commented Feb 24, 2026

Purpose

Allow enabling eplb for the NVFP4 MoE compressed-tensors path.

Also, fix Mistral Large 3 not being recognized as an MoE model.

In the process, the following additional changes were made:

  • Use types to make the three possible EPLB states of the router explicit: no state (EPLB is disabled), uninitialized state (EPLB is enabled but its state has not yet been initialized), and initialized state (EPLB is enabled and its state is initialized).
  • Make the router the owner of the EPLB state.
  • Add missing pytest markers (benchmark, fork).

Test Plan

Testing end-to-end with EPLB enabled.

Test Result

TODO

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enables EPLB for NVFP4 compressed tensors and fixes an issue where Mistral Large 3 was not recognized as a Mixture-of-Experts model.

The changes in vllm/model_executor/layers/fused_moe/router/base_router.py to relax EPLB state validation during initialization and in vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe.py to enable EPLB support for NVFP4 are correct and well-implemented.

In vllm/model_executor/models/pixtral.py, the changes correctly proxy the MoE interface from the underlying language model. However, the current implementation of copying attributes is fragile and could lead to stale state. I've suggested a more robust approach using @property decorators to ensure the wrapper always reflects the true state of the language model.
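The @property approach suggested for the wrapper might look like the minimal sketch below. The class name and attribute names are illustrative assumptions, not the actual vLLM pixtral interface:

```python
class MoEModelProxy:
    """Wraps a language model and proxies its MoE interface.

    Each property forwards the access to the underlying model at read
    time, so the wrapper always reflects the model's current state
    instead of a stale copy taken at construction time.
    """

    def __init__(self, language_model):
        self.language_model = language_model

    @property
    def num_moe_layers(self):
        # Hypothetical attribute name, for illustration only.
        return self.language_model.num_moe_layers

    @property
    def expert_weights(self):
        # Hypothetical attribute name, for illustration only.
        return self.language_model.expert_weights
```

If the language model mutates these attributes after the wrapper is built, the properties still return the up-to-date values, which is the robustness the review comment is after.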

@hypdeb hypdeb force-pushed the eplb_support_nvfp4_moe branch from 098d2e7 to ecca0b6 Compare February 25, 2026 10:27
def _apply_eplb_mapping(self, topk_ids: torch.Tensor) -> torch.Tensor:
    """Apply EPLB mapping to convert logical expert IDs to physical expert IDs."""
-   if self.enable_eplb:
+   if self.enable_eplb and self._is_eplb_state_ready():
Contributor


Don't we check the same thing in _validate_eplb_state? Do the asserts make sense if we do the same readiness check? Maybe we should return a readiness flag from validate and actually skip EPLB mapping until the EPLB state is ready.

Contributor Author


I've adjusted the modeling of EPLB state in the router to be more explicit, which removes the need for these asserts. There are now three possible states: None, uninitialized, and initialized.

Contributor Author


The scope of the changes has grown. I will test and mark the PR as ready when it's done.

@hypdeb hypdeb force-pushed the eplb_support_nvfp4_moe branch from 979778a to 33a4089 Compare March 1, 2026 13:06
@hypdeb
Contributor Author

hypdeb commented Mar 2, 2026

While testing, discovered that the PR is effectively blocked by: #32564. Without it, I would have to add some hacks.

@mergify

mergify bot commented Mar 4, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @hypdeb.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 4, 2026
@hypdeb hypdeb force-pushed the eplb_support_nvfp4_moe branch from 58c9344 to 0f64eea Compare March 4, 2026 20:12
@mergify mergify bot removed the needs-rebase label Mar 4, 2026
@hypdeb hypdeb force-pushed the eplb_support_nvfp4_moe branch 2 times, most recently from 7150b15 to cc303ce Compare March 5, 2026 14:29
@hypdeb hypdeb closed this Mar 6, 2026
@hypdeb hypdeb force-pushed the eplb_support_nvfp4_moe branch from 0ed8338 to 48e376a Compare March 6, 2026 14:02