Fix: Missing args in allreduce_fusion MOE finalize call#3046

Closed
samuellees wants to merge 2 commits into flashinfer-ai:main from samuellees:fix/moe-finalize-missing-args

Conversation

Collaborator

@samuellees samuellees commented Apr 13, 2026

Problem

On the main branch, allreduce_fusion(pattern=kMoEFinalizeARResidualRMSNorm) crashes with a TypeError for missing positional arguments, and the mypy pre-commit hook also fails.

Root Cause

Two PRs merged to main in sequence, modifying different ends of the same call chain:

| Order | PR | What changed | File |
| --- | --- | --- | --- |
| 1st | #2966 (Fused moe all-reduce routed scaling factor + quant support) | Added quant_out, scale_out, routed_scaling_factor to the trtllm_moe_finalize_allreduce_fusion() signature | flashinfer/comm/trtllm_ar.py |
| 2nd | #2982 (Add MOE patterns to unified allreduce_fusion API, closes #2823) | Added the kMoEFinalizeARResidualRMSNorm pattern that calls trtllm_moe_finalize_allreduce_fusion() | flashinfer/comm/allreduce.py |

PR #2982 was developed before #2966 merged. Git merge produced no conflict since they touched different files, but the call in allreduce_fusion() was left with the old signature — missing the three new positional args added by #2966.
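The failure mode can be reproduced in isolation. The sketch below uses the function and parameter names from the description, but the bodies are simplified stand-ins, not the real flashinfer implementations:

```python
# Simplified stand-in for the low-level API after PR #2966: three new
# required positional parameters were added to the signature.
def trtllm_moe_finalize_allreduce_fusion(
    allreduce_out, residual, norm_weight, quant_out, scale_out, routed_scaling_factor
):
    scale = 1.0 if routed_scaling_factor is None else routed_scaling_factor
    return [x * scale + r for x, r in zip(allreduce_out, residual)]

# The call site from PR #2982, still written against the pre-#2966 signature:
def allreduce_fusion(allreduce_out, residual, norm_weight):
    return trtllm_moe_finalize_allreduce_fusion(allreduce_out, residual, norm_weight)

try:
    allreduce_fusion([1.0], [2.0], [1.0])
except TypeError as exc:
    print(exc)  # missing 3 required positional arguments
```

Because the two PRs touched different files, the textual merge succeeded while the semantic contract between caller and callee silently broke.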

Impact

| Scope | Status |
| --- | --- |
| allreduce_fusion(pattern=kMoEFinalizeARResidualRMSNorm) (pattern 7) | Broken: TypeError at runtime |
| allreduce_fusion(pattern=kMoEReductionARResidualRMSNorm) (pattern 6) | Not affected (calls a different function) |
| allreduce_fusion(pattern=0-5) (standard allreduce) | Not affected |
| trtllm_moe_finalize_allreduce_fusion() direct callers | Not affected (low-level API is correct) |
| mypy pre-commit | Fails |
| test_allreduce_fusion_moe_unified_api.py finalize tests | Would fail if run on multi-GPU CI |

Practical impact is limited — pattern 7 was just added in #2982 and has no downstream consumers yet.

Fix

flashinfer/comm/allreduce.py:

  • Pass quant_out, scale_out, routed_scaling_factor to the trtllm_moe_finalize_allreduce_fusion() call
  • Add routed_scaling_factor: Optional[float] = None to allreduce_fusion() signature
  • Update docstring

No test changes needed — existing tests pass None for the new args (default values), which is the correct behavior for non-quantized finalize.
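The shape of the fix can be sketched as follows. Names match the PR description; the function bodies are simplified stand-ins for illustration only:

```python
from typing import Optional

# Stand-in for the low-level API (the real one lives in
# flashinfer/comm/trtllm_ar.py); body simplified to show scaling only.
def trtllm_moe_finalize_allreduce_fusion(
    allreduce_out, residual, norm_weight, quant_out, scale_out, routed_scaling_factor
):
    scale = 1.0 if routed_scaling_factor is None else routed_scaling_factor
    return [x * scale + r for x, r in zip(allreduce_out, residual)]

# Fixed dispatch: the three new args are forwarded, with None defaults so
# existing (non-quantized) callers keep their old behavior.
def allreduce_fusion(
    allreduce_out,
    residual,
    norm_weight,
    quant_out=None,
    scale_out=None,
    routed_scaling_factor: Optional[float] = None,
):
    return trtllm_moe_finalize_allreduce_fusion(
        allreduce_out, residual, norm_weight,
        quant_out, scale_out, routed_scaling_factor,
    )

print(allreduce_fusion([2.0], [1.0], [1.0]))                             # [3.0]
print(allreduce_fusion([2.0], [1.0], [1.0], routed_scaling_factor=0.5))  # [2.0]
```

Defaulting the new parameters to None is what lets the existing tests pass unchanged: None is the correct value for the non-quantized finalize path.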

PR

  • Branch: fix/moe-finalize-missing-args
  • Changes: 1 file, +5 lines
  • Title: fix: add missing args to moe_finalize call in unified allreduce_fusion API

Summary by CodeRabbit

  • New Features
    • Enhanced Mixture of Experts finalization for TRTLLM backend by introducing a configurable global scaling parameter. This enables fine-grained control over routed expert output scaling during finalization operations, improving flexibility for advanced inference configurations.

…n API

PR flashinfer-ai#2966 added quant_out, scale_out, and routed_scaling_factor params
to trtllm_moe_finalize_allreduce_fusion(). PR flashinfer-ai#2982 (unified API) was
developed before flashinfer-ai#2966 merged, and git merge produced no conflict since
they touched different files (trtllm_ar.py vs allreduce.py). However
the call in allreduce_fusion() was missing the three new positional
args, causing TypeError at runtime for kMoEFinalizeARResidualRMSNorm
pattern and mypy failure in pre-commit.

Fix:
- Add quant_out, scale_out, routed_scaling_factor to the finalize call
- Add routed_scaling_factor to allreduce_fusion() function signature
- Update docstring

AI-assisted

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

coderabbitai bot commented Apr 13, 2026

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d40714a4-e5d1-4bc3-b333-8555b87f4031

📥 Commits

Reviewing files that changed from the base of the PR and between e64ae8b and 8610120.

📒 Files selected for processing (1)
  • flashinfer/comm/allreduce.py

📝 Walkthrough

Walkthrough

The PR adds an optional routed_scaling_factor parameter to the allreduce_fusion function, enabling configurable global scaling for routed expert outputs in MoE finalize operations. The TRTLLM backend MOE finalize dispatch path now forwards this parameter instead of passing a hardcoded None.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| MoE Finalize AllReduce Scaling (flashinfer/comm/allreduce.py) | Added optional routed_scaling_factor parameter to the allreduce_fusion signature and propagated it to the trtllm_moe_finalize_allreduce_fusion call in the MOE finalize dispatch path, replacing a hardcoded None. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • #3040 — Adds missing routed_scaling_factor alongside quant_out/scale_out arguments to the same TRTLLM MoE finalize callsite in allreduce.py.
  • #2982 — Modifies allreduce_fusion API and TRTLLM MoE finalize dispatch to introduce and wire routed_scaling_factor parameter handling.
  • #2966 — Modifies MoE finalize all-reduce fusion path to add and propagate routed_scaling_factor parameter across function signatures.

Suggested reviewers

  • aleozlx
  • yzh119
  • bkryu
  • jimmyzho
  • nv-yunzheq

Poem

🐰 A scaling factor hops through the code,
Making MoE experts lighter their load,
No more hardcoded None in sight,
Just a parameter, perfectly right! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely identifies the specific problem being fixed: missing arguments in a MOE finalize call within the allreduce_fusion function. |
| Description check | ✅ Passed | The description comprehensively covers the problem statement, root cause analysis, impact assessment, and implemented fix, with clear context about prior PRs. |
| Linked Issues check | ✅ Passed | The changes correctly implement the MOE finalize pattern from #2823 by fixing the call signature of trtllm_moe_finalize_allreduce_fusion() and exposing the routed_scaling_factor parameter. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing the MOE finalize call signature and exposing the routed_scaling_factor parameter, with no unrelated modifications. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |



Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms



…ing-args

# Conflicts:
#	flashinfer/comm/allreduce.py
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the allreduce_fusion function in flashinfer/comm/allreduce.py to include a new routed_scaling_factor parameter and its corresponding documentation. Additionally, it updates the internal operation call to include quant_out, scale_out, and routed_scaling_factor. I have no feedback to provide.

Collaborator Author

Close because of #3040

@samuellees samuellees closed this Apr 13, 2026
@samuellees samuellees deleted the fix/moe-finalize-missing-args branch April 13, 2026 14:48


Successfully merging this pull request may close these issues.

TRTLLM fused MoE Finalize+ResidualAdd + AR+Norm for DSV3.2
