add gpu correctness tests for scattermoe-lora by winglian · Pull Request #3474 · axolotl-ai-cloud/axolotl

winglian · 2026-03-07T04:13:49Z

Description

Motivation and Context

How has this been tested?

AI Usage Disclaimer

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

Summary by CodeRabbit

Tests
- Added comprehensive test suite for ScatterMoE LoRA fused kernel implementations, covering forward/backward passes, gradient calculations, numerical stability across precision variants, and autograd integration with edge cases
- Added integration tests validating LoRA adapter compatibility with OLMoE models, including layout conversions, kernel optimization paths, and gradient correctness validation

coderabbitai · 2026-03-07T04:14:01Z

Walkthrough

Introduces two comprehensive integration test suites for ScatterMoE LoRA fused kernels: one validating forward and backward paths across dimensions, data types, and configurations; another testing ScatterMoE integration with OLMoE models and peft LoRA adapters, including layout conversions and kernelized execution paths.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| ScatterMoE LoRA Kernels Tests `tests/e2e/integrations/test_scattermoe_lora_kernels.py` | Comprehensive test suite with reference implementations for routing, forward, and backward computations. Includes forward pass validation, LoRA gradient calculations, autograd integration, edge cases (single expert, empty experts), fused kernel testing (dX, gather), token rounding utilities, and combined optimization validation across multiple dimensions, data types (fp32, bf16, fp16), and grouped/ungrouped configurations. |
| ScatterMoE LoRA OLMoE Integration Tests `tests/e2e/integrations/test_scattermoe_lora_olmoe.py` | Integration tests for ScatterMoE with OLMoE models and peft LoRA adapters, covering layout conversions (peft rank-major to expert-major), forward equivalence validation, LoRA delta computations, CUDA kernel execution via LocalLayerRepository, kernelized forward comparisons, backward gradient consistency, and shared-expert handling with fallback behavior for CPU-only environments. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

ready to merge

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.61%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'add gpu correctness tests for scattermoe-lora' is clear, specific, and accurately summarizes the main change: introducing comprehensive GPU correctness tests for ScatterMoE with LoRA fused kernels. |

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (5)

tests/e2e/integrations/test_scattermoe_lora_kernels.py (4)

835-836: Minor: Unused expert_offsets in edge case tests.

The variable is returned but not used. Consider using underscore prefix.

♻️ Suggested fix

-        sorted_expert_idxs, sorted_scattered_idxs, expert_offsets = (
+        sorted_expert_idxs, sorted_scattered_idxs, _expert_offsets = (
             flatten_sort_count_ref(selected_experts, E)
         )

Apply same change to line 879.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` around lines 835 -
836, The tuple returned by flatten_sort_count_ref assigns expert_offsets which
is never used; rename that element to have an underscore prefix (e.g.,
_expert_offsets) wherever the tuple is destructured (the occurrence assigning
sorted_expert_idxs, sorted_scattered_idxs, expert_offsets and the similar
destructuring at the later occurrence around line 879) so the unused variable is
clearly marked as intentional; update both destructuring sites referencing
flatten_sort_count_ref to use _expert_offsets instead of expert_offsets.

74-74: Minor: Unused variable K in unpacking.

The variable K is unpacked but never used in the function. Consider using underscore prefix.

♻️ Suggested fix

-    E, K, N = W.shape
+    E, _K, N = W.shape

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` at line 74, The
unpacking currently binds an unused variable `K` from `W.shape`; change the
unpacking in the test (where `E, K, N = W.shape` appears) to use an
underscore-prefixed name like `E, _K, N = W.shape` (or `E, _, N = W.shape`) so
linters and readers know the middle dimension is intentionally unused; update
any references to `K` if they exist (there should be none).

537-537: Minor: Unused unpacked variables ref_dA and ref_dB.

These are returned by reference_lora_backward but not used since this test only verifies input gradients.

♻️ Suggested fix

-        ref_dX, ref_dA, ref_dB = reference_lora_backward(
+        ref_dX, _ref_dA, _ref_dB = reference_lora_backward(

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` at line 537, The test
unpacks three values from reference_lora_backward into ref_dX, ref_dA, ref_dB
but only uses ref_dX; update the unpacking to avoid unused variables by either
assigning only the first return (e.g., ref_dX = reference_lora_backward(...)[0])
or using Python throwaway names (e.g., ref_dX, _, _ =
reference_lora_backward(...)) so ref_dA and ref_dB are not left unused; ensure
the call site remains reference_lora_backward(...) and that ref_dX is used
unchanged.
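
The alternatives named in this prompt can be compared side by side. The sketch below uses a hypothetical stub, `reference_backward_stub`, in place of `reference_lora_backward` (whose real signature is not shown here); only the unpacking patterns are the point.

```python
from typing import List, Tuple


def reference_backward_stub(
    grad_out: List[float],
) -> Tuple[List[float], List[float], List[float]]:
    """Hypothetical stand-in that returns three gradient-like lists."""
    d_x = [2.0 * g for g in grad_out]
    d_a = [0.5 * g for g in grad_out]
    d_b = [-g for g in grad_out]
    return d_x, d_a, d_b


grad = [1.0, 2.0, 3.0]

# Option 1: underscore-prefixed names keep a hint of what was discarded.
ref_dx_named, _ref_da, _ref_db = reference_backward_stub(grad)

# Option 2: bare throwaway names, shortest when the meaning is obvious.
ref_dx_throwaway, _, _ = reference_backward_stub(grad)

# Option 3: index the returned tuple and bind only the first element.
ref_dx_indexed = reference_backward_stub(grad)[0]

print(ref_dx_named == ref_dx_throwaway == ref_dx_indexed)  # True
```

All three bind the same first value. The underscore-prefixed form documents which outputs are being ignored, which can help when the tuple order is easy to confuse.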

1281-1289: Minor: Unused unpacked variables in token rounding tests.

padded_ei and padded_si are returned but only padded_offsets and real_offsets are verified in this test.

♻️ Suggested fix

-        padded_ei, padded_si, padded_offsets, real_offsets = (
+        _padded_ei, _padded_si, padded_offsets, real_offsets = (
             lora_ops.round_expert_counts(

Apply same pattern to lines 1344 and 1386.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` around lines 1281 -
1289, The test unpacks padded_ei and padded_si from lora_ops.round_expert_counts
but never uses them; update the unpacking in the test to ignore those values
(e.g., use _ or _padded_ei/_padded_si) when calling lora_ops.round_expert_counts
so only padded_offsets and real_offsets are kept, and apply the same change to
the other similar unpack sites referenced (the calls around the locations noted,
e.g., the other calls to lora_ops.round_expert_counts at the later test blocks).

tests/e2e/integrations/test_scattermoe_lora_olmoe.py (1)

494-505: Consider using underscore prefix for intentionally unused unpacked variables.

The gup_s and down_s scaling values are unpacked but not used in these shape verification tests.

♻️ Suggested fix

         E, r = config.num_experts, 4
-        gup_A, gup_B, gup_s = gup_lora
+        gup_A, gup_B, _gup_s = gup_lora
         assert gup_A.shape == (E * r, config.hidden_size), (
...
         # down_proj W = param.T = [E, inter, hidden], K=inter, N=hidden
-        down_A, down_B, down_s = down_lora
+        down_A, down_B, _down_s = down_lora

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py` around lines 494 - 505,
The test unpacks scaling variables gup_s and down_s but never uses them; change
those unpacked names to use an underscore prefix (e.g., _gup_s and _down_s) to
indicate intentional unused variables in the assertions around gup_A, gup_B,
down_A, down_B; update the unpacking lines where gup_lora and down_lora are
destructured so the unused values are named _gup_s and _down_s respectively
(leaving the rest of the assertions unchanged).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e8b17951-1cf1-4187-9264-86aa82667a00

📥 Commits

Reviewing files that changed from the base of the PR and between 80f7088 and 0acf21f.

📒 Files selected for processing (2)

tests/e2e/integrations/test_scattermoe_lora_kernels.py
tests/e2e/integrations/test_scattermoe_lora_olmoe.py

coderabbitai · 2026-03-07T04:20:50Z

+    @staticmethod
+    def _get_repo_path():
+        """Get the path to scattermoe_lora within axolotl's plugin."""
+        return (
+            Path(__file__).parent.parent.parent
+            / "src"
+            / "axolotl"
+            / "integrations"
+            / "kernels"
+            / "libs"
+            / "scattermoe_lora"
+        )


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify the path structure and check if the calculated path is correct

# Find the test file location
fd -t f "test_scattermoe_lora_olmoe.py" --exec dirname {}

# Check what exists at the calculated path (3 levels up)
fd -t f "test_scattermoe_lora_olmoe.py" --exec sh -c 'dirname "$1" | xargs -I{} sh -c "cd {} && cd ../../.. && pwd && ls -la | head -5"' _ {}

# Find the actual scattermoe_lora directory
fd -t d "scattermoe_lora" | head -5

Repository: axolotl-ai-cloud/axolotl

Length of output: 416

Path calculation error: _get_repo_path() navigates to non-existent tests/src/ directory.

From tests/e2e/integrations/test_scattermoe_lora_olmoe.py, parent.parent.parent reaches tests/, then appending "src" creates tests/src/axolotl/... which does not exist. The actual scattermoe_lora directory is at the repository root: src/axolotl/integrations/kernels/libs/scattermoe_lora/.

To reach the repository root, use parent.parent.parent.parent:

Fix

     @staticmethod
     def _get_repo_path():
         """Get the path to scattermoe_lora within axolotl's plugin."""
         return (
-            Path(__file__).parent.parent.parent
+            Path(__file__).parent.parent.parent.parent
             / "src"
             / "axolotl"
             / "integrations"
             / "kernels"
             / "libs"
             / "scattermoe_lora"
         )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py` around lines 932 - 943, The _get_repo_path() helper builds an incorrect path by stopping at parent.parent.parent (which lands in tests/) then appending "src", causing tests/src/... to be sought; update the path traversal in _get_repo_path() to ascend one more level (use parent.parent.parent.parent) so the constructed path points to the repository root and resolves to src/axolotl/integrations/kernels/libs/scattermoe_lora accordingly.
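
The parent-counting argument above can be checked without touching the filesystem. This is a small sketch using `PurePosixPath`; the `/repo` prefix is a hypothetical checkout location, not a path taken from the PR.

```python
from pathlib import PurePosixPath

# Hypothetical absolute location of the test file inside a checkout at /repo.
test_file = PurePosixPath(
    "/repo/tests/e2e/integrations/test_scattermoe_lora_olmoe.py"
)

# Each .parent strips one trailing component, starting with the file name.
three_up = test_file.parent.parent.parent          # /repo/tests
four_up = test_file.parent.parent.parent.parent    # /repo

# parents[n] is equivalent to applying .parent n + 1 times.
also_four_up = test_file.parents[3]

kernel_dir = (
    four_up / "src" / "axolotl" / "integrations" / "kernels" / "libs" / "scattermoe_lora"
)

print(three_up)    # /repo/tests
print(four_up)     # /repo
print(kernel_dir)  # /repo/src/axolotl/integrations/kernels/libs/scattermoe_lora
```

Because the first `.parent` only removes the file name, a file three directories deep needs four `.parent` hops (or `parents[3]`) to reach the repository root.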

coderabbitai · 2026-03-07T04:20:50Z

+        repo_path = (
+            Path(__file__).parent.parent.parent
+            / "src"
+            / "axolotl"
+            / "integrations"
+            / "kernels"
+            / "libs"
+            / "scattermoe_lora"
+        )


🛠️ Refactor suggestion | 🟠 Major

Duplicated potentially incorrect path calculation.

This path calculation has the same issue as _get_repo_path() - it navigates to tests/src/axolotl/... instead of <repo_root>/src/axolotl/.... Consider extracting this to a shared helper to avoid duplication and ensure consistency.

♻️ Suggested refactor to use the shared helper

         # Kernelize
-        repo_path = (
-            Path(__file__).parent.parent.parent
-            / "src"
-            / "axolotl"
-            / "integrations"
-            / "kernels"
-            / "libs"
-            / "scattermoe_lora"
-        )
+        repo_path = TestKernelizeIntegration._get_repo_path()
         local_repo = LocalLayerRepository(

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py` around lines 1221 - 1229, The repo_path calculation in test_scattermoe_lora_olmoe.py duplicates the incorrect relative navigation used by _get_repo_path() (it builds tests/src/axolotl/... instead of <repo_root>/src/axolotl/...), so replace the inline Path(...) construction with a call to the shared helper (e.g. _get_repo_path or a newly exported get_repo_root helper) and update tests to use that helper; ensure the helper returns the repository root (not relative to the tests folder) and then append /"src"/"axolotl"/"integrations"/"kernels"/"libs"/"scattermoe_lora" to that root to compute repo_path.
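
One way such a shared helper could avoid hard-coding the number of `.parent` hops is to walk upward until the kernel directory is found. The sketch below is an illustration only: `find_kernel_dir` and `KERNEL_SUBPATH` are hypothetical names, not part of the PR or of axolotl.

```python
from pathlib import Path
from typing import Optional

# Sub-path named in the review comment, relative to the repository root.
KERNEL_SUBPATH = Path(
    "src", "axolotl", "integrations", "kernels", "libs", "scattermoe_lora"
)


def find_kernel_dir(start: Path) -> Optional[Path]:
    """Return the first existing <ancestor>/KERNEL_SUBPATH, searching upward.

    Hypothetical helper: checks `start` and each of its ancestors in turn,
    so it keeps working if the test file moves to a different depth.
    """
    start = start.resolve()
    for ancestor in [start, *start.parents]:
        candidate = ancestor / KERNEL_SUBPATH
        if candidate.is_dir():
            return candidate
    # Nothing found: let the caller decide whether to skip or fail the test.
    return None
```

Both call sites could then use something like `find_kernel_dir(Path(__file__))` and skip the test when it returns `None`, so the two path computations cannot drift apart.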

codecov · 2026-03-07T04:29:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

add gpu tests for scattermoe

0acf21f

coderabbitai Bot reviewed Mar 7, 2026

winglian merged commit a36aaa7 into main Mar 7, 2026
13 of 17 checks passed

winglian deleted the moe-lora-tests branch March 7, 2026 05:00

coderabbitai Bot mentioned this pull request Mar 7, 2026

consolidate behavioud of routing in scattermoe kernels #3475

Merged

coderabbitai Bot mentioned this pull request Mar 30, 2026

feat: add moe kernel support for non-glu #3558

Open

coderabbitai Bot mentioned this pull request May 22, 2026

fix: refactor kernels patch to drop routing and inject into Expert #3651

Merged

coderabbitai Bot mentioned this pull request Jun 6, 2026

perf(scattermoe-lora): grouped-Gram dA/dB + sync-free dX_lora for large-E MoEs #3712

Merged
