add gpu correctness tests for scattermoe-lora#3474
Conversation
📝 WalkthroughWalkthroughIntroduces two comprehensive integration test suites for ScatterMoE LoRA fused kernels: one validating forward and backward paths across dimensions, data types, and configurations; another testing ScatterMoE integration with OLMoE models and peft LoRA adapters, including layout conversions and kernelized execution paths. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
tests/e2e/integrations/test_scattermoe_lora_kernels.py (4)
835-836: Minor: Unusedexpert_offsetsin edge case tests.The variable is returned but not used. Consider using underscore prefix.
♻️ Suggested fix
- sorted_expert_idxs, sorted_scattered_idxs, expert_offsets = ( + sorted_expert_idxs, sorted_scattered_idxs, _expert_offsets = ( flatten_sort_count_ref(selected_experts, E) )Apply same change to line 879.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` around lines 835 - 836, The tuple returned by flatten_sort_count_ref assigns expert_offsets which is never used; rename that element to have an underscore prefix (e.g., _expert_offsets) wherever the tuple is destructured (the occurrence assigning sorted_expert_idxs, sorted_scattered_idxs, expert_offsets and the similar destructuring at the later occurrence around line 879) so the unused variable is clearly marked as intentional; update both destructuring sites referencing flatten_sort_count_ref to use _expert_offsets instead of expert_offsets.
74-74: Minor: Unused variableKin unpacking.The variable
Kis unpacked but never used in the function. Consider using underscore prefix.♻️ Suggested fix
- E, K, N = W.shape + E, _K, N = W.shape🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` at line 74, The unpacking currently binds an unused variable `K` from `W.shape`; change the unpacking in the test (where `E, K, N = W.shape` appears) to use an underscore-prefixed name like `E, _K, N = W.shape` (or `E, _, N = W.shape`) so linters and readers know the middle dimension is intentionally unused; update any references to `K` if they exist (there should be none).
537-537: Minor: Unused unpacked variablesref_dAandref_dB.These are returned by
reference_lora_backwardbut not used since this test only verifies input gradients.♻️ Suggested fix
- ref_dX, ref_dA, ref_dB = reference_lora_backward( + ref_dX, _ref_dA, _ref_dB = reference_lora_backward(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` at line 537, The test unpacks three values from reference_lora_backward into ref_dX, ref_dA, ref_dB but only uses ref_dX; update the unpacking to avoid unused variables by either assigning only the first return (e.g., ref_dX = reference_lora_backward(...)[0]) or using Python throwaway names (e.g., ref_dX, _, _ = reference_lora_backward(...)) so ref_dA and ref_dB are not left unused; ensure the call site remains reference_lora_backward(...) and that ref_dX is used unchanged.
1281-1289: Minor: Unused unpacked variables in token rounding tests.
padded_eiandpadded_siare returned but onlypadded_offsetsandreal_offsetsare verified in this test.♻️ Suggested fix
- padded_ei, padded_si, padded_offsets, real_offsets = ( + _padded_ei, _padded_si, padded_offsets, real_offsets = ( lora_ops.round_expert_counts(Apply same pattern to lines 1344 and 1386.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py` around lines 1281 - 1289, The test unpacks padded_ei and padded_si from lora_ops.round_expert_counts but never uses them; update the unpacking in the test to ignore those values (e.g., use _ or _padded_ei/_padded_si) when calling lora_ops.round_expert_counts so only padded_offsets and real_offsets are kept, and apply the same change to the other similar unpack sites referenced (the calls around the locations noted, e.g., the other calls to lora_ops.round_expert_counts at the later test blocks).tests/e2e/integrations/test_scattermoe_lora_olmoe.py (1)
494-505: Consider using underscore prefix for intentionally unused unpacked variables.The
gup_sanddown_sscaling values are unpacked but not used in these shape verification tests.♻️ Suggested fix
E, r = config.num_experts, 4 - gup_A, gup_B, gup_s = gup_lora + gup_A, gup_B, _gup_s = gup_lora assert gup_A.shape == (E * r, config.hidden_size), ( ... # down_proj W = param.T = [E, inter, hidden], K=inter, N=hidden - down_A, down_B, down_s = down_lora + down_A, down_B, _down_s = down_lora🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py` around lines 494 - 505, The test unpacks scaling variables gup_s and down_s but never uses them; change those unpacked names to use an underscore prefix (e.g., _gup_s and _down_s) to indicate intentional unused variables in the assertions around gup_A, gup_B, down_A, down_B; update the unpacking lines where gup_lora and down_lora are destructured so the unused values are named _gup_s and _down_s respectively (leaving the rest of the assertions unchanged).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py`:
- Around line 1221-1229: The repo_path calculation in
test_scattermoe_lora_olmoe.py duplicates the incorrect relative navigation used
by _get_repo_path() (it builds tests/src/axolotl/... instead of
<repo_root>/src/axolotl/...), so replace the inline Path(...) construction with
a call to the shared helper (e.g. _get_repo_path or a newly exported
get_repo_root helper) and update tests to use that helper; ensure the helper
returns the repository root (not relative to the tests folder) and then append
/"src"/"axolotl"/"integrations"/"kernels"/"libs"/"scattermoe_lora" to that root
to compute repo_path.
- Around line 932-943: The _get_repo_path() helper builds an incorrect path by
stopping at parent.parent.parent (which lands in tests/) then appending "src",
causing tests/src/... to be sought; update the path traversal in
_get_repo_path() to ascend one more level (use parent.parent.parent.parent) so
the constructed path points to the repository root and resolves to
src/axolotl/integrations/kernels/libs/scattermoe_lora accordingly.
---
Nitpick comments:
In `@tests/e2e/integrations/test_scattermoe_lora_kernels.py`:
- Around line 835-836: The tuple returned by flatten_sort_count_ref assigns
expert_offsets which is never used; rename that element to have an underscore
prefix (e.g., _expert_offsets) wherever the tuple is destructured (the
occurrence assigning sorted_expert_idxs, sorted_scattered_idxs, expert_offsets
and the similar destructuring at the later occurrence around line 879) so the
unused variable is clearly marked as intentional; update both destructuring
sites referencing flatten_sort_count_ref to use _expert_offsets instead of
expert_offsets.
- Line 74: The unpacking currently binds an unused variable `K` from `W.shape`;
change the unpacking in the test (where `E, K, N = W.shape` appears) to use an
underscore-prefixed name like `E, _K, N = W.shape` (or `E, _, N = W.shape`) so
linters and readers know the middle dimension is intentionally unused; update
any references to `K` if they exist (there should be none).
- Line 537: The test unpacks three values from reference_lora_backward into
ref_dX, ref_dA, ref_dB but only uses ref_dX; update the unpacking to avoid
unused variables by either assigning only the first return (e.g., ref_dX =
reference_lora_backward(...)[0]) or using Python throwaway names (e.g., ref_dX,
_, _ = reference_lora_backward(...)) so ref_dA and ref_dB are not left unused;
ensure the call site remains reference_lora_backward(...) and that ref_dX is
used unchanged.
- Around line 1281-1289: The test unpacks padded_ei and padded_si from
lora_ops.round_expert_counts but never uses them; update the unpacking in the
test to ignore those values (e.g., use _ or _padded_ei/_padded_si) when calling
lora_ops.round_expert_counts so only padded_offsets and real_offsets are kept,
and apply the same change to the other similar unpack sites referenced (the
calls around the locations noted, e.g., the other calls to
lora_ops.round_expert_counts at the later test blocks).
In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py`:
- Around line 494-505: The test unpacks scaling variables gup_s and down_s but
never uses them; change those unpacked names to use an underscore prefix (e.g.,
_gup_s and _down_s) to indicate intentional unused variables in the assertions
around gup_A, gup_B, down_A, down_B; update the unpacking lines where gup_lora
and down_lora are destructured so the unused values are named _gup_s and _down_s
respectively (leaving the rest of the assertions unchanged).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e8b17951-1cf1-4187-9264-86aa82667a00
📒 Files selected for processing (2)
tests/e2e/integrations/test_scattermoe_lora_kernels.pytests/e2e/integrations/test_scattermoe_lora_olmoe.py
| @staticmethod | ||
| def _get_repo_path(): | ||
| """Get the path to scattermoe_lora within axolotl's plugin.""" | ||
| return ( | ||
| Path(__file__).parent.parent.parent | ||
| / "src" | ||
| / "axolotl" | ||
| / "integrations" | ||
| / "kernels" | ||
| / "libs" | ||
| / "scattermoe_lora" | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Verify the path structure and check if the calculated path is correct
# Find the test file location
fd -t f "test_scattermoe_lora_olmoe.py" --exec dirname {}
# Check what exists at the calculated path (3 levels up)
fd -t f "test_scattermoe_lora_olmoe.py" --exec sh -c 'dirname "$1" | xargs -I{} sh -c "cd {} && cd ../../.. && pwd && ls -la | head -5"' _ {}
# Find the actual scattermoe_lora directory
fd -t d "scattermoe_lora" | head -5Repository: axolotl-ai-cloud/axolotl
Length of output: 416
Path calculation error: _get_repo_path() navigates to non-existent tests/src/ directory.
From tests/e2e/integrations/test_scattermoe_lora_olmoe.py, parent.parent.parent reaches tests/, then appending "src" creates tests/src/axolotl/... which does not exist. The actual scattermoe_lora directory is at the repository root: src/axolotl/integrations/kernels/libs/scattermoe_lora/.
To reach the repository root, use parent.parent.parent.parent:
Fix
`@staticmethod`
def _get_repo_path():
"""Get the path to scattermoe_lora within axolotl's plugin."""
return (
- Path(__file__).parent.parent.parent
+ Path(__file__).parent.parent.parent.parent
/ "src"
/ "axolotl"
/ "integrations"
/ "kernels"
/ "libs"
/ "scattermoe_lora"
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py` around lines 932 - 943,
The _get_repo_path() helper builds an incorrect path by stopping at
parent.parent.parent (which lands in tests/) then appending "src", causing
tests/src/... to be sought; update the path traversal in _get_repo_path() to
ascend one more level (use parent.parent.parent.parent) so the constructed path
points to the repository root and resolves to
src/axolotl/integrations/kernels/libs/scattermoe_lora accordingly.
| repo_path = ( | ||
| Path(__file__).parent.parent.parent | ||
| / "src" | ||
| / "axolotl" | ||
| / "integrations" | ||
| / "kernels" | ||
| / "libs" | ||
| / "scattermoe_lora" | ||
| ) |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Duplicated potentially incorrect path calculation.
This path calculation has the same issue as _get_repo_path() - it navigates to tests/src/axolotl/... instead of <repo_root>/src/axolotl/.... Consider extracting this to a shared helper to avoid duplication and ensure consistency.
♻️ Suggested refactor to use the shared helper
# Kernelize
- repo_path = (
- Path(__file__).parent.parent.parent
- / "src"
- / "axolotl"
- / "integrations"
- / "kernels"
- / "libs"
- / "scattermoe_lora"
- )
+ repo_path = TestKernelizeIntegration._get_repo_path()
local_repo = LocalLayerRepository(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/e2e/integrations/test_scattermoe_lora_olmoe.py` around lines 1221 -
1229, The repo_path calculation in test_scattermoe_lora_olmoe.py duplicates the
incorrect relative navigation used by _get_repo_path() (it builds
tests/src/axolotl/... instead of <repo_root>/src/axolotl/...), so replace the
inline Path(...) construction with a call to the shared helper (e.g.
_get_repo_path or a newly exported get_repo_root helper) and update tests to use
that helper; ensure the helper returns the repository root (not relative to the
tests folder) and then append
/"src"/"axolotl"/"integrations"/"kernels"/"libs"/"scattermoe_lora" to that root
to compute repo_path.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Description
Motivation and Context
How has this been tested?
AI Usage Disclaimer
Screenshots (if appropriate)
Types of changes
Social Handles (Optional)
Summary by CodeRabbit