tests: follow MoE merge wrapper delegation in drift detector#653
Conversation
After zoo#647 the gate/up MoE merge entry points became thin wrappers
delegating to _merge_moe_gate_or_up_expert. The drift detector test
only inspected the wrapper source and so reported a false-positive
regression (no .to('cuda') call, no _active_merge_device call inside
the wrapper body itself).
Update test_moe_expert_merges_call_active_merge_device to follow one
level of delegation through sibling _merge_moe_* helpers via a small
regex-walked source-union helper, so the union still requires
_active_merge_device() and forbids hardcoded .to('cuda').
Local: pytest -k test_moe_expert_merges_call_active_merge_device
passes (1 passed in 0.26s).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 87138e24fb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
| src = inspect.getsource(fn) | ||
| src = _effective_source(name, set()) | ||
| assert "_active_merge_device(" in src, ( |
There was a problem hiding this comment.
Keep the active-device check on the entry point
Because _effective_source() concatenates every sibling helper before this assertion, any target can now satisfy the check via a callee even if the entry point itself starts doing its own device move with DEVICE_TYPE_TORCH or another non-helper device before delegating. That reopens the exact drift the test documents for non-thin helpers or wrappers that grow logic: the union still contains _active_merge_device( from _merge_moe_gate_or_up_expert, so the regression is missed unless it is a literal .to('cuda').
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request updates the MoE expert merge tests to support code delegation by introducing a recursive source code inspection utility. This ensures that both entry points and their delegated helpers are correctly validated for device-agnostic implementations. A review comment suggested improving the robustness of the regular expression used to identify helper calls by allowing for optional whitespace.
| "_merge_moe_fused_gate_up_expert", | ||
| "_merge_moe_fused_down_proj_expert", | ||
| ] | ||
| _helper_call_re = re.compile(r"\b(_merge_moe_[A-Za-z0-9_]+)\(") |
There was a problem hiding this comment.
The regex for detecting helper calls is slightly fragile as it doesn't account for optional whitespace between the function name and the opening parenthesis. While the current codebase follows a strict style, making the regex more robust ensures the test doesn't fail on valid Python syntax variations (e.g., _merge_moe_helper (...)). This minor improvement is preferred as it increases robustness without requiring a large-scale refactor.
| _helper_call_re = re.compile(r"\b(_merge_moe_[A-Za-z0-9_]+)\(") | |
| _helper_call_re = re.compile(r"\b(_merge_moe_[A-Za-z0-9_]+)\s*\(") |
References
- The codebase follows a strict style regarding whitespace in function calls. (link)
- Fragile string-matching is acceptable for code patching only if it maintains consistency and avoids large-scale refactors.
Summary
_merge_moe_gate_or_up_expert; the drift detector intest_moe_expert_merges_call_active_merge_deviceonly inspected the wrapper source and so reported a false-positive regression._merge_moe_*helpers via a small regex-driven source-union helper, so the assertion still requires_active_merge_device()and forbids hardcoded.to('cuda', ...)across the dispatch surface that actually does the merge.Test plan
pytest tests/test_upstream_pinned_symbols_accelerator.py::test_moe_expert_merges_call_active_merge_device -xvs