[sync] FP8 params support for megatron-fsdp (MXFP8/Blockwise)#2135
ananthsub merged 3 commits into NVIDIA-NeMo:main
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
/ok to test 3f551eb
📝 Walkthrough
The changes add validation logic to the Megatron FSDP configuration that disables two gradient buffer reuse flags.
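A minimal sketch of the kind of validation the walkthrough describes, not the merged code. The class name, the `use_megatron_fsdp` field, and both flag names are hypothetical placeholders, because the real flag names are cut off in the walkthrough text.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the real configuration object."""

    use_megatron_fsdp: bool = False
    # Placeholder names: the two real flag names are not given in the walkthrough.
    reuse_grad_buf_flag_a: bool = True
    reuse_grad_buf_flag_b: bool = True

    def validate(self) -> None:
        # When Megatron FSDP is enabled, force both reuse flags off.
        if self.use_megatron_fsdp:
            self.reuse_grad_buf_flag_a = False
            self.reuse_grad_buf_flag_b = False


# Usage: flags requested as True are overridden once FSDP is enabled.
cfg = SketchConfig(use_megatron_fsdp=True)
cfg.validate()

# Without FSDP the flags are left as the user set them.
plain = SketchConfig(use_megatron_fsdp=False)
plain.validate()
```

Overriding the flags during validation, rather than raising an error, matches the "forces ... false" wording of the test name quoted in the review comment; whether the real code also logs a warning is not stated here.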
Estimated code review effort: 🎯 2 (Simple), ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 4 passed
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/unit_tests/training/test_config.py`:
- Around line 1049-1084: The test function
test_megatron_fsdp_forces_reuse_grad_buf_false currently declares an unused
monkeypatch parameter which triggers ruff ARG002; remove the monkeypatch
parameter from the function signature so the test becomes def
test_megatron_fsdp_forces_reuse_grad_buf_false(self): and ensure there are no
references to monkeypatch elsewhere in that test (no other changes needed since
it isn't used).
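The suggested fix can be sketched as follows. Only the method name comes from the review comment; the class name is hypothetical and the test body is left out because it is not shown in this conversation.

```python
import inspect


class TestConfigSketch:  # hypothetical class name; the real one is not shown
    def test_megatron_fsdp_forces_reuse_grad_buf_false(self):
        # Test body elided: it is not part of the review comment.
        # The only change is the signature: the unused monkeypatch
        # fixture is no longer requested, so ruff ARG002 does not fire.
        pass


# Inspect the signature to confirm that only `self` remains.
params = list(
    inspect.signature(
        TestConfigSketch.test_megatron_fsdp_forces_reuse_grad_buf_false
    ).parameters
)
```

pytest injects fixtures by parameter name, so dropping an unused `monkeypatch` parameter does not change how the test is collected or run.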
What does this PR do?
Sync argument validation from NVIDIA/Megatron-LM#2239
Changelog
GitHub Actions CI
See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.
Summary by CodeRabbit
Bug Fixes
Tests