feat: Support lora in dtensor grpo workflow [2/3]: sync and non-colocated setup #1751
RayenTian wants to merge 11 commits into ruit/lora_grpo_sync_colocated
ℹ️ File Consistency Check
Check based on commit: 3f5b2b5 (PR #1751)
✅ DTensor Policy Worker Synchronization Check
Both DTensor policy worker files were modified in this PR.
Please ensure that the changes are consistent between both files where applicable. This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.
Signed-off-by: ruit <ruit@nvidia.com>
…de' argument Signed-off-by: ruit <ruit@nvidia.com>
…ary parameters Signed-off-by: ruit <ruit@nvidia.com>
…mode' across multiple interfaces Signed-off-by: ruit <ruit@nvidia.com>
What does this PR do?
This PR extends LoRA support in the DTensor GRPO workflow to non-colocated inference by synchronizing LoRA weights through the NCCL collective path. It removes the previous restriction that LoRA with the DTensor backend required colocated inference.
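The change in behavior described above can be pictured with a small, self-contained sketch. It is not code from this PR: `SetupSketch`, `select_weight_sync_path`, the `lora_over_nccl_supported` flag, and the returned labels are all hypothetical names chosen for illustration, and the assumption that a colocated setup does not need a cross-cluster collective is ours, not something the description states.

```python
from dataclasses import dataclass


@dataclass
class SetupSketch:
    """Hypothetical, simplified view of the two settings this PR is about."""

    lora_enabled: bool
    colocated_inference: bool


def select_weight_sync_path(
    setup: SetupSketch, lora_over_nccl_supported: bool = True
) -> str:
    """Return an illustrative label for the weight-sync path a setup would use.

    lora_over_nccl_supported=False mimics the restriction the PR text calls
    "previous"; True mimics the behavior described for this PR.
    """
    if setup.colocated_inference:
        # Assumption: training and inference share devices here, so no
        # collective between separate clusters is required.
        return "colocated"
    if setup.lora_enabled and not lora_over_nccl_supported:
        # The old behavior: LoRA plus non-colocated inference was rejected.
        raise ValueError(
            "LoRA with the DTensor backend requires colocated inference"
        )
    # Non-colocated inference: weights (including LoRA weights after this PR)
    # are synchronized through the NCCL collective path.
    return "nccl_collective"


if __name__ == "__main__":
    setup = SetupSketch(lora_enabled=True, colocated_inference=False)
    print(select_weight_sync_path(setup))  # prints "nccl_collective"
```

Calling the function with `lora_over_nccl_supported=False` on the same setup raises `ValueError`, which is the kind of early rejection the description says this PR removes.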
Issues
[2/3] of #1597
Subsequent PRs
#1752
Usage
Result
Non Co-located + Sync
Qwen/Qwen3-0.6B
Llama-3.2-3B-Instruct
Llama-3.1-8B
Before your PR is "Ready for review"
Pre checks:
Additional Information