feat: Support LoRA in dtensor GRPO workflow [3/3]: async vLLM #1752

RayenTian wants to merge 4 commits into `ruit/lora_grpo_sync_non_colocated`.
Conversation
Signed-off-by: ruit <ruit@nvidia.com>
…mode' across multiple interfaces
What does this PR do?
Supports the async vLLM configuration in the DTensor LoRA GRPO workflow.
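The PR body does not yet include a usage snippet. As a rough sketch only, a combined async + LoRA GRPO run might be configured along these lines; every key name below is an illustrative assumption, not the project's actual config schema:

```yaml
# Hypothetical config sketch -- key names are illustrative assumptions,
# not the real NeMo RL schema.
grpo:
  async_grpo:
    enabled: true          # decouple rollout generation from training
policy:
  dtensor_cfg:
    enabled: true          # DTensor training backend
    lora:
      enabled: true        # train LoRA adapters instead of full weights
      rank: 8
  generation:
    backend: vllm
    vllm_cfg:
      async_engine: true   # async vLLM generation engine
```

The intent of the sketch is just to show the three features this PR series combines (DTensor backend, LoRA, async vLLM) being enabled together; consult the repository's example configs for the real keys.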
TODOs
Issues
[3/3] of #1597
closes #1597
Usage
# Add a code snippet demonstrating how to use this

Result
Async:
- Qwen/Qwen3-0.6B
- Llama-3.2-3B-Instruct
- Llama-3.1-8B
Before your PR is "Ready for review"
Pre checks:
Additional Information