feat: Support lora in dtensor grpo workflow[2/3]: sync and non-colocated setup by RayenTian · Pull Request #1751 · NVIDIA-NeMo/RL

RayenTian · 2026-01-09T08:16:34Z

What does this PR do?

This PR extends LoRA support in DTensor GRPO to non-colocated inference by synchronizing LoRA weights through the NCCL collective path. It removes the previous restriction that LoRA with the DTensor backend required colocated inference.
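To make the description above concrete, here is a minimal sketch of how a training worker might pick which tensors to send to a separate inference worker. It is not the code in this PR: `select_refit_tensors`, `sync_weights`, the `lora_only` flag, and the `lora_A`/`lora_B` name fragments are all hypothetical, and the `broadcast` callback stands in for whatever collective call the real NCCL path issues.

```python
from typing import Callable, Dict, List

# Hypothetical name fragments; real adapter key names depend on the LoRA implementation.
LORA_MARKERS = ("lora_A", "lora_B")


def select_refit_tensors(
    state_dict: Dict[str, List[float]], lora_only: bool
) -> Dict[str, List[float]]:
    """Return the tensors to send: everything, or only the LoRA adapter tensors."""
    if not lora_only:
        return dict(state_dict)
    return {
        name: tensor
        for name, tensor in state_dict.items()
        if any(marker in name for marker in LORA_MARKERS)
    }


def sync_weights(
    state_dict: Dict[str, List[float]],
    lora_only: bool,
    broadcast: Callable[[str, List[float]], None],
) -> List[str]:
    """Send each selected tensor in sorted-name order and return the names sent."""
    selected = select_refit_tensors(state_dict, lora_only)
    sent = []
    # Every rank in a collective must issue the same calls in the same order,
    # so the iteration order is made deterministic here.
    for name in sorted(selected):
        broadcast(name, selected[name])
        sent.append(name)
    return sent
```

Passing `lora_only=True` with a callback that appends to a list shows that only the adapter tensors are sent, which is the main saving when the base model weights are frozen.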

Base Branch: ruit/lora_grpo_sync_colocated
Source Branch: ruit/lora_grpo_sync_non_colocated

Issues

[2/3] of #1597

Subsequent PRs

#1752

Usage

bash tests/functional/grpo_automodel_lora_non_colocated.sh

Result

Non Co-located + Sync

Qwen/Qwen3-0.6B

Llama-3.2-3B-Instruct

Llama-3.1-8B

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

...

github-actions · 2026-01-09T08:17:08Z

ℹ️ File Consistency Check

Check based on commit: 3f5b2b5 (PR #1751 from ruit/lora_grpo_sync_non_colocated)

✅ DTensor Policy Worker Synchronization Check

Both DTensor policy worker files were modified in this PR:

nemo_rl/models/policy/workers/dtensor_policy_worker.py
nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py

Please ensure that the changes are consistent between both files where applicable.

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.
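The idea behind such a check can be sketched in a few lines. This is an illustration under assumptions, not the actual workflow that posts this comment: `check_paired_files` and its return values are hypothetical, and only the two file paths come from the comment itself.

```python
from typing import Iterable, Tuple

# The two worker files named in the bot comment.
PAIRED_FILES: Tuple[str, str] = (
    "nemo_rl/models/policy/workers/dtensor_policy_worker.py",
    "nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py",
)


def check_paired_files(
    changed_files: Iterable[str], pair: Tuple[str, str] = PAIRED_FILES
) -> str:
    """Classify a change set as touching both, one, or none of the paired files."""
    changed = set(changed_files)
    touched = [path for path in pair if path in changed]
    if len(touched) == len(pair):
        return "both"  # both files changed; reviewer should confirm they stay consistent
    if touched:
        return "one"   # only one file changed; the other may have drifted
    return "none"      # neither file changed; nothing to report
```

In this PR the result would be "both", which matches the status the bot reports.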

github-actions · 2026-01-09T09:09:28Z

ℹ️ File Consistency Check

Check based on commit: ee34dcb (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-09T09:17:57Z

ℹ️ File Consistency Check

Check based on commit: 9693a4e (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-13T09:02:13Z

ℹ️ File Consistency Check

Check based on commit: c12e293 (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-13T09:16:33Z

ℹ️ File Consistency Check

Check based on commit: e37c9d9 (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-13T09:23:35Z

ℹ️ File Consistency Check

Check based on commit: ee92a4d (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-13T10:16:54Z

ℹ️ File Consistency Check

Check based on commit: 9a1b189 (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-14T09:55:24Z

ℹ️ File Consistency Check

Check based on commit: 2436d92 (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-15T08:37:58Z

ℹ️ File Consistency Check

Check based on commit: cfb4f10 (PR #1751 from ruit/lora_grpo_sync_non_colocated)

github-actions · 2026-01-15T09:25:42Z

ℹ️ File Consistency Check

Check based on commit: b880394 (PR #1751 from ruit/lora_grpo_sync_non_colocated)

RayenTian mentioned this pull request Jan 9, 2026

feat: Support lora for grpo workflow #1702

Closed

4 tasks

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from 3f5b2b5 to ee34dcb Compare January 9, 2026 09:08

RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Jan 9, 2026

RayenTian requested a review from terrykong January 9, 2026 09:33

RayenTian temporarily deployed to nemo-ci January 9, 2026 09:33 — with GitHub Actions Inactive

RayenTian requested review from joyang-nv and yuki-97 January 9, 2026 09:33

RayenTian marked this pull request as ready for review January 9, 2026 09:42

RayenTian requested review from a team as code owners January 9, 2026 09:42

RayenTian temporarily deployed to nemo-ci January 9, 2026 10:36 — with GitHub Actions Inactive

RayenTian force-pushed the ruit/lora_grpo_sync_colocated branch from a5f9691 to e06009f Compare January 13, 2026 08:45

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from 9693a4e to c12e293 Compare January 13, 2026 09:01

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from c12e293 to e37c9d9 Compare January 13, 2026 09:16

RayenTian force-pushed the ruit/lora_grpo_sync_colocated branch from e06009f to 0382317 Compare January 13, 2026 09:20

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from e37c9d9 to ee92a4d Compare January 13, 2026 09:23

RayenTian force-pushed the ruit/lora_grpo_sync_colocated branch from 0382317 to b8a8c5b Compare January 13, 2026 10:08

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from ee92a4d to 9a1b189 Compare January 13, 2026 10:16

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from 9a1b189 to 2cb73ba Compare January 13, 2026 11:49

RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 14, 2026

RayenTian temporarily deployed to nemo-ci January 14, 2026 02:41 — with GitHub Actions Inactive

RayenTian temporarily deployed to nemo-ci January 14, 2026 02:44 — with GitHub Actions Inactive

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from 0bf11eb to 2436d92 Compare January 14, 2026 09:54

RayenTian added 6 commits January 14, 2026 23:11

support lora grpo on sync + co-located config

a07d95e

Signed-off-by: ruit <ruit@nvidia.com>

fix: add refit parameters for megatron interface to keep consistance

91b0b33

Signed-off-by: ruit <ruit@nvidia.com>

skip lm_head when name mapping

a99c3c4

Signed-off-by: ruit <ruit@nvidia.com>

fix functional test

b1f1b5d

Signed-off-by: ruit <ruit@nvidia.com>

refactor: unify weight refitting parameters to use a single 'refit_mo…

c6f6d27

…de' argument Signed-off-by: ruit <ruit@nvidia.com>

refactor refit parameters, remove para control in grpo

6d75d86

Signed-off-by: ruit <ruit@nvidia.com>

RayenTian force-pushed the ruit/lora_grpo_sync_colocated branch from 48ac5b8 to 6d75d86 Compare January 15, 2026 07:11

move lora to vllm

ab6b375

Signed-off-by: ruit <ruit@nvidia.com>

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from 2436d92 to cfb4f10 Compare January 15, 2026 08:37

RayenTian added 4 commits January 15, 2026 01:06

refactor: simplify refit_policy_generation calls by removing unnecess…

ec85ee9

…ary parameters Signed-off-by: ruit <ruit@nvidia.com>

suport sync non-colocated

99fedb6

Signed-off-by: ruit <ruit@nvidia.com>

refactor: update weight refitting parameters to use a unified 'refit_…

f877f31

…mode' across multiple interfaces Signed-off-by: ruit <ruit@nvidia.com>

update functional test and fix base model weight refit flag

b880394

Signed-off-by: ruit <ruit@nvidia.com>

RayenTian force-pushed the ruit/lora_grpo_sync_non_colocated branch from cfb4f10 to b880394 Compare January 15, 2026 09:25

RayenTian mentioned this pull request Jan 15, 2026

feat: Support lora in dtensor grpo workflow[1/3]: sync and colocated setup #1748

Open

9 tasks

RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 15, 2026

RayenTian temporarily deployed to nemo-ci January 15, 2026 09:41 — with GitHub Actions Inactive

RayenTian temporarily deployed to nemo-ci January 15, 2026 11:56 — with GitHub Actions Inactive

RayenTian force-pushed the ruit/lora_grpo_sync_colocated branch from ec85ee9 to 65a8e24 Compare January 19, 2026 03:07

RayenTian mentioned this pull request Jan 28, 2026

feat: Support lora in dtensor grpo workflow by merging weight #1797

Merged

terrykong mentioned this pull request Feb 2, 2026

LoRa DTensor GPRO #1597

Closed

RayenTian wants to merge 11 commits into ruit/lora_grpo_sync_colocated from ruit/lora_grpo_sync_non_colocated
