
revert nemo-rl patch #871

Merged
Kipok merged 1 commit into main from sanyamk/revert-nemo-rl-patch
Sep 30, 2025

Conversation

@activatedgeek
Collaborator

@activatedgeek activatedgeek commented Sep 30, 2025

Summary by CodeRabbit

  • New Features

    • Added support for configuring LayerNorm epsilon via settings to better align model behavior with desired precision/stability.
  • Chores

    • Increased data loading parallelism to improve training throughput.
    • Adjusted batch handling to drop incomplete batches for more consistent training steps.

Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
@coderabbitai
Contributor

coderabbitai bot commented Sep 30, 2025

Walkthrough

Adds num_workers=10 to DataLoader calls in sft.py and toggles one DataLoader’s drop_last to True. In megatron_policy_worker.py, conditionally applies layernorm_epsilon from config to model configuration if provided.

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| Data loading configuration updates: dockerfiles/nemo-rl.patch.../sft.py | Set DataLoader num_workers=10 in both invocations; in the second invocation, drop_last changed from False to True. |
| Model configuration parameter application: dockerfiles/nemo-rl.patch.../megatron_policy_worker.py | After constructing model_cfg, apply layernorm_epsilon from megatron_cfg when present. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Trainer as Trainer (sft.py)
  participant Data as Dataset
  participant DL as DataLoader

  Trainer->>Data: Prepare dataset
  Trainer->>DL: Create DataLoader(dataset, num_workers=10, drop_last=[false|true])
  note over DL: drop_last toggled to True in one loader
  DL-->>Trainer: Batches
sequenceDiagram
  autonumber
  participant Worker as PolicyWorker
  participant Cfg as megatron_cfg
  participant Model as model_cfg

  Worker->>Cfg: Load config
  Worker->>Model: Build base model_cfg
  alt layernorm_epsilon present
    Worker->>Model: Set layernorm_epsilon
    note right of Model: Conditional application
  else absent
    Note right of Worker: Use default epsilon
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my whiskers at threads that spin,
Ten workers churn where batches begin.
Drop the last, we hop ahead—
Epsilon whispers to normalize the spread.
Carrot-shaped configs align just right,
Patch applied—onward, into the night! 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title Check | ⚠️ Warning | The title “revert nemo-rl patch” does not reflect the actual changes, which introduce num_workers to DataLoader calls and conditionally apply layernorm_epsilon in megatron_policy_worker.py rather than reverting a patch. | Please update the title to succinctly describe the primary changes, for example “Add num_workers to DataLoader calls and support conditional layernorm_epsilon in megatron_policy_worker.” |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74b8ed8 and 655196e.

📒 Files selected for processing (1)
  • dockerfiles/nemo-rl.patch (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit-tests
🔇 Additional comments (3)
dockerfiles/nemo-rl.patch (3)

9-9: Good call restoring multi-worker loading.

Line 9 reintroduces num_workers=10, which should recover the data-loading throughput lost in the earlier patch. Nice catch.
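For context, here is a minimal sketch of what the restored multi-worker loading looks like; the dataset, batch size, and loader name are placeholders for illustration, not the actual nemo-rl code:

```python
# Illustrative sketch only: dataset and batch size are made up,
# but the num_workers=10 setting mirrors what the patch restores.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(100).float())

# num_workers=10 spawns ten background worker processes that
# prefetch and collate batches, so training is not stalled on
# per-sample preprocessing or I/O in the main process.
train_loader = DataLoader(dataset, batch_size=8, num_workers=10)
```

With drop_last left at its default of False, this loader yields a final partial batch rather than discarding it.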


17-17: Confirm the implications of drop_last=True.

Line 17 now drops the tail batch whenever the resume loader length isn’t an exact multiple of the batch size, which would silently toss a handful of samples during recovery. Can you confirm the dataset cardinality guarantee (or otherwise justify the data loss) so we don’t regress evaluation coverage when resuming?
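To make the concern concrete, this sketch (with made-up numbers, not the actual dataset size) shows how many samples drop_last=True discards when the dataset length is not an exact multiple of the batch size:

```python
# Illustrative numbers only: N samples, batch size B.
# With drop_last=True, the final N % B samples are silently skipped.
import torch
from torch.utils.data import DataLoader, TensorDataset

N, B = 1003, 8
dataset = TensorDataset(torch.zeros(N))

keep_tail = DataLoader(dataset, batch_size=B, drop_last=False)
drop_tail = DataLoader(dataset, batch_size=B, drop_last=True)

print(len(keep_tail))  # 126 batches (last batch holds 3 samples)
print(len(drop_tail))  # 125 batches
print(N % B)           # 3 samples lost per epoch
```

The loss is bounded by B - 1 samples per epoch, which is usually negligible for training but worth confirming for a resume path that must cover every sample.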


29-30: Optional layernorm epsilon wiring looks solid.

Line 30 cleanly pipes the optional layernorm_epsilon through when present, letting us tune stability without touching defaults.
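The conditional wiring can be sketched as follows; the config classes and key names here mirror the description in the walkthrough but are simplified stand-ins, not the exact patch code:

```python
# Hedged sketch of conditionally applying layernorm_epsilon from a
# user config onto a model config. ModelCfg and the dict below are
# illustrative stand-ins for the real megatron objects.
from dataclasses import dataclass

@dataclass
class ModelCfg:
    layernorm_epsilon: float = 1e-5  # framework default

megatron_cfg = {"layernorm_epsilon": 1e-6}  # illustrative override

model_cfg = ModelCfg()
# Only overwrite the field when the key is actually set, so an
# absent key leaves the framework default untouched.
if megatron_cfg.get("layernorm_epsilon") is not None:
    model_cfg.layernorm_epsilon = megatron_cfg["layernorm_epsilon"]
```

The `is not None` guard is what keeps defaults intact: an unset or null key falls through to the framework's own epsilon.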



@activatedgeek activatedgeek requested a review from Kipok September 30, 2025 21:21
@activatedgeek activatedgeek marked this pull request as ready for review September 30, 2025 21:23
@Kipok Kipok merged commit e342ee8 into main Sep 30, 2025
6 checks passed
@Kipok Kipok deleted the sanyamk/revert-nemo-rl-patch branch September 30, 2025 21:28
wasiahmad pushed a commit that referenced this pull request Oct 1, 2025
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
SeanNaren pushed a commit to SeanNaren/NeMo-Skills that referenced this pull request Oct 9, 2025
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
SeanNaren pushed a commit that referenced this pull request Oct 9, 2025
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: dgitman <dgitman@nvidia.com>