
revert nemo-rl patch #871

Merged
Kipok merged 1 commit into main from sanyamk/revert-nemo-rl-patch
Sep 30, 2025

Conversation

@activatedgeek
Collaborator

@activatedgeek activatedgeek commented Sep 30, 2025

Summary by CodeRabbit

  • New Features

    • Added support for configuring LayerNorm epsilon via settings to better align model behavior with desired precision/stability.
  • Chores

    • Increased data loading parallelism to improve training throughput.
    • Adjusted batch handling to drop incomplete batches for more consistent training steps.

Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
@coderabbitai
Contributor

coderabbitai bot commented Sep 30, 2025

Walkthrough

Adds num_workers=10 to DataLoader calls in sft.py and toggles one DataLoader’s drop_last to True. In megatron_policy_worker.py, conditionally applies layernorm_epsilon from config to model configuration if provided.

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| Data loading configuration updates: dockerfiles/nemo-rl.patch.../sft.py | Set DataLoader num_workers=10 in both invocations; in the second invocation, drop_last changed from False to True. |
| Model configuration parameter application: dockerfiles/nemo-rl.patch.../megatron_policy_worker.py | After constructing model_cfg, apply layernorm_epsilon from megatron_cfg when present. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Trainer as Trainer (sft.py)
  participant Data as Dataset
  participant DL as DataLoader

  Trainer->>Data: Prepare dataset
  Trainer->>DL: Create DataLoader(dataset, num_workers=10, drop_last=[false|true])
  note over DL: drop_last toggled to True in one loader
  DL-->>Trainer: Batches
sequenceDiagram
  autonumber
  participant Worker as PolicyWorker
  participant Cfg as megatron_cfg
  participant Model as model_cfg

  Worker->>Cfg: Load config
  Worker->>Model: Build base model_cfg
  alt layernorm_epsilon present
    Worker->>Model: Set layernorm_epsilon
    note right of Model: Conditional application
  else absent
    Note right of Worker: Use default epsilon
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my whiskers at threads that spin,
Ten workers churn where batches begin.
Drop the last, we hop ahead—
Epsilon whispers to normalize the spread.
Carrot-shaped configs align just right,
Patch applied—onward, into the night! 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title Check | ⚠️ Warning | The title “revert nemo-rl patch” does not reflect the actual changes, which introduce num_workers to DataLoader calls and conditionally apply layernorm_epsilon in megatron_policy_worker.py rather than reverting a patch. | Please update the title to succinctly describe the primary changes, for example “Add num_workers to DataLoader calls and support conditional layernorm_epsilon in megatron_policy_worker.” |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74b8ed8 and 655196e.

📒 Files selected for processing (1)
  • dockerfiles/nemo-rl.patch (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit-tests
🔇 Additional comments (3)
dockerfiles/nemo-rl.patch (3)

9-9: Good call restoring multi-worker loading.

Line 9 reintroduces num_workers=10, which should recover the data-loading throughput lost in the earlier patch. Nice catch.
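For context, here is a minimal sketch of what the restored multi-worker loading looks like; the dataset, batch size, and loader name are placeholders for illustration, not the actual nemo-rl code:

```python
# Illustrative sketch only: dataset and batch size are made up,
# but the num_workers=10 setting mirrors what the patch restores.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(100).float())

# num_workers=10 spawns ten background worker processes that
# prefetch and collate batches, so training is not stalled on
# per-sample preprocessing or I/O in the main process.
train_loader = DataLoader(dataset, batch_size=8, num_workers=10)
```

With drop_last left at its default of False, this loader yields a final partial batch rather than discarding it.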


17-17: Confirm the implications of drop_last=True.

Line 17 now drops the tail batch whenever the resume loader length isn’t an exact multiple of the batch size, which would silently toss a handful of samples during recovery. Can you confirm the dataset cardinality guarantee (or otherwise justify the data loss) so we don’t regress evaluation coverage when resuming?
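To make the concern concrete, this sketch (with made-up numbers, not the actual dataset size) shows how many samples drop_last=True discards when the dataset length is not an exact multiple of the batch size:

```python
# Illustrative numbers only: N samples, batch size B.
# With drop_last=True, the final N % B samples are silently skipped.
import torch
from torch.utils.data import DataLoader, TensorDataset

N, B = 1003, 8
dataset = TensorDataset(torch.zeros(N))

keep_tail = DataLoader(dataset, batch_size=B, drop_last=False)
drop_tail = DataLoader(dataset, batch_size=B, drop_last=True)

print(len(keep_tail))  # 126 batches (last batch holds 3 samples)
print(len(drop_tail))  # 125 batches
print(N % B)           # 3 samples lost per epoch
```

The loss is bounded by B - 1 samples per epoch, which is usually negligible for training but worth confirming for a resume path that must cover every sample.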


29-30: Optional layernorm epsilon wiring looks solid.

Line 30 cleanly pipes the optional layernorm_epsilon through when present, letting us tune stability without touching defaults.
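The conditional wiring can be sketched as follows; the config classes and key names here mirror the description in the walkthrough but are simplified stand-ins, not the exact patch code:

```python
# Hedged sketch of conditionally applying layernorm_epsilon from a
# user config onto a model config. ModelCfg and the dict below are
# illustrative stand-ins for the real megatron objects.
from dataclasses import dataclass

@dataclass
class ModelCfg:
    layernorm_epsilon: float = 1e-5  # framework default

megatron_cfg = {"layernorm_epsilon": 1e-6}  # illustrative override

model_cfg = ModelCfg()
# Only overwrite the field when the key is actually set, so an
# absent key leaves the framework default untouched.
if megatron_cfg.get("layernorm_epsilon") is not None:
    model_cfg.layernorm_epsilon = megatron_cfg["layernorm_epsilon"]
```

The `is not None` guard is what keeps defaults intact: an unset or null key falls through to the framework's own epsilon.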



@activatedgeek activatedgeek requested a review from Kipok September 30, 2025 21:21
@activatedgeek activatedgeek marked this pull request as ready for review September 30, 2025 21:23
@Kipok Kipok merged commit e342ee8 into main Sep 30, 2025
6 checks passed
@Kipok Kipok deleted the sanyamk/revert-nemo-rl-patch branch September 30, 2025 21:28
wasiahmad pushed a commit that referenced this pull request Oct 1, 2025
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
SeanNaren pushed a commit to SeanNaren/NeMo-Skills that referenced this pull request Oct 9, 2025
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
SeanNaren pushed a commit that referenced this pull request Oct 9, 2025
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: dgitman <dgitman@nvidia.com>