
Draft: Feat: Enable torch compile#496

Closed
jiemingz wants to merge 1 commit into main from jiemingz/torch_compile

Conversation

@jiemingz
Contributor

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>
@jiemingz jiemingz requested a review from terrykong June 10, 2025 14:01
@jiemingz jiemingz self-assigned this Jun 10, 2025
Collaborator


Could you add this key to all the configs/recipes?

Collaborator

@terrykong terrykong left a comment


Is this possible to unit test?

@terrykong
Collaborator

@jiemingz is the only thing blocking this PR the seq-packing change since we need static shapes for torch.compile?
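
For context on why static shapes matter here: by default, torch.compile specializes the compiled graph on input shapes, so every new sequence length can trigger a recompile. A minimal pure-Python sketch (hypothetical helper, not from this PR) of right-padding variable-length token sequences to one fixed length, so every batch presents the same static shape:

```python
# Hypothetical sketch (not from this PR): pad variable-length token
# sequences to one fixed length so every batch has the same shape,
# avoiding a torch.compile recompile per new sequence length.

PAD_ID = 0       # assumed padding token id
MAX_SEQLEN = 8   # assumed fixed sequence length

def pad_to_fixed(tokens: list[int], max_len: int = MAX_SEQLEN,
                 pad_id: int = PAD_ID) -> list[int]:
    """Right-pad (or truncate) a token list to exactly max_len."""
    if len(tokens) > max_len:
        return tokens[:max_len]
    return tokens + [pad_id] * (max_len - len(tokens))

batch = [pad_to_fixed(seq) for seq in [[5, 6, 7], [1, 2, 3, 4, 5]]]
# Every row now has length MAX_SEQLEN, i.e. a static shape.
```

The trade-off is wasted compute on padding tokens, which is what sequence packing is meant to recover.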

@terrykong
Collaborator

Dependent on #300

@SahilJain314
Contributor

Dtensor sequence packing has been merged. @ahmadki to support max-padding packed sequences in DTensor to enable torch.compile (fixed seqlen).

@ahmadki
Member

ahmadki commented Jul 24, 2025

> Dtensor sequence packing has been merged. @ahmadki to support max-padding packed sequences in DTensor to enable torch.compile (fixed seqlen).

tracking here
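
To illustrate the "max-padding packed sequences" idea being tracked: pack several sequences into fixed-capacity rows, then pad each row out to the full capacity, so the packed shape is constant regardless of how many sequences landed in each row. This is a hypothetical pure-Python sketch, not the DTensor implementation, and it assumes each individual sequence fits within the capacity:

```python
# Hypothetical sketch (not the DTensor implementation): greedily pack
# sequences into rows with a fixed token budget, then pad every row to
# that budget. The resulting shape is always (capacity,), so
# torch.compile sees a fixed seqlen.

PAD_ID = 0
CAPACITY = 8  # assumed fixed token budget per packed row

def pack_with_max_padding(seqs: list[list[int]],
                          capacity: int = CAPACITY) -> list[list[int]]:
    """Greedy first-fit packing; assumes len(seq) <= capacity for all seqs."""
    rows: list[list[int]] = []
    current: list[int] = []
    for seq in seqs:
        if len(current) + len(seq) > capacity:
            rows.append(current)
            current = []
        current.extend(seq)
    if current:
        rows.append(current)
    # Max-padding: every row is padded out to the full capacity.
    return [row + [PAD_ID] * (capacity - len(row)) for row in rows]
```

Compared with padding each sequence individually, packing wastes far fewer tokens while still keeping the shape static.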

        )

        if self.torch_compile:
            self.model = torch.compile(model)


Could you try model.compile() instead? That should fix the _orig_mod issue. This is also the recommended way of compiling a model now. We'll work on throwing warnings and publicizing to raise awareness on this.

@terrykong terrykong linked an issue Aug 7, 2025 that may be closed by this pull request
@euronymous-aithal
Contributor

@yuki-97 can you please review this and take it forward?
(QQ: @ffrujeri and @joyang-nv, not sure if the Automodel path already enables torch.compile; if so, we should close this.)

@joyang-nv
Member

@euronymous-aithal Hi Ashwath, Automodel indeed has torch.compile support already, but we have never enabled it within NeMo RL. I think it is best to land this once FP8 DTensor is ready (@RayenTian is working on it), since FP8 DTensor training requires enabling torch.compile. We still encounter a torch.compile bug when TP>1; still investigating.

@terrykong
Collaborator

Closing since this is w.r.t. v1 and we are currently centralizing efforts around v2. Reassigned #4 to @joyang-nv for now.

@terrykong terrykong closed this Oct 1, 2025


Development

Successfully merging this pull request may close these issues.

torch.compile for training

7 participants