Conversation
Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>
Could you add this key to all the configs/recipes?
terrykong left a comment
Is this possible to unit test?
@jiemingz is the only thing blocking this PR the seq-packing change, since we need static shapes for torch.compile?

Dependent on #300
DTensor sequence packing has been merged. @ahmadki to support max-padding packed sequences in DTensor to enable torch.compile (fixed seqlen).
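The fixed-seqlen idea above can be illustrated with a minimal sketch (the function name and pad token are hypothetical, not NeMo RL's actual API): pad every packed sequence to one fixed maximum length, so each batch has an identical shape and torch.compile can reuse a single compiled graph instead of recompiling on shape changes.

```python
# Hypothetical sketch: max-length padding for packed sequences.
# torch.compile specializes on tensor shapes, so padding every
# packed batch to one fixed length keeps shapes static across steps.

def pad_packed_sequence(token_ids, max_len, pad_id=0):
    """Pad a packed sequence (list of token ids) to a fixed length."""
    if len(token_ids) > max_len:
        raise ValueError(
            f"packed length {len(token_ids)} exceeds max_len {max_len}"
        )
    return token_ids + [pad_id] * (max_len - len(token_ids))

# Two packed batches of different raw lengths end up with the same
# static shape, so a compiled model sees exactly one input shape.
batch_a = pad_packed_sequence([1, 2, 3, 4], max_len=8)
batch_b = pad_packed_sequence([5, 6], max_len=8)
assert len(batch_a) == len(batch_b) == 8
```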
```python
if self.torch_compile:
    self.model = torch.compile(model)
```
Could you try model.compile() instead? That should fix the _orig_mod issue. This is also the recommended way of compiling a model now. We'll work on throwing warnings and publicizing to raise awareness on this.
@yuki-97 can you please review this and take it forward.
@euronymous-aithal Hi Ashwath, Automodel indeed already has torch.compile support, but we have never enabled it within NeMo RL. I think it is best to land this once FP8 DTensor is ready (@RayenTian is working on it), since FP8 DTensor training requires torch.compile to be enabled. We are still hitting a torch.compile bug when TP>1; still investigating.
Closing since this is w.r.t. v1 and we are currently centralizing efforts around v2. Reassigned #4 to @joyang-nv for now.
What does this PR do?
Add a one line overview of what this PR aims to accomplish.
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
Additional Information