Conversation
📝 Walkthrough

Refactors config flag access in Policy.__init__ to explicit key checks for Megatron and DTensor. In save_checkpoint, distinguishes DTensor v2 from non-v2 via new use_dtensor/use_dtensor_v2 flags, routing to v2-specific checkpointing only when applicable; otherwise preserves the existing non-v2 path and safetensors validation.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Policy
    participant Config
    participant DTensorV2 as DTensor v2 Checkpointer
    participant LegacySave as Non-v2 Save Path
    participant SafeTensors as Safetensors Validator
    User->>Policy: save_checkpoint()
    Policy->>Config: Read dtensor_cfg.enabled and _v2
    alt DTensor enabled and _v2 true
        Policy->>DTensorV2: save checkpoint (v2)
        DTensorV2-->>Policy: result
    else DTensor not enabled or _v2 false
        Policy->>SafeTensors: validate (if safetensors)
        SafeTensors-->>Policy: ok/error
        Policy->>LegacySave: save checkpoint (non-v2)
        LegacySave-->>Policy: result
    end
    Policy-->>User: completion
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nemo_rl/models/policy/lm_policy.py (1)
649-667: Reuse __init__-stored DTensor flags and declare `_v2` in the TypedDict to avoid KeyError

`DTensorConfig` (nemo_rl/models/policy/__init__.py, `DTensorConfig` at ~lines 20–22) currently defines `enabled` and `env_vars` but not `_v2`. lm_policy.py (around lines 649–667) indexes into `cfg["dtensor_cfg"]["_v2"]` and recomputes flags, which risks a KeyError and duplicates validation.

- Replace the local recomputation in nemo_rl/models/policy/lm_policy.py with the __init__-stored flag:

```diff
- use_dtensor = "dtensor_cfg" in self.cfg and self.cfg["dtensor_cfg"]["enabled"]
- use_dtensor_v2 = use_dtensor and self.cfg["dtensor_cfg"]["_v2"]
+ use_dtensor_v2 = getattr(self, "dtensor_v2", False)
```

- Set the flags once in __init__ after computing dtensor_enable / use_v2:

```python
self.dtensor_enabled = dtensor_enable
self.dtensor_v2 = use_v2 if dtensor_enable else False
```

- Make `_v2` explicit in nemo_rl/models/policy/__init__.py::DTensorConfig: add either `_v2: bool` if mandatory, or `_v2: NotRequired[bool]` if optional; alternatively change uses to `cfg.get("_v2", False)`.
🧹 Nitpick comments (2)
nemo_rl/models/policy/lm_policy.py (2)
80-82: Raise clear errors when backend `enabled` flags are missing.

Direct indexing will raise a KeyError. Fail fast with a descriptive message to align with "no hidden defaults" while improving UX.

```diff
- megatron_enable = "megatron_cfg" in config and config["megatron_cfg"]["enabled"]
- dtensor_enable = "dtensor_cfg" in config and config["dtensor_cfg"]["enabled"]
+ if "megatron_cfg" in config and "enabled" not in config["megatron_cfg"]:
+     raise ValueError("policy.megatron_cfg.enabled must be set (true/false).")
+ if "dtensor_cfg" in config and "enabled" not in config["dtensor_cfg"]:
+     raise ValueError("policy.dtensor_cfg.enabled must be set (true/false).")
+ megatron_enable = "megatron_cfg" in config and config["megatron_cfg"]["enabled"]
+ dtensor_enable = "dtensor_cfg" in config and config["dtensor_cfg"]["enabled"]
```
103-106: Outdated comment and missing-key ergonomics for `_v2`.

The comment claims a default, but the code now requires explicit presence. Make the requirement explicit and raise a helpful error if `_v2` is absent (or not a bool).

```diff
- # Check if _v2 is enabled in dtensor_cfg (defaults to False for backward compatibility)
- use_v2 = config["dtensor_cfg"]["_v2"]
+ # Require explicit DTensor version selection to avoid silent fallback.
+ if "_v2" not in config["dtensor_cfg"]:
+     raise ValueError(
+         "policy.dtensor_cfg._v2 must be set: true for DTensor v2, false for v1."
+     )
+ use_v2 = config["dtensor_cfg"]["_v2"]
+ if not isinstance(use_v2, bool):
+     raise TypeError("policy.dtensor_cfg._v2 must be a bool.")
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
nemo_rl/models/policy/lm_policy.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts
Files:
nemo_rl/models/policy/lm_policy.py
nemo_rl/**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
nemo_rl/**/*.py: Do not set non-None configuration defaults in code; YAML is the single source of truth for defaults
Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Express configuration optionality via TypedDict using typing.NotRequired
When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code
For any class or function decorated with @ray.remote, add '# pragma: no cover' on the class/def line (and on remote functions)
Files:
nemo_rl/models/policy/lm_policy.py
🧠 Learnings (1)
📚 Learning: 2025-09-20T14:58:45.460Z
Learnt from: CR
PR: NVIDIA-NeMo/RL#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.460Z
Learning: Applies to nemo_rl/**/*.py : Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Applied to files:
nemo_rl/models/policy/lm_policy.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: build-container / main
- GitHub Check: sphinx-build / Build docs
- GitHub Check: Lint check
- GitHub Check: Lint check
- GitHub Check: Post automodel integration comment / Comment on PR
- GitHub Check: Post submodule check comment / Comment on PR
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Force-pushed d18bc5c to 3bd4ae9
Closed as this will be done in #1024.
```diff
- megatron_enable = bool(config.get("megatron_cfg", {}).get("enabled", False))
- dtensor_enable = bool(config.get("dtensor_cfg", {}).get("enabled", False))
+ megatron_enable = "megatron_cfg" in config and config["megatron_cfg"]["enabled"]
```
Will `enabled` be sure to exist? This is different from the original code, which defaults to False if `enabled` doesn't exist.
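The reviewer's concern can be shown with a toy config that contains `megatron_cfg` but omits `enabled`:

```python
config = {"megatron_cfg": {}}  # "enabled" key missing

# Original pattern: a missing "enabled" silently defaults to False.
megatron_enable = bool(config.get("megatron_cfg", {}).get("enabled", False))
print(megatron_enable)  # False

# New pattern: megatron_cfg present without "enabled" raises KeyError.
try:
    megatron_enable = "megatron_cfg" in config and config["megatron_cfg"]["enabled"]
except KeyError as exc:
    print(f"KeyError: {exc}")  # KeyError: 'enabled'
```

So the two versions only agree when `enabled` is always present, which is exactly what the explicit-key approach forces the YAML to guarantee.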
```diff
- # Check if _v2 is enabled in dtensor_cfg (defaults to False for backward compatibility)
- use_v2 = config.get("dtensor_cfg", {}).get("_v2", False)
+ use_v2 = config["dtensor_cfg"]["_v2"]
```
Similar, I guess you want to ensure this field must be set.

Remove some default values in code. Raise an error when forgetting to set `_v2` in config instead of using DTensor v1 by default. `_v2` is set to true in example configs since it's the recommended one, and set to false in all recipes to keep them the same as before.