
cp: use pydantic for yaml test validation (#1382) into r0.4.0 #1459

Merged
terrykong merged 2 commits into r0.4.0 from chtruong/cp-1382-r0.4.0
Nov 2, 2025

Conversation

Contributor

@chtruong814 chtruong814 commented Oct 31, 2025

What does this PR do ?

cp: use pydantic for yaml test validation (#1382) into r0.4.0

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

Release Notes

  • Configuration Updates

    • Enhanced configuration validation with improved error messaging for invalid configurations.
    • Updated generation and model configurations to support optional parameters and stricter type validation.
    • Refined recipe configurations for various training and evaluation scenarios.
  • Evaluation Support

    • Expanded evaluation configuration options for multiple benchmark datasets and custom math evaluation.
  • Internal Improvements

    • Optimized padding token handling across model generation pipeline.
    • Refined distributed training configuration options.
    • Updated tooling for configuration management and validation.

@chtruong814 chtruong814 requested review from a team as code owners October 31, 2025 23:12
@github-actions github-actions bot added documentation Improvements or additions to documentation CI Relating to CI labels Oct 31, 2025
@terrykong terrykong force-pushed the chtruong/cp-1382-r0.4.0 branch from ceb25fc to b7ef96f Compare November 1, 2025 06:22
@github-actions

github-actions bot commented Nov 1, 2025

ℹ️ File Consistency Check

Check based on commit: b7ef96f (PR #1459 from chtruong/cp-1382-r0.4.0)

✅ DTensor Policy Worker Synchronization Check

Both DTensor policy worker files were modified in this PR:

  • nemo_rl/models/policy/dtensor_policy_worker.py
  • nemo_rl/models/policy/dtensor_policy_worker_v2.py

Please ensure that the changes are consistent between both files where applicable.


This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

@github-actions

github-actions bot commented Nov 1, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: b7ef96f (PR #1459 from chtruong/cp-1382-r0.4.0)

✅ Submodules that are properly updated:

Automodel: ✅ PR branch is ahead of r0.4.0 branch (fast-forward)

All submodule changes look good! ✨

@terrykong terrykong force-pushed the chtruong/cp-1382-r0.4.0 branch from b7ef96f to 0a2fde0 Compare November 1, 2025 06:26
@github-actions github-actions bot removed the CI Relating to CI label Nov 1, 2025
@coderabbitai
Contributor

coderabbitai bot commented Nov 1, 2025

📝 Walkthrough

Walkthrough

This PR refactors configuration schemas across the codebase, introducing Disabled variants for policy configs, expanding eval dataset types, updating generation code to use internal _pad_token_id keys, converting config validation to Pydantic, and adding distillation support to CLI tooling and pre-commit hooks.

Changes

  • Pre-commit and CLI tooling (.pre-commit-config.yaml, tools/config_cli.py): Added distillation minimize-check entries to pre-commit hooks; wired the distillation algorithm to the new base config path in the CLI workflow.
  • Distillation configurations (examples/configs/distillation_math.yaml, examples/configs/distillation_math_megatron.yaml): Changed defer_fp32_logits from null to False in the Megatron config.
  • Base algorithm configs (examples/configs/{dpo.yaml, grpo_math_1B.yaml, sft.yaml, sft_openmathinstruct2_megatron.yaml, vlm_grpo_3B.yaml, vlm_grpo_3B_megatron.yaml}): Added defer_fp32_logits, env_vars, use_custom_fsdp, and hf_config_overrides fields across Megatron and DDP configurations.
  • GRPO and distillation recipes (examples/configs/recipes/llm/{distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml, grpo-llama3*.yaml, grpo-qwen2.5-*.yaml, grpo-llama3.2-1b-*.yaml}): Removed stop_token_ids from generation settings and save_period from checkpointing.
  • Data and environment typing (nemo_rl/data/__init__.py, nemo_rl/environments/math_environment.py): Introduced new EvalDataConfig TypedDict variants (MMLU, MMLUPro, AIME, GPQA, Math, LocalMath); removed MathDataConfig; updated optional-field typing for prompt_file, system_prompt_file, stop_strings, verifier_type.
  • Eval configuration (nemo_rl/evals/eval.py): Updated imports and TypedDict signatures to use EvalDataConfigType; introduced a _PassThroughMathConfig wrapper for backward compatibility.
  • Generation interfaces and typing (nemo_rl/models/generation/interfaces.py, docs/design-docs/generation.md): Introduced OptionalResourcesConfig; updated GenerationConfig to allow None for top_k, stop_token_ids, stop_strings, model_name; added an internal _pad_token_id field.
  • Generation initialization (nemo_rl/models/generation/__init__.py): Added a warning guard for _pad_token_id overrides before setting the internal padding key.
  • vLLM generation implementations (nemo_rl/models/generation/vllm/{vllm_generation.py, vllm_worker.py, vllm_worker_async.py}): Migrated from pad_token_id to _pad_token_id; made model_name required; switched top_k from safe-get to direct access.
  • Policy configuration types (nemo_rl/models/policy/__init__.py): Introduced Disabled variants (DTensorConfigDisabled, SequencePackingConfigDisabled, MegatronConfigDisabled, DynamicBatchingConfigDisabled); updated enabled flags to Literal[True]; added defer_fp32_logits and sequence_length_round; broadened optional-field typing; made optimizer/scheduler required in MegatronConfig.
  • Policy implementations (nemo_rl/models/policy/{lm_policy.py, megatron_policy_worker.py}): Updated pad_value_dict to use _pad_token_id; added an assertion requiring defer_fp32_logits: True when logprob_chunk_size is set; broadened the model_save_format guard logic.
  • Rollouts and loss functions (nemo_rl/experience/rollouts.py, nemo_rl/algorithms/loss_functions.py): Switched the padding-token source from the policy config to the tokenizer; updated the ratio_clip_c type to float | None.
  • Checkpoint utilities (nemo_rl/utils/checkpoint.py): Updated the CheckpointingConfig.model_save_format type to str | None.
  • Test updates (tests/unit/models/generation/{test_vllm_generation.py, test_vllm_large_model.py}, tests/unit/test_config_validation.py, tests/unit/test_recipes_and_test_suites.py): Migrated pad_token_id to _pad_token_id in async tests; refactored config validation to use a Pydantic TypeAdapter; introduced dynamic MasterConfig variant selection; removed the recipe merge test.
  • Project configuration (pyproject.toml): Renamed the Ruff section from [tool.ruff.per-file-ignores] to [tool.ruff.lint.per-file-ignores].

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Policy Config
    participant Generation Init
    participant vLLM Worker
    participant Output Processor

    User->>Policy Config: set generation.top_k (now int|None)
    Policy Config->>Generation Init: load config
    rect rgb(200, 220, 255)
        Note over Generation Init: Check if _pad_token_id exists<br/>Warn if override needed<br/>Set internal _pad_token_id
    end
    Generation Init->>vLLM Worker: pass config with _pad_token_id
    rect rgb(220, 200, 255)
        Note over vLLM Worker: Use _pad_token_id<br/>(not public pad_token_id)
    end
    vLLM Worker->>Output Processor: batch results with _pad_token_id
    Output Processor->>User: padded sequences

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Policy configuration refactoring (nemo_rl/models/policy/__init__.py): extensive TypedDict additions with Disabled variants and Literal-based enabled flags require careful verification of all field transitions and default behaviors.
  • Pad token key migration (vLLM generation stack, tests, policy): verify all references to pad_token_id vs _pad_token_id are consistent across padding, batching, and output handling code paths.
  • Config validation refactoring (tests/unit/test_config_validation.py): complete rewrite to Pydantic-based approach with dynamic MasterConfig variant selection; test file discovery and mapping logic require thorough review.
  • Data config typing changes (nemo_rl/data/__init__.py): removed MathDataConfig, introduced 6 new eval config TypedDicts with Literal constraints; verify all downstream code handles new EvalDataConfigType union correctly.
  • Megatron config assertions (nemo_rl/models/policy/megatron_policy_worker.py): new dependency requirement between defer_fp32_logits and logprob_chunk_size should be validated for backward compatibility impact.

Possibly related PRs

Suggested labels

cherry-pick, Run CICD, r0.4.0, CI:L1

Suggested reviewers

  • yuki-97
  • parthchadha

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 76.47%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
  • Test Results For Major Changes ⚠️ Warning: This PR contains major changes: a significant refactoring of the YAML config validation infrastructure to use Pydantic, extensive type-signature updates across multiple modules, new disabled configuration variants, and removal of existing test functions. The PR description, however, contains mostly placeholder content with an unchecked testing checklist item and no test results demonstrating that regressions have been avoided. For a cherry-pick of a substantial change like the move to Pydantic-based validation, the description should include test results showing the migration works and that existing configs continue to validate; if the refactoring affects performance or convergence, before-and-after metrics with configuration context should be provided as well.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check ✅ Passed: The title "cp: use pydantic for yaml test validation (#1382) into r0.4.0" accurately references a significant, real change: tests/unit/test_config_validation.py was substantially refactored to replace custom TypedDict validation with Pydantic-based validation using TypeAdapter. The changeset extends well beyond test validation, though, including type-system updates, configuration changes, generation-pipeline and padding-token modifications, and policy configuration restructuring, so the title captures one important aspect of the PR rather than its full scope.


Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong force-pushed the chtruong/cp-1382-r0.4.0 branch from 0a2fde0 to 7c7f221 Compare November 1, 2025 22:23
@terrykong terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Nov 1, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
nemo_rl/models/generation/__init__.py (1)

30-36: Add stacklevel to warning for better debugging.

The guarded pattern and warning are good, but the warning should include stacklevel=2 to point to the caller's line rather than the warning line itself.

Apply this diff:

     if "_pad_token_id" in config:
         warnings.warn(
             "'_pad_token_id' found in generation config and will be overridden with tokenizer.pad_token_id. "
             "Note: '_pad_token_id' is intended for internal use and has no effect when set in user-provided configs.",
             UserWarning,
+            stacklevel=2,
         )
     config["_pad_token_id"] = tokenizer.pad_token_id
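Outside the NeMo-RL code, the effect of stacklevel=2 is easy to demonstrate with a toy helper (names here are hypothetical): the recorded warning location becomes the caller's line rather than the line inside the helper.

```python
import warnings


def apply_pad_token(config: dict, pad_token_id: int) -> dict:
    """Toy version of the guarded override discussed in this review."""
    if "_pad_token_id" in config:
        # stacklevel=2 makes Python attribute the warning to the *caller's*
        # line, not this line inside the helper.
        warnings.warn(
            "'_pad_token_id' will be overridden with tokenizer.pad_token_id",
            UserWarning,
            stacklevel=2,
        )
    config["_pad_token_id"] = pad_token_id
    return config


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = apply_pad_token({"_pad_token_id": 99}, pad_token_id=0)

# caught[0].lineno now points at the apply_pad_token(...) call site above.
```

That is exactly what Ruff's B028 rule checks for, which is why the lint output below flags line 31.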
tests/unit/test_config_validation.py (1)

113-128: Drop the unused config_type.

config_type is assigned in every branch but never read, so it just trips lint (F841). Please remove the variable (and the assignments) to keep the test file clean.

-    master_config_class = None
-    config_type = None
+    master_config_class = None
@@
-        master_config_class = EvalMasterConfig
-        config_type = "eval"
+        master_config_class = EvalMasterConfig
@@
-        master_config_class = DistillationMasterConfig
-        config_type = "distillation"
+        master_config_class = DistillationMasterConfig
@@
-        master_config_class = DPOMasterConfig
-        config_type = "dpo"
+        master_config_class = DPOMasterConfig
@@
-        master_config_class = SFTMasterConfig
-        config_type = "sft"
+        master_config_class = SFTMasterConfig
@@
-        master_config_class = GRPOMasterConfig
-        config_type = "grpo"
+        master_config_class = GRPOMasterConfig
@@
-        master_config_class = RMMasterConfig
-        config_type = "rm"
+        master_config_class = RMMasterConfig
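One way to avoid the unused variable altogether is a prefix-to-class mapping, which also replaces the if/elif chain. The classes below are stand-ins; the real MasterConfig variants live in the modules listed under Code graph analysis:

```python
from typing import TypedDict


class SFTMasterConfig(TypedDict):
    sft: dict


class GRPOMasterConfig(TypedDict):
    grpo: dict


class DPOMasterConfig(TypedDict):
    dpo: dict


# Filename prefix -> MasterConfig variant; extend as new algorithms land.
_MASTER_CONFIGS = {
    "sft": SFTMasterConfig,
    "grpo": GRPOMasterConfig,
    "dpo": DPOMasterConfig,
}


def select_master_config(filename: str) -> type:
    """Pick the MasterConfig variant for a config file by its name prefix."""
    for prefix, config_cls in _MASTER_CONFIGS.items():
        if filename.startswith(prefix):
            return config_cls
    raise ValueError(f"no MasterConfig variant matches {filename!r}")
```

With the mapping, adding a new algorithm is a one-line change, and there is no parallel string to keep in sync with the selected class.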
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e941b4e and 7c7f221.

📒 Files selected for processing (42)
  • .pre-commit-config.yaml (1 hunks)
  • docs/design-docs/generation.md (1 hunks)
  • examples/configs/distillation_math.yaml (1 hunks)
  • examples/configs/distillation_math_megatron.yaml (1 hunks)
  • examples/configs/dpo.yaml (2 hunks)
  • examples/configs/grpo_math_1B.yaml (2 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-e2e.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v3.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-2n8g-fsdp2tp1-noncolocated.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-4n8g-fsdp2tp1-long.v3.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-fsdp2tp1.v3.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8sp-actckpt-long.v3.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8sp-actckpt.v3.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-qwen2.5-7b-instruct-4n8g-fsdp2tp4sp.v3.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-qwen2.5-7b-instruct-4n8g-megatron.yaml (0 hunks)
  • examples/configs/recipes/llm/grpo-qwen2.5-math-1.5b-instruct-1n8g-fsdp2tp1.v3.yaml (0 hunks)
  • examples/configs/sft.yaml (3 hunks)
  • examples/configs/sft_openmathinstruct2_megatron.yaml (1 hunks)
  • examples/configs/vlm_grpo_3B.yaml (1 hunks)
  • examples/configs/vlm_grpo_3B_megatron.yaml (1 hunks)
  • nemo_rl/algorithms/loss_functions.py (1 hunks)
  • nemo_rl/data/__init__.py (2 hunks)
  • nemo_rl/environments/math_environment.py (2 hunks)
  • nemo_rl/evals/eval.py (2 hunks)
  • nemo_rl/experience/rollouts.py (1 hunks)
  • nemo_rl/models/generation/__init__.py (2 hunks)
  • nemo_rl/models/generation/interfaces.py (2 hunks)
  • nemo_rl/models/generation/vllm/vllm_generation.py (4 hunks)
  • nemo_rl/models/generation/vllm/vllm_worker.py (2 hunks)
  • nemo_rl/models/generation/vllm/vllm_worker_async.py (2 hunks)
  • nemo_rl/models/policy/__init__.py (5 hunks)
  • nemo_rl/models/policy/lm_policy.py (2 hunks)
  • nemo_rl/models/policy/megatron_policy_worker.py (3 hunks)
  • nemo_rl/utils/checkpoint.py (2 hunks)
  • pyproject.toml (1 hunks)
  • tests/unit/models/generation/test_vllm_generation.py (1 hunks)
  • tests/unit/models/generation/test_vllm_large_model.py (1 hunks)
  • tests/unit/test_config_validation.py (1 hunks)
  • tests/unit/test_recipes_and_test_suites.py (1 hunks)
  • tools/config_cli.py (1 hunks)
💤 Files with no reviewable changes (12)
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-2n8g-fsdp2tp1-noncolocated.yaml
  • examples/configs/recipes/llm/grpo-qwen2.5-7b-instruct-4n8g-megatron.yaml
  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron.yaml
  • examples/configs/recipes/llm/grpo-qwen2.5-math-1.5b-instruct-1n8g-fsdp2tp1.v3.yaml
  • examples/configs/recipes/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8sp-actckpt.v3.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-4n8g-fsdp2tp1-long.v3.yaml
  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-fsdp2tp1.v3.yaml
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-e2e.yaml
  • examples/configs/recipes/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8sp-actckpt-long.v3.yaml
  • examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v3.yaml
  • examples/configs/recipes/llm/grpo-qwen2.5-7b-instruct-4n8g-fsdp2tp4sp.v3.yaml
🧰 Additional context used
📓 Path-based instructions (4)
docs/**/*.md

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

When a markdown doc under docs/**/*.md is added or renamed, update docs/index.md to include it in the appropriate section

Files:

  • docs/design-docs/generation.md
examples/configs/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

examples/configs/*.yaml: Exemplar configs under examples/configs/*.yaml must include documented defaults
When adding a new config key, reflect its recommended default in exemplar YAMLs under examples/configs/*.yaml

Files:

  • examples/configs/grpo_math_1B.yaml
  • examples/configs/vlm_grpo_3B.yaml
  • examples/configs/distillation_math.yaml
  • examples/configs/sft_openmathinstruct2_megatron.yaml
  • examples/configs/distillation_math_megatron.yaml
  • examples/configs/vlm_grpo_3B_megatron.yaml
  • examples/configs/dpo.yaml
  • examples/configs/sft.yaml
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts

Files:

  • nemo_rl/algorithms/loss_functions.py
  • nemo_rl/experience/rollouts.py
  • nemo_rl/models/policy/lm_policy.py
  • tests/unit/models/generation/test_vllm_generation.py
  • nemo_rl/models/generation/__init__.py
  • tests/unit/test_config_validation.py
  • tools/config_cli.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/megatron_policy_worker.py
  • nemo_rl/utils/checkpoint.py
  • nemo_rl/environments/math_environment.py
  • tests/unit/models/generation/test_vllm_large_model.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
  • nemo_rl/data/__init__.py
  • nemo_rl/models/generation/vllm/vllm_generation.py
  • nemo_rl/models/generation/interfaces.py
  • nemo_rl/evals/eval.py
  • tests/unit/test_recipes_and_test_suites.py
  • nemo_rl/models/policy/__init__.py
nemo_rl/**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

nemo_rl/**/*.py: Do not set non-None configuration defaults in code; YAML is the single source of truth for defaults
Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Express configuration optionality via TypedDict using typing.NotRequired
When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code
For any class or function decorated with @ray.remote, add '# pragma: no cover' on the class/def line (and on remote functions)

Files:

  • nemo_rl/algorithms/loss_functions.py
  • nemo_rl/experience/rollouts.py
  • nemo_rl/models/policy/lm_policy.py
  • nemo_rl/models/generation/__init__.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/megatron_policy_worker.py
  • nemo_rl/utils/checkpoint.py
  • nemo_rl/environments/math_environment.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
  • nemo_rl/data/__init__.py
  • nemo_rl/models/generation/vllm/vllm_generation.py
  • nemo_rl/models/generation/interfaces.py
  • nemo_rl/evals/eval.py
  • nemo_rl/models/policy/__init__.py
🧠 Learnings (15)
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to examples/configs/recipes/**/*.{yaml,sh} : Known exception: Deepscaler recipes may encode context length in place of the cluster tuple (e.g., grpo-deepscaler-1.5b-8K.*); allowed but document intended hardware in the script

Applied to files:

  • .pre-commit-config.yaml
  • tools/config_cli.py
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to examples/configs/recipes/vlm/*.yaml : VLM recipe YAML filenames must follow: vlm_<algo>-<model>-<nodes>n<gpus>g-<strategy>[-modifiers][.vN].yaml

Applied to files:

  • .pre-commit-config.yaml
  • tools/config_cli.py
  • tests/unit/test_recipes_and_test_suites.py
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to examples/configs/recipes/**/*.yaml : When adding support for a new model, add a recipe YAML under examples/configs/recipes/ in the appropriate domain (llm/ or vlm/) with the correct name

Applied to files:

  • .pre-commit-config.yaml
  • tools/config_cli.py
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to examples/configs/recipes/llm/*.yaml : LLM recipe YAML filenames must follow: <algo>-<model>-<nodes>n<gpus>g-<strategy-and-params>[-modifiers][-long][.vN].yaml

Applied to files:

  • .pre-commit-config.yaml
  • tools/config_cli.py
  • tests/unit/test_recipes_and_test_suites.py
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to examples/configs/recipes/**/*.yaml : Recipe YAMLs under examples/configs/recipes/** are runnable snapshots and may omit documentation

Applied to files:

  • .pre-commit-config.yaml
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to tests/test_suites/vlm/*.sh : VLM driver script filenames must mirror the YAML base name and follow the same pattern with .sh extension

Applied to files:

  • .pre-commit-config.yaml
📚 Learning: 2025-10-30T20:50:44.126Z
Learnt from: adil-a
Repo: NVIDIA-NeMo/RL PR: 1440
File: examples/configs/sft_automodel.yaml:48-58
Timestamp: 2025-10-30T20:50:44.126Z
Learning: In DTensor configurations for MoE (Mixture of Experts) models, expert_parallel_size and data_parallel_size can be applied together without multiplying the GPU requirements. Expert Parallelism (EP) only applies to MoE layers, while Data Parallelism/FSDP applies to non-MoE layers. Therefore, configurations like expert_parallel_size: 8 and data_parallel_size: 8 are valid on an 8-GPU cluster for MoE models.

Applied to files:

  • examples/configs/sft_openmathinstruct2_megatron.yaml
📚 Learning: 2025-09-10T05:29:34.349Z
Learnt from: bxyu-nvidia
Repo: NVIDIA-NeMo/RL PR: 1110
File: nemo_rl/models/generation/vllm/vllm_worker_async.py:98-105
Timestamp: 2025-09-10T05:29:34.349Z
Learning: In the _maybe_correct_merged_tokens function in nemo_rl/models/generation/vllm/vllm_worker_async.py, the loop condition `len(candidate_token_ids) < len(actual_token_ids) - 1` is intentionally designed to prevent accessing the final token in actual_token_ids, likely to handle specific tokenization edge cases in the vLLM HTTP server integration.

Applied to files:

  • tests/unit/models/generation/test_vllm_generation.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • tests/unit/models/generation/test_vllm_large_model.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
📚 Learning: 2025-09-10T05:34:35.406Z
Learnt from: bxyu-nvidia
Repo: NVIDIA-NeMo/RL PR: 1110
File: nemo_rl/models/generation/vllm/vllm_worker_async.py:346-359
Timestamp: 2025-09-10T05:34:35.406Z
Learning: In nemo_rl/models/generation/vllm/vllm_worker_async.py, the HTTP server intentionally uses different path structures: `/v1/chat/completions` is under the `/v1` prefix while `/tokenize` is at the root level without the `/v1` prefix. This is the intended design.

Applied to files:

  • tests/unit/models/generation/test_vllm_generation.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
📚 Learning: 2025-09-19T03:00:58.662Z
Learnt from: shuo-nvidia
Repo: NVIDIA-NeMo/RL PR: 1006
File: examples/configs/recipes/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-fsdp2tp1.v1.yaml:85-101
Timestamp: 2025-09-19T03:00:58.662Z
Learning: In distillation and GRPO configurations, max_new_tokens is intentionally set to the full context window (max_total_sequence_length) for consistency across the codebase. Overflow cases when prompt + generation tokens exceed max_model_len are handled by safeguards implemented in vllm_worker.py.

Applied to files:

  • tests/unit/models/generation/test_vllm_generation.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • tests/unit/models/generation/test_vllm_large_model.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to nemo_rl/**/*.py : When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code

Applied to files:

  • tests/unit/test_config_validation.py
  • nemo_rl/utils/checkpoint.py
  • nemo_rl/data/__init__.py
  • nemo_rl/models/generation/interfaces.py
  • nemo_rl/evals/eval.py
  • nemo_rl/models/policy/__init__.py
📚 Learning: 2025-09-18T14:57:31.003Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1006
File: nemo_rl/algorithms/distillation.py:312-354
Timestamp: 2025-09-18T14:57:31.003Z
Learning: The distillation algorithm's cluster setup logic is designed to follow the same patterns used in GRPO for handling distributed training clusters and resource allocation.

Applied to files:

  • tools/config_cli.py
📚 Learning: 2025-09-17T01:52:21.399Z
Learnt from: ffrujeri
Repo: NVIDIA-NeMo/RL PR: 1023
File: nemo_rl/utils/checkpoint.py:58-65
Timestamp: 2025-09-17T01:52:21.399Z
Learning: model_state_dict_keys is not intended to be part of the nemo-rl CheckpointingConfig TypedDict - it's handled at the automodel implementation layer, not as a general checkpointing configuration parameter.

Applied to files:

  • nemo_rl/utils/checkpoint.py
  • nemo_rl/models/policy/__init__.py
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to nemo_rl/**/*.py : Express configuration optionality via TypedDict using typing.NotRequired

Applied to files:

  • nemo_rl/environments/math_environment.py
  • nemo_rl/data/__init__.py
  • nemo_rl/models/generation/interfaces.py
  • nemo_rl/models/policy/__init__.py
📚 Learning: 2025-09-19T02:44:38.451Z
Learnt from: shuo-nvidia
Repo: NVIDIA-NeMo/RL PR: 1006
File: examples/configs/recipes/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-fsdp2tp1.v1.yaml:73-84
Timestamp: 2025-09-19T02:44:38.451Z
Learning: The scheduler configuration format with a separate "milestones: [20]" entry (not wrapped under name/kwargs) is a valid and established pattern used across GRPO, DPO, and distillation configs in the NeMo RL codebase. This format specifies transition points between different schedulers (e.g., LinearLR for warmup steps, then ConstantLR).

Applied to files:

  • nemo_rl/models/policy/__init__.py
🧬 Code graph analysis (10)
nemo_rl/experience/rollouts.py (2)
tests/unit/environments/test_code_environment.py (1)
  • tokenizer (85-94)
tests/unit/environments/test_retriever.py (1)
  • tokenizer (84-93)
tests/unit/models/generation/test_vllm_generation.py (2)
tests/unit/environments/test_code_environment.py (1)
  • tokenizer (85-94)
tests/unit/environments/test_retriever.py (1)
  • tokenizer (84-93)
nemo_rl/models/generation/__init__.py (2)
tests/unit/models/generation/test_vllm_generation.py (1)
  • tokenizer (238-241)
tests/unit/models/generation/test_vllm_large_model.py (1)
  • tokenizer (82-85)
tests/unit/test_config_validation.py (4)
nemo_rl/evals/eval.py (1)
  • MasterConfig (57-63)
nemo_rl/algorithms/distillation.py (1)
  • MasterConfig (110-121)
nemo_rl/algorithms/grpo.py (1)
  • MasterConfig (161-169)
tools/config_cli.py (1)
  • load_config_with_inheritance (100-141)
nemo_rl/models/generation/vllm/vllm_worker.py (1)
nemo_rl/models/generation/interfaces.py (1)
  • verify_right_padding (23-99)
tests/unit/models/generation/test_vllm_large_model.py (2)
tests/unit/environments/test_code_environment.py (1)
  • tokenizer (85-94)
tests/unit/environments/test_retriever.py (1)
  • tokenizer (84-93)
nemo_rl/models/generation/vllm/vllm_worker_async.py (1)
nemo_rl/models/generation/interfaces.py (1)
  • verify_right_padding (23-99)
nemo_rl/data/__init__.py (1)
tests/unit/data/test_data_processor.py (1)
  • system_prompt_file (191-195)
nemo_rl/evals/eval.py (3)
nemo_rl/environments/math_environment.py (1)
  • MathEnvConfig (42-46)
nemo_rl/models/generation/interfaces.py (1)
  • GenerationConfig (118-131)
nemo_rl/models/policy/__init__.py (1)
  • TokenizerConfig (129-133)
nemo_rl/models/policy/__init__.py (2)
nemo_rl/models/generation/interfaces.py (1)
  • GenerationConfig (118-131)
nemo_rl/models/policy/megatron_policy_worker.py (1)
  • freeze_moe_router (251-263)
🪛 Ruff (0.14.2)
nemo_rl/models/generation/__init__.py

31-31: No explicit stacklevel keyword argument found

Set stacklevel=2

(B028)
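The B028 finding above can be illustrated with a minimal sketch. The helper below is hypothetical (not the actual code in `nemo_rl/models/generation/__init__.py`); it shows why `stacklevel=2` matters: without it, the deprecation warning is attributed to the library frame instead of the caller's line.

```python
import warnings


def configure_generation(cfg: dict) -> dict:
    """Hypothetical helper: drop a deprecated config key with a proper warning."""
    if "pad_token_id" in cfg:
        warnings.warn(
            "pad_token_id is deprecated; the tokenizer is now the source of truth",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the calling frame, not this one
        )
        cfg = {k: v for k, v in cfg.items() if k != "pad_token_id"}
    return cfg
```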

tests/unit/test_config_validation.py

49-49: Avoid specifying long messages outside the exception class

(TRY003)


104-104: Avoid specifying long messages outside the exception class

(TRY003)


127-127: Local variable config_type is assigned to but never used

Remove assignment to unused variable config_type

(F841)


129-131: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: build-container / main
  • GitHub Check: sphinx-build / Build docs
  • GitHub Check: Lint check
  • GitHub Check: Lint check
  • GitHub Check: Post submodule check comment / Comment on PR
  • GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (24)
pyproject.toml (1)

249-249: Configuration change is compatible with Ruff 0.9.9.

Ruff 0.9.9 supports the [tool.ruff.lint.per-file-ignores] configuration format, so the restructuring at line 249 is valid and will not cause configuration parsing issues.

tests/unit/test_recipes_and_test_suites.py (2)

39-39: No issues found with the "rm" algorithm mapping addition.

The entry "rm": "examples/configs/rm.yaml" at line 39 is correct. The referenced config file exists (6.7K), and supporting recipe files are present (examples/run_rm.py, examples/run_grpo_rm.py, examples/configs/grpo_rm_1B.yaml). The addition properly follows the established pattern in ALGO_MAPPING_TO_BASE_YAML and integrates cleanly with the existing codebase.


33-41: Test removal was intentional and replaced with comprehensive Pydantic validation.

The removed test function test_all_recipes_can_merge_configs_with_base_config has been replaced by the new Pydantic-based validation in test_config_validation.py. The new approach:

  1. Validates all config files using Pydantic's TypeAdapter against typed MasterConfig classes
  2. Handles config merging through load_config_with_inheritance before validation
  3. Provides comprehensive validation of the entire config structure, not just merge capability
  4. Maintains allowed additional keys via ALLOWED_ADDITIONAL_CONFIG_KEYS for edge cases

The removal is justified and part of a systematic shift to Pydantic-based validation.

nemo_rl/models/policy/megatron_policy_worker.py (1)

640-647: No config issues found—assertion will not cause runtime failures.

The only recipe with active logprob_chunk_size: 2048 (grpo-math-qwen3-30ba3b-megatron-tp4-32k.yaml) already sets defer_fp32_logits: true at line 31. All other configs have logprob_chunk_size: null, so they bypass the assertion. The assertion is compatible with all existing configurations.

examples/configs/grpo_math_1B.yaml (1)

66-66: LGTM: Explicit defaults improve config clarity.

These changes make configuration defaults explicit:

  • hf_config_overrides: {} clarifies an empty dict rather than null
  • defer_fp32_logits: False provides an explicit boolean default

This aligns with the coding guideline that exemplar configs should document defaults.

Also applies to: 106-106

examples/configs/vlm_grpo_3B.yaml (1)

96-96: LGTM: Explicit boolean default.

Setting defer_fp32_logits: False explicitly aligns with the pattern across other Megatron configs in this PR.

examples/configs/vlm_grpo_3B_megatron.yaml (1)

135-135: LGTM: Consistent with config standardization.

The explicit defer_fp32_logits: False default maintains consistency across all Megatron-based configurations.

tests/unit/models/generation/test_vllm_generation.py (1)

440-440: LGTM: Internal padding key convention.

The switch to _pad_token_id (with underscore prefix) indicates this is now an internal configuration key. The fallback to tokenizer.pad_token_id provides good defensive handling.

This aligns with similar changes in vllm_worker_async.py and lm_policy.py.

examples/configs/sft.yaml (1)

38-38: LGTM: Explicit defaults for DTensor and Megatron configs.

These additions make configuration defaults explicit:

  • env_vars: {} under both dtensor_cfg and megatron_cfg clarifies environment variable configuration
  • defer_fp32_logits: False provides an explicit Megatron default

This improves config clarity and aligns with the guideline that YAML is the single source of truth for defaults.

Also applies to: 77-77, 96-96

examples/configs/distillation_math.yaml (1)

106-106: LGTM: Explicit Megatron default.

The change to defer_fp32_logits: False maintains consistency with other Megatron configurations across the codebase.

nemo_rl/models/generation/vllm/vllm_worker_async.py (1)

531-531: LGTM: Consistent internal padding key usage.

Both changes adopt the _pad_token_id internal key convention:

  • Line 531: For padding verification
  • Line 639: For tensor initialization with padding

This is consistent with the same refactoring in test_vllm_generation.py (line 440) and lm_policy.py (line 585).

Also applies to: 639-639

nemo_rl/models/policy/lm_policy.py (2)

585-585: LGTM: Internal padding key for generation output.

Using _pad_token_id from the generation config maintains consistency with the internal key convention adopted across vllm_worker_async.py and test files.


735-740: Verify broadened checkpoint validation logic.

The condition changed from checking a specific value to checking if model_save_format is not None:

# Before (implied): checked if model_save_format == "safetensors" (or similar)
# After: checks if model_save_format is not None
if (
    checkpointing_cfg is not None
    and checkpointing_cfg.get("model_save_format", None) is not None
):
    raise ValueError(
        "model_save_format must be None or omitted if using DTensorPolicyWorker (_v2=False)."
    )

This now rejects any non-None model_save_format for DTensorPolicyWorker (v1). Ensure this is the intended behavior and won't break existing workflows that might pass other format values.

docs/design-docs/generation.md (1)

19-19: LGTM! Documentation aligns with code.

The type annotation update correctly reflects that top_k is optional in the generation configuration.

tests/unit/models/generation/test_vllm_large_model.py (1)

171-171: LGTM! Consistent with internal padding token refactoring.

The change to use "_pad_token_id" aligns with the broader refactoring across the vLLM generation path. The fallback to tokenizer.pad_token_id ensures robustness.

nemo_rl/models/generation/vllm/vllm_worker.py (2)

539-539: LGTM! Internal padding key usage.

The change to use self.cfg["_pad_token_id"] for padding validation is consistent with the broader refactoring to use internal padding token keys across the vLLM generation path.


573-573: LGTM! Internal padding key usage.

Using self.cfg["_pad_token_id"] for constructing padded output tensors aligns with the internal padding token key refactoring throughout the codebase.

tools/config_cli.py (1)

49-55: LGTM! Distillation support added correctly.

The addition follows the existing pattern for GRPO and correctly maps the distillation algorithm to its base configuration file. This enables the minimize workflow for distillation recipes.

nemo_rl/experience/rollouts.py (1)

163-163: LGTM! Simplified padding token source.

The change to directly use tokenizer.pad_token_id simplifies the padding logic by treating the tokenizer as the single source of truth for padding tokens in async generation paths.

nemo_rl/environments/math_environment.py (2)

18-18: LGTM! Correct import for TypedDict optionality.

Adding NotRequired enables proper expression of optional configuration fields in the TypedDict.


44-46: LGTM! Proper TypedDict optionality pattern.

The change from Optional[...] to NotRequired[... | None] correctly expresses configuration optionality using TypedDict semantics, allowing fields to be omitted while also supporting explicit None values. As per coding guidelines.

nemo_rl/models/generation/vllm/vllm_generation.py (3)

120-123: LGTM! Good defensive check with clear explanation.

The explicit check for model_name with a helpful comment correctly handles the case where this field is NotRequired in the base GenerationConfig but required by VllmGenerationWorker.


453-453: LGTM! Consistent internal padding key usage.

Both changes correctly use "_pad_token_id" for padding values when combining worker results, aligning with the internal padding token key refactoring across the vLLM generation path.

Also applies to: 504-504


92-92: Verify that top_k is always present in config.

The change from .get("top_k") to ["top_k"] now requires top_k to be present in the config (though it can be None). Please verify that configure_generation_config always populates this key before VllmGeneration initialization.

Comment on lines +42 to +43
# Dual-clipping value (should be >1 if enabled; usually set to 3 empirically). None to disable.
ratio_clip_c: float | None
⚠️ Potential issue | 🟠 Major

Use NotRequired[float] instead of float | None for optional configuration.

The coding guidelines specify: "Express configuration optionality via TypedDict using typing.NotRequired". The current implementation uses float | None, which means the key must be present in the config (but can be None), whereas NotRequired[float] indicates the key may be absent entirely.

As per coding guidelines.

Apply this diff to align with the coding guidelines:

-    # Dual-clipping value (should be >1 if enabled; usually set to 3 empirically). None to disable.
-    ratio_clip_c: float | None
+    # Dual-clipping value (should be >1 if enabled; usually set to 3 empirically). Omit or set to None to disable.
+    ratio_clip_c: NotRequired[float | None]

Additionally, update line 111 to handle the key's potential absence:

self.ratio_clip_c = cfg.get("ratio_clip_c", None)  # set to None to disable dual-clipping
🤖 Prompt for AI Agents
In nemo_rl/algorithms/loss_functions.py around lines 42-43, change the TypedDict
declaration for ratio_clip_c from a required field typed as "float | None" to an
optional field using "NotRequired[float]" (add/import NotRequired from typing)
so the config key may be omitted; then at line 111 update the assignment to pull
the value with cfg.get("ratio_clip_c", None) so missing keys default to None
(disable dual-clipping).

Comment on lines +125 to 129
top_k: int | None
model_name: NotRequired[str] # Not Required b/c GRPO writes this
stop_token_ids: list[int] | None
stop_strings: list[str] | None
colocated: NotRequired[ColocationConfig]
⚠️ Potential issue | 🔴 Critical

Restore optionality for stop_strings.

Switching stop_strings from NotRequired to a required key (even if None is allowed) will start failing every existing YAML config that omits this field once the new Pydantic validation runs. This PR is supposed to be a cherry-pick, so we cannot introduce a breaking config change here. Please keep stop_strings optional (e.g., stop_strings: NotRequired[list[str] | None]) so legacy configs continue to load, as per coding guidelines.

-    stop_strings: list[str] | None
+    stop_strings: NotRequired[list[str] | None]
Suggested change:

     top_k: int | None
     model_name: NotRequired[str]  # Not Required b/c GRPO writes this
     stop_token_ids: list[int] | None
-    stop_strings: list[str] | None
+    stop_strings: NotRequired[list[str] | None]
     colocated: NotRequired[ColocationConfig]
🤖 Prompt for AI Agents
In nemo_rl/models/generation/interfaces.py around lines 125 to 129, stop_strings
was changed from an optional key to a required key which will break existing
YAML configs that omit it; revert it to an optional field by declaring it as
NotRequired[list[str] | None] so the key may be absent while still allowing None
or a list when present, ensuring backward compatibility with legacy configs and
matching the existing pattern used for colocated/model_name.

Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 2, 2025
@terrykong terrykong enabled auto-merge (squash) November 2, 2025 03:43
@terrykong terrykong merged commit bed7217 into r0.4.0 Nov 2, 2025
40 of 41 checks passed
@terrykong terrykong deleted the chtruong/cp-1382-r0.4.0 branch November 2, 2025 09:05
Labels

CI:L1 Run doctests, unit tests, and functional tests documentation Improvements or additions to documentation
