Conversation
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
📝 WalkthroughWalkthroughAdds Penguin integration: a new environment implementation, executable alias, and registry entry; introduces a 3rdparty Penguin workspace with packaging files and an install probe; updates root pyproject for workspace/dependency wiring; and adds unit tests with fixture scaffolding and sanity JSON data. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test as Test/Caller
participant Penguin as Penguin Env
participant Engine as Penguin Engine (Async)
participant Server as HTTP Server
participant Nemo as NeMo RL Formatter
rect rgb(240,245,255)
note over Penguin: Initialization
Test->>Penguin: __init__(cfg)
Penguin->>Engine: setup(config)
Penguin->>Server: start()
Penguin-->>Test: ready
end
rect rgb(240,255,240)
note over Penguin,Engine: Rollouts
Test->>Penguin: run_rollouts(examples)
Penguin->>Engine: generate(examples)
Engine-->>Penguin: penguin_results
Penguin->>Nemo: _postprocess(penguin_results)
Nemo-->>Penguin: message_logs
Penguin-->>Test: message_logs
end
rect rgb(255,245,240)
note over Penguin: Teardown
Test->>Penguin: shutdown()
Penguin->>Server: stop()
Penguin-->>Test: done
end
sequenceDiagram
autonumber
participant Setup as setup.py
participant Submod as Penguin pyproject.toml
participant Cache as CACHED_DEPENDENCIES
participant Build as setuptools
Setup->>Submod: load dependencies
Submod-->>Setup: final_dependencies
Setup->>Cache: compare sets
alt mismatch
Setup-->>Setup: print diagnostics to stderr
Setup-->>Setup: exit(1)
else match
Setup-->>Setup: print consistency
Setup->>Build: setup(name, version, packages, install_requires)
Build-->>Setup: complete
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…n-env Signed-off-by: Brian Yu <bxyu@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
tests/unit/environments/test_penguin.py (2)
47-52: Strengthen the stub test with meaningful assertions.The test only prints
config_typeswithout asserting any behavior. Consider adding assertions to verify that Penguin is properly installed and accessible, or remove this test if it's truly a placeholder.Example improvement:
def test_penguin_stub_module(): - print(f"Penguin test successfully run! Penguin config_types module: {config_types}") + from penguin import config_types + assert hasattr(config_types, '__name__'), "config_types module should be importable" + print(f"Penguin test successfully run! Penguin config_types module: {config_types}")
38-44: Remove unnecessary# noqa: F401— theconfig_typesimport on line 39 is used in the test body, so the# noqa: F401can be removed.nemo_rl/environments/penguin.py (2)
46-47: Brittle file path assertion could break on file moves.The assertion verifies that
__file__ends with"nemo_rl/environments/penguin.py". This is fragile and will break if the file is moved or renamed.Consider one of the following alternatives:
- Remove the assertion if it's only for debugging.
- Use a more robust path resolution without hardcoded strings:
- RELATIVE_PATH = "nemo_rl/environments/penguin.py" - assert __file__.endswith(RELATIVE_PATH) - - self.rh.start( - global_config_dict_parser_config=GlobalConfigDictParserConfig( - dotenv_path=Path(__file__.removesuffix(RELATIVE_PATH)).absolute() - / "penguin_env.yaml", + self.rh.start( + global_config_dict_parser_config=GlobalConfigDictParserConfig( + dotenv_path=Path(__file__).parent.parent.parent.absolute() + / "penguin_env.yaml",
94-95: Stub health check always returns True without verification.The
health_checkmethod returnsTrueunconditionally. Consider adding actual health verification logic (e.g., checking ifself.rhis started, if the head server is reachable, etc.) to make this more useful for monitoring.Example:
def health_check(self) -> bool: - return True + try: + # Verify that RunHelper is initialized and server is accessible + return self.rh is not None and self.head_server_config is not None + except Exception: + return False
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (9)
3rdparty/Penguin-workspace/is_penguin_installed.py(1 hunks)3rdparty/Penguin-workspace/pyproject.toml(1 hunks)3rdparty/Penguin-workspace/setup.py(1 hunks)nemo_rl/distributed/ray_actor_environment_registry.py(1 hunks)nemo_rl/distributed/virtual_cluster.py(1 hunks)nemo_rl/environments/penguin.py(1 hunks)pyproject.toml(3 hunks)tests/unit/environments/penguin_test_data/test_penguin_sanity.json(1 hunks)tests/unit/environments/test_penguin.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts
Files:
nemo_rl/distributed/virtual_cluster.pynemo_rl/environments/penguin.py3rdparty/Penguin-workspace/is_penguin_installed.py3rdparty/Penguin-workspace/setup.pynemo_rl/distributed/ray_actor_environment_registry.pytests/unit/environments/test_penguin.py
nemo_rl/**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
nemo_rl/**/*.py: Do not set non-None configuration defaults in code; YAML is the single source of truth for defaults
Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Express configuration optionality via TypedDict using typing.NotRequired
When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code
For any class or function decorated with @ray.remote, add '# pragma: no cover' on the class/def line (and on remote functions)
Files:
nemo_rl/distributed/virtual_cluster.pynemo_rl/environments/penguin.pynemo_rl/distributed/ray_actor_environment_registry.py
🧬 Code graph analysis (4)
nemo_rl/environments/penguin.py (4)
nemo_rl/data/interfaces.py (1)
DatumSpec(32-40)nemo_rl/distributed/virtual_cluster.py (3)
_get_free_port_local(76-84)_get_node_ip_local(69-73)shutdown(410-429)nemo_rl/environments/interfaces.py (1)
EnvironmentInterface(52-88)tests/unit/environments/test_penguin.py (1)
penguin(80-129)
3rdparty/Penguin-workspace/is_penguin_installed.py (1)
tests/unit/environments/test_penguin.py (1)
penguin(80-129)
nemo_rl/distributed/ray_actor_environment_registry.py (1)
nemo_rl/distributed/virtual_cluster.py (1)
PY_EXECUTABLES(42-61)
tests/unit/environments/test_penguin.py (3)
nemo_rl/distributed/ray_actor_environment_registry.py (1)
get_actor_python_env(49-64)nemo_rl/environments/penguin.py (6)
Penguin(32-163)PenguinConfig(25-28)setup_penguin_config(171-180)shutdown(154-155)health_check(94-95)run_rollouts(97-105)nemo_rl/models/generation/vllm/vllm_generation.py (1)
VllmGeneration(47-838)
🪛 Ruff (0.13.3)
nemo_rl/environments/penguin.py
73-73: Possible binding to all interfaces
(S104)
171-171: Unused function argument: tokenizer
(ARG001)
3rdparty/Penguin-workspace/is_penguin_installed.py
15-15: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
18-18: Do not catch blind exception: Exception
(BLE001)
3rdparty/Penguin-workspace/setup.py
51-53: Avoid specifying long messages outside the exception class
(TRY003)
tests/unit/environments/test_penguin.py
32-32: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
35-35: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
39-39: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
175-175: Ambiguous variable name: l
(E741)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Post submodule check comment / Comment on PR
- GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (4)
tests/unit/environments/test_penguin.py (1)
144-178: LGTM! Well-structured sanity test with proper nondeterministic field handling.The test correctly aligns generation parameters, runs rollouts, and validates results while appropriately excluding nondeterministic fields (token_ids, generation_logprobs) from comparison. The standardization approach is sound.
nemo_rl/environments/penguin.py (3)
72-75: Binding to 0.0.0.0 is acceptable for internal Ray communication.Line 73 binds the head server to
0.0.0.0, which Ruff flags as a security concern (S104). However, this is acceptable for internal Ray actor communication within a controlled cluster environment.
119-125: Good assertion for token contiguity with clear diagnostic message.The assertion verifies that tokens are contiguous across messages and provides a detailed diagnostic message explaining potential causes (tokenization issues, history truncation). This is excellent for debugging.
189-202: LGTM! Appropriate conversion to DatumSpec format.The function correctly formats Penguin examples into NeMo RL's
DatumSpecstructure, including the fake user message and emptytoken_idslist for GRPO compatibility. The comments clearly explain the purpose of each field.
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use thisBefore your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit