ENH enable sandbox env overrides in generate#1107

Merged
gwarmstrong merged 4 commits into main from georgea/enable-sandbox-env-overrides
Dec 12, 2025

@gwarmstrong
Collaborator

@gwarmstrong gwarmstrong commented Dec 12, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added CLI option to supply custom environment variable overrides for sandbox configuration, enabling per-task environment customization.
  • Chores

    • Updated sandbox build installation logic to skip installation on ARM64 architecture and CI environments.

Signed-off-by: George Armstrong <georgea@nvidia.com>
@coderabbitai
Contributor

coderabbitai bot commented Dec 12, 2025

Walkthrough

This pull request adds support for sandbox environment variable overrides in the pipeline generation feature and modifies the Dockerfile to skip uv and dependency installation on arm64 architectures and in CI environments.
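
The installation guard described above can be read as a simple predicate. The following is a minimal Python sketch for illustration only: the real check is a shell conditional in dockerfiles/Dockerfile.sandbox, and `should_install_stem_deps` is a hypothetical name that does not appear in the PR.

```python
def should_install_stem_deps(github_ci: str, targetarch: str) -> bool:
    """Mirror the guard: install only outside CI and on non-arm64 targets."""
    return github_ci != "1" and targetarch != "arm64"


# Walk through the four combinations of interest.
for ci, arch in [("0", "amd64"), ("1", "amd64"), ("0", "arm64"), ("1", "arm64")]:
    result = should_install_stem_deps(ci, arch)
    print(f"GITHUB_CI={ci} TARGETARCH={arch} -> install={result}")
```

Only the first combination (not CI, not arm64) installs uv and the dependencies; either condition alone is enough to skip the step.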

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build Configuration: dockerfiles/Dockerfile.sandbox | Modified installation guard for uv and dependencies to check both GITHUB_CI != "1" and TARGETARCH != "arm64" before execution, skipping installation in CI environments and on arm64 targets. |
| Sandbox Environment Overrides: nemo_skills/pipeline/generate.py | Added sandbox_env_overrides: Optional[List[str]] parameter to _create_commandgroup_from_config function to accept KEY=VALUE environment variable overrides. Extended generate CLI function with sandbox_env_overrides option and propagates overrides through to sandbox metadata environment during command group creation when with_sandbox is True. |
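
The override propagation summarized above amounts to merging KEY=VALUE strings into an environment mapping. A minimal sketch under stated assumptions: `apply_env_overrides` is a hypothetical helper, not a function in nemo_skills, and unlike the PR's inline code it copies the mapping instead of mutating it in place.

```python
from typing import Any, Dict, List, Optional


def apply_env_overrides(
    metadata: Dict[str, Any], overrides: Optional[List[str]]
) -> Dict[str, Any]:
    """Merge KEY=VALUE strings into metadata["environment"]; later entries win."""
    if not overrides:
        return metadata
    # Copy so the caller's environment dict is left untouched.
    env = dict(metadata.get("environment", {}))
    for item in overrides:
        # Split on the first "=" only, so values may themselves contain "=".
        key, value = item.split("=", 1)
        env[key] = value
    return {**metadata, "environment": env}


meta = {"environment": {"A": "1"}}
merged = apply_env_overrides(meta, ["A=2", "B=x=y"])
print(merged["environment"])  # {'A': '2', 'B': 'x=y'}
```

Note that this sketch still assumes well-formed input; an entry without "=" would raise during unpacking.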

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

  • Verify that sandbox_env_overrides is correctly propagated through all relevant call sites of _create_commandgroup_from_config
  • Confirm the merge logic for environment overrides into sandbox metadata properly handles edge cases and doesn't inadvertently override critical defaults
  • Review the CLI parameter validation for the KEY=VALUE format specification

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: enabling sandbox environment overrides in the generate module, which aligns with the core functionality additions in both Dockerfile and generate.py. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dockerfiles/Dockerfile.sandbox (1)

78-82: Verify the impact of skipping dependencies on arm64 architectures.

The conditional installation guard excludes uv and STEM dependencies (/app/stem_requirements.txt) on arm64 architectures. This is problematic because the stem_requirements.txt file contains 150+ critical packages including torch, numpy, scipy, pandas, tensorflow, and others. These packages are actively imported throughout the codebase (e.g., in nemo_skills/training/nemo_rl/average_checkpoints.py, nemo_skills/inference/retrieve_similar.py, nemo_skills/inference/eval/scicode_utils.py, and other modules) and are not optional. While the dockerfiles/README.md documents how to build for arm64 architectures, it does not explain or justify the exclusion of these packages. Code execution requests that depend on numpy, torch, scipy, or other STEM libraries will fail at runtime on arm64 platforms without graceful error handling or a documented fallback mechanism.
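
One way to get the graceful error handling mentioned above is to check for the packages up front and fail with a descriptive message. This is a rough sketch, not something the PR or the sandbox implements: `missing_packages` and `require_packages` are hypothetical helpers, and the check only handles top-level package names.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_packages(names: Iterable[str]) -> None:
    """Raise a descriptive error instead of failing later with a bare ImportError."""
    missing = missing_packages(names)
    if missing:
        raise RuntimeError(
            "Missing packages: "
            + ", ".join(missing)
            + ". They may have been skipped at image build time (e.g. arm64 or CI builds)."
        )


# "json" ships with Python; the second name is a deliberately nonexistent placeholder.
print(missing_packages(["json", "no_such_pkg_for_demo"]))
```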

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d3963d and fdba8c2.

📒 Files selected for processing (2)
  • dockerfiles/Dockerfile.sandbox (1 hunks)
  • nemo_skills/pipeline/generate.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
nemo_skills/pipeline/generate.py

254-258: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: pre-commit
  • GitHub Check: unit-tests
🔇 Additional comments (3)
nemo_skills/pipeline/generate.py (3)

60-60: LGTM!

The parameter addition properly supports environment override propagation through the function chain.


254-258: LGTM!

The CLI parameter is correctly defined. The static analysis hint about typer.Option in function signatures is a false positive—this is the standard pattern for typer CLI applications.
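
For context on the B008 hint: Python evaluates default-argument expressions once, when the function is defined, and that is the hazard the rule targets. Typer, as far as its documented usage goes, treats the `typer.Option(...)` default as declarative metadata rather than a per-call value, which is why the pattern is conventional there. A stdlib-only sketch of the underlying behavior follows; `make_default`, `flagged`, and `preferred` are hypothetical names chosen for illustration.

```python
def make_default() -> list:
    """Stand-in for any function call used as a default value."""
    return []


def flagged(items: list = make_default()) -> list:  # B008 would flag this default
    """The default list is created once and shared by every call."""
    items.append(1)
    return items


def preferred(items=None) -> list:
    """Perform the call inside the function so each invocation gets a fresh value."""
    if items is None:
        items = make_default()
    items.append(1)
    return items


first, second = flagged(), flagged()
print(first is second, second)   # True [1, 1]
print(preferred(), preferred())  # [1] [1]
```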


472-472: LGTM!

The sandbox environment overrides are correctly passed to the command group creation function.

Comment on lines +126 to +132
# Apply user-specified environment overrides for the sandbox
if sandbox_env_overrides:
    sandbox_env = metadata.get("environment", {})
    for override in sandbox_env_overrides:
        key, value = override.split("=", 1)
        sandbox_env[key] = value
    metadata["environment"] = sandbox_env

⚠️ Potential issue | 🔴 Critical

Add error handling for malformed environment overrides.

Line 130 will raise ValueError if an override string doesn't contain "=". This would crash the pipeline with an unclear error message.
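
The failure mode is easy to reproduce in isolation. A minimal sketch, using the variable name from the suggested error message as an example value:

```python
override = "NEMO_SKILLS_SANDBOX_BLOCK_NETWORK"  # missing "=VALUE"

try:
    # split() returns a single-element list here, so unpacking into two names fails.
    key, value = override.split("=", 1)
except ValueError as exc:
    message = str(exc)
    print(f"ValueError: {message}")
    # prints: ValueError: not enough values to unpack (expected 2, got 1)
```

The message says nothing about which override was malformed or what format was expected, which is the gap the validation below addresses.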

Apply this diff to add validation:

 # Apply user-specified environment overrides for the sandbox
 if sandbox_env_overrides:
     sandbox_env = metadata.get("environment", {})
     for override in sandbox_env_overrides:
+        if "=" not in override:
+            raise ValueError(
+                f"Invalid sandbox environment override format: '{override}'. "
+                f"Expected KEY=VALUE format (e.g., NEMO_SKILLS_SANDBOX_BLOCK_NETWORK=1)"
+            )
         key, value = override.split("=", 1)
         sandbox_env[key] = value
     metadata["environment"] = sandbox_env
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Apply user-specified environment overrides for the sandbox
-if sandbox_env_overrides:
-    sandbox_env = metadata.get("environment", {})
-    for override in sandbox_env_overrides:
-        key, value = override.split("=", 1)
-        sandbox_env[key] = value
-    metadata["environment"] = sandbox_env
+# Apply user-specified environment overrides for the sandbox
+if sandbox_env_overrides:
+    sandbox_env = metadata.get("environment", {})
+    for override in sandbox_env_overrides:
+        if "=" not in override:
+            raise ValueError(
+                f"Invalid sandbox environment override format: '{override}'. "
+                f"Expected KEY=VALUE format (e.g., NEMO_SKILLS_SANDBOX_BLOCK_NETWORK=1)"
+            )
+        key, value = override.split("=", 1)
+        sandbox_env[key] = value
+    metadata["environment"] = sandbox_env
🤖 Prompt for AI Agents
In nemo_skills/pipeline/generate.py around lines 126 to 132, the loop that
parses sandbox_env_overrides uses override.split("=", 1) which will raise a
ValueError for strings without "="; update the code to validate each override
before splitting (e.g., check if "=" in override), and handle malformed entries
by either raising a clear ValueError with the offending string or
skipping/logging them—implement input validation, provide a descriptive error
message mentioning the bad override and that expected format is KEY=VALUE, and
ensure metadata["environment"] is only updated for valid pairs.
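
The two options in the prompt above (raise, or skip and log) could be combined behind a flag. This is a sketch under stated assumptions rather than the merged code: `parse_env_overrides` and its `strict` parameter are hypothetical, and the empty-key check goes slightly beyond what the review asks for.

```python
import logging
from typing import Dict, Iterable

logger = logging.getLogger(__name__)


def parse_env_overrides(overrides: Iterable[str], strict: bool = True) -> Dict[str, str]:
    """Parse KEY=VALUE strings; raise on malformed entries, or skip them if not strict."""
    parsed: Dict[str, str] = {}
    for override in overrides:
        # partition() never raises; sep is "" when no "=" is present.
        key, sep, value = override.partition("=")
        if not sep or not key:
            msg = f"Invalid sandbox environment override: {override!r}. Expected KEY=VALUE."
            if strict:
                raise ValueError(msg)
            logger.warning("%s Skipping.", msg)
            continue
        parsed[key] = value
    return parsed


print(parse_env_overrides(["NEMO_SKILLS_SANDBOX_BLOCK_NETWORK=1", "EXTRA=a=b"]))
print(parse_env_overrides(["BROKEN", "OK=1"], strict=False))
```

Only valid pairs reach the returned mapping, so the caller can update `metadata["environment"]` from it without further checks.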

@gwarmstrong gwarmstrong merged commit c819ca4 into main Dec 12, 2025
5 checks passed
@gwarmstrong gwarmstrong deleted the georgea/enable-sandbox-env-overrides branch December 12, 2025 23:20
wasiahmad pushed a commit that referenced this pull request Dec 19, 2025
Signed-off-by: George Armstrong <georgea@nvidia.com>
wasiahmad pushed a commit that referenced this pull request Dec 19, 2025
Signed-off-by: George Armstrong <georgea@nvidia.com>

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
hsiehjackson pushed a commit that referenced this pull request Jan 13, 2026
Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
wasiahmad pushed a commit that referenced this pull request Feb 4, 2026
Signed-off-by: George Armstrong <georgea@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: George Armstrong <georgea@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: dgitman <dgitman@nvidia.com>