Sandbox add stem#1101

Merged
gwarmstrong merged 4 commits into main from sandbox-add-stem on Dec 12, 2025
jiacheng-xu (Collaborator) commented Dec 12, 2025

Remove open and ecdsa from requirements.

Summary by CodeRabbit

  • Chores
    • Updated sandbox environment to include STEM-related Python libraries for scientific computing, data analysis, and machine learning tasks.

Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
coderabbitai bot (Contributor) commented Dec 12, 2025

Walkthrough

Adds STEM-related library installation to the sandbox Dockerfile by copying a new requirements file, setting environment variables, and installing dependencies via pip. Introduces a new requirements/stem.txt file listing required Python packages including data science and machine learning libraries.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Dockerfile installation workflow: dockerfiles/Dockerfile.sandbox | Adds installation steps: copies requirements/stem.txt to /app/stem_requirements.txt, sets pip environment variables, installs uv and upgrades pip, then installs dependencies from the requirements file |
| STEM dependencies: requirements/stem.txt | New file containing Python package dependencies for STEM applications (e.g., arxiv, beautifulsoup4, numpy, pandas, scipy, transformers, torch, and additional libraries) |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify that all packages in requirements/stem.txt are required and compatible with each other
  • Confirm environment variable settings in Dockerfile (PIP_NO_BUILD_ISOLATION, PATH) are appropriate for the sandbox context
  • Check for any known package version conflicts or pinned versions if needed

Possibly related PRs

  • PR #1099 — Directly modifies the same files to add STEM dependency installation to the sandbox image; likely superseded by or related to this PR

Suggested reviewers

  • Kipok
  • gwarmstrong
  • ekmb

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Sandbox add stem' is concise and directly relates to the main changes: adding STEM-related libraries to the sandbox Dockerfile and their requirements file. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dockerfiles/Dockerfile.sandbox (1)

85-85: Use uv pip install for consistency with STEM requirements workflow.

Line 85 uses pip install gdown, but the new STEM installation workflow (line 80) uses uv pip install. This inconsistency could lead to:

  • Different dependency resolution between tools
  • Potential version conflicts if uv and pip resolve dependencies differently

Update line 85 to use the same tool:

-RUN mkdir -p /data && pip install gdown && \
+RUN mkdir -p /data && uv pip install gdown && \
🧹 Nitpick comments (2)
requirements/stem.txt (1)

1-199: Add version specifiers for stability and consistency.

Almost all packages lack version constraints. While line 195 correctly specifies wikipedia>=1.4.0, most other entries carry no version information, which can lead to:

  • Dependency conflicts between incompatible versions
  • Unexpected breakage when packages release major updates
  • Difficult reproducibility across environments

Consider adding version bounds (e.g., >=X.Y.Z, ~=X.Y) for key packages, or use a pinned version file for production reproducibility.
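
To make the suggestion above concrete, here is a minimal sketch of a check that lists requirement lines carrying no version specifier. The function name `unpinned_requirements` is hypothetical and not part of this PR. It treats everything after `#` as a comment, which is an assumption that would mis-handle URL requirements containing a fragment.

```python
import re

# Comparison operators that PEP 440 allows in a version specifier.
_SPECIFIER = re.compile(r"===|==|~=|!=|<=|>=|<|>")


def unpinned_requirements(lines):
    """Return (line_number, requirement) pairs that have no version specifier.

    Hypothetical helper for illustration only. Blank lines, comment lines,
    and pip option lines (starting with "-") are skipped.
    """
    unpinned = []
    for number, raw in enumerate(lines, start=1):
        # Assumption: "#" always starts a comment in this file.
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        if not _SPECIFIER.search(line):
            unpinned.append((number, line))
    return unpinned


if __name__ == "__main__":
    sample = ["numpy", "wikipedia>=1.4.0", "", "# plotting", "pandas  # dataframes"]
    for number, name in unpinned_requirements(sample):
        print(f"line {number}: {name} has no version specifier")
```

In practice the input could come from reading requirements/stem.txt line by line; the sample list here only stands in for that file.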

dockerfiles/Dockerfile.sandbox (1)

80-80: Add error handling for STEM requirements installation.

If the uv pip install -r /app/stem_requirements.txt command fails (line 80), the Docker build continues and produces an image with missing dependencies, leading to runtime failures that are difficult to debug. Consider failing fast on installation errors:

-RUN uv pip install -r /app/stem_requirements.txt
+RUN uv pip install -r /app/stem_requirements.txt || (echo "Failed to install STEM requirements" && exit 1)

Alternatively, use Docker's RUN implicit error handling by ensuring the last command in a compound statement is the critical one, or use set -e in shell scripts.
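
As a way to sanity-check how exit statuses propagate through the compound commands discussed above, the following sketch runs a few shell snippets and reports their exit codes. It assumes a POSIX `sh` is available on PATH; the helper name `exit_status` is hypothetical. Docker's shell-form RUN passes its command to `/bin/sh -c` by default on Linux images, so the same rules apply to the command string inside a RUN instruction.

```python
import subprocess


def exit_status(command: str) -> int:
    """Run `command` under `sh -c` and return its exit status.

    Hypothetical helper for illustration; assumes a POSIX sh on PATH.
    """
    completed = subprocess.run(["sh", "-c", command], capture_output=True)
    return completed.returncode


if __name__ == "__main__":
    snippets = [
        "false",                                      # a bare failing command
        "false || (echo 'install failed' && exit 1)",  # failing command with an explicit guard
        "false && true",                               # && chain stops at the first failure
        "false; true",                                 # ';' keeps only the last command's status
    ]
    for snippet in snippets:
        print(f"{snippet!r} -> exit status {exit_status(snippet)}")
```

Comparing the first two snippets shows whether the explicit guard changes the status the shell reports, and the last one shows the case where an earlier failure is masked by a later command.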

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 699dfaa and 35cf7d5.

📒 Files selected for processing (2)
  • dockerfiles/Dockerfile.sandbox (1 hunks)
  • requirements/stem.txt (1 hunks)

cssselect
cvxopt
cvxpy
data
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n requirements/stem.txt

Repository: NVIDIA-NeMo/Skills

Length of output: 3216


🏁 Script executed:

#!/bin/bash
# Check if packages exist on PyPI by attempting to get their metadata
# We'll use a simple approach: try to fetch package info

packages=("data" "grep" "HTMLParser" "typ" "utils")

for pkg in "${packages[@]}"; do
  echo "=== Checking package: $pkg ==="
  # Try to get package metadata using a web request (simulating pip search)
  curl -s "https://pypi.org/pypi/$pkg/json" 2>&1 | head -20
  echo ""
done

Repository: NVIDIA-NeMo/Skills

Length of output: 148


🏁 Script executed:

#!/bin/bash
# Check if HTMLParser is a built-in module and verify package suspicions

python3 <<'EOF'
import sys

# Check if HTMLParser is in standard library
print("=== Checking HTMLParser ===")
try:
    import HTMLParser
    print(f"HTMLParser import succeeded (Python {sys.version_info.major}.{sys.version_info.minor})")
except ImportError:
    print(f"HTMLParser not found as direct import")
    try:
        from html import parser
        print("Found in html.parser (standard library)")
    except ImportError:
        print("Not found in html.parser either")

# List known problematic packages
print("\n=== Package Name Analysis ===")
packages_to_check = {
    "data": "Extremely generic name - likely a placeholder",
    "grep": "Unix command-line tool, NOT a Python package",
    "HTMLParser": "Python standard library (html.parser module)",
    "typ": "Unclear - possibly typo for typing or typing_extensions",
    "utils": "Too generic - not a standard PyPI package"
}

for pkg, note in packages_to_check.items():
    print(f"{pkg}: {note}")

# Verify HTMLParser vs html.parser
print("\n=== Standard Library Check ===")
import html.parser
print("html.parser module confirmed in standard library")
EOF

Repository: NVIDIA-NeMo/Skills

Length of output: 549


Remove or fix invalid package entries.

Several package names are invalid and will cause installation failures:

  • Line 31: data — placeholder name, not a valid PyPI package
  • Line 58: grep — Unix command-line tool, not a Python package
  • Line 61: HTMLParser — Python standard library module (html.parser); must be removed
  • Line 189: typ — unclear/incomplete; likely a typo
  • Line 190: utils — too generic; not a valid PyPI package

Verify the intended packages and update accordingly, or remove these entries if they are not needed.
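
A minimal sketch of how the flagged names could be located in a requirements list follows. The function names `normalize` and `flag_suspects` are hypothetical, and the suspect set simply repeats the five names listed in this comment; it is not an authoritative list of invalid packages. Names are compared after PEP 503 normalization (lowercase, with runs of `-`, `_`, and `.` collapsed to `-`).

```python
import re

# Names flagged in the review comment above; treated here as an assumption,
# not as a verified list of invalid PyPI packages.
SUSPECT_NAMES = {"data", "grep", "HTMLParser", "typ", "utils"}

# Leading project-name portion of a requirement line.
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def flag_suspects(lines, suspects=SUSPECT_NAMES):
    """Return (line_number, name) pairs whose project name is in `suspects`."""
    wanted = {normalize(name) for name in suspects}
    flagged = []
    for number, raw in enumerate(lines, start=1):
        match = _NAME.match(raw.strip())
        if match and normalize(match.group(0)) in wanted:
            flagged.append((number, match.group(0)))
    return flagged


if __name__ == "__main__":
    sample = ["cssselect", "cvxopt", "cvxpy", "data"]
    for number, name in flag_suspects(sample):
        print(f"line {number}: {name} needs review")
```

The sample list reuses the four entries visible in the diff context above; a real run would read the full requirements/stem.txt instead.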

🤖 Prompt for AI Agents
In requirements/stem.txt around lines 31, 58, 61, 189–190: several invalid
placeholder/incorrect entries must be removed or replaced — remove the lone
"data" on line 31 or replace it with the actual package name intended; remove
"grep" on line 58 (it's not a PyPI package) or replace with the correct Python
package that provides the needed functionality; delete "HTMLParser" on line 61
(standard-library module) and, if an external package is required, add the
correct package name (e.g., html5lib or a backport) instead; fix or remove "typ"
on line 189 (likely a typo) and replace with the intended package name; remove
"utils" on line 190 or replace with the specific utility package intended. After
edits, run your dependency installation/validation (pip install -r or
pip-compile) to ensure no invalid packages remain.

Signed-off-by: George Armstrong <georgea@nvidia.com>
@gwarmstrong gwarmstrong merged commit 281c487 into main Dec 12, 2025
5 checks passed
@gwarmstrong gwarmstrong deleted the sandbox-add-stem branch December 12, 2025 00:57
wasiahmad pushed a commit that referenced this pull request Dec 12, 2025
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Jiacheng Xu <jiachengx@nvidia.com>
Co-authored-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
wasiahmad pushed a commit that referenced this pull request Dec 19, 2025
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Jiacheng Xu <jiachengx@nvidia.com>
Co-authored-by: George Armstrong <georgea@nvidia.com>
wasiahmad pushed a commit that referenced this pull request Dec 19, 2025
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Jiacheng Xu <jiachengx@nvidia.com>
Co-authored-by: George Armstrong <georgea@nvidia.com>

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
hsiehjackson pushed a commit that referenced this pull request Jan 13, 2026
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Jiacheng Xu <jiachengx@nvidia.com>
Co-authored-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
wasiahmad pushed a commit that referenced this pull request Feb 4, 2026
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Jiacheng Xu <jiachengx@nvidia.com>
Co-authored-by: George Armstrong <georgea@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Jiacheng Xu <jiachengx@nvidia.com>
Co-authored-by: George Armstrong <georgea@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: Jiacheng Xu <jiachengx@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Jiacheng Xu <jiachengx@nvidia.com>
Co-authored-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: dgitman <dgitman@nvidia.com>