feat: migrate sandbox from uwsgi to gunicorn #1232
gwarmstrong wants to merge 2 commits into NVIDIA-NeMo:main from
Conversation
… builds Add sandbox-core.lock and sandbox.lock for pinned dependency installation. Switch Dockerfile.sandbox from pip install to uv install from lockfiles. Add flask and gunicorn to code_execution.txt requirements. Signed-off-by: George Armstrong <georgea@nvidia.com>
Replace tiangolo/uwsgi-nginx-flask with python:3.10-slim and gunicorn. Add nginx as system package, remove all uwsgi configuration (UWSGI_PROCESSES, UWSGI_CHEAPER, UWSGI_CPU_AFFINITY, uwsgi_read_timeout). Add deprecation warnings for old uwsgi env vars in start-with-nginx.sh. Fix SIGINT handler in shell_worker for gunicorn compatibility. Update icpc/ioi dataset env vars to use GUNICORN_WORKERS. Signed-off-by: George Armstrong <georgea@nvidia.com>
9305cdf to b5a8316
📝 Walkthrough

This pull request migrates the sandbox infrastructure from uWSGI to Gunicorn: it updates the Docker build to use a base Python image with lock-file-based dependency management, refactors the startup scripts to launch Gunicorn workers instead of uWSGI, and updates related configuration files across dataset and test modules.
Sequence Diagram

sequenceDiagram
participant Builder as Docker Builder
participant Nginx as Nginx (Reverse Proxy)
participant Health as Health Checker
participant Worker as Gunicorn Worker
participant App as Application
Builder->>Nginx: Configure with upstream ports
Builder->>Worker: Start worker with pre-check
Note over Worker: port_is_free()<br/>checks availability
alt Port Available
Worker->>Worker: Bind to worker port
Worker->>App: Launch Gunicorn process
App->>App: Start listening on port
else Port Conflict
Worker->>Worker: Retry with offset
end
Health->>Worker: Health check request
Worker-->>Health: Response (Gunicorn endpoint)
Health->>Nginx: Register upstream
Nginx->>Worker: Route requests
Worker->>App: Forward to application
App-->>Worker: Response
Worker-->>Nginx: Return to client
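The "Health check request" step in the diagram can be sketched as a simple poll loop. This is an illustrative sketch, not the project's actual health checker; the endpoint path, host, and timeouts are assumptions:

```python
import time
import urllib.error
import urllib.request


def wait_for_worker(port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll a worker's HTTP endpoint until it responds or the timeout expires."""
    deadline = time.monotonic() + timeout
    url = f"http://127.0.0.1:{port}/"
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status < 500:
                    return True  # worker is up and serving
        except (urllib.error.URLError, OSError):
            pass  # not listening yet; retry
        time.sleep(interval)
    return False
```

A supervisor would call this once per worker port before registering the worker as an nginx upstream.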
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: 2 passed, 1 failed (1 warning)
```shell
        --backlog 100 \
        --access-logfile /var/log/worker${i}.log \
        --error-logfile /var/log/worker${i}.log \
        --log-level info \
```

Trailing backslash creates invalid shell syntax.

Proposed fix:

```diff
-        --log-level info \
+        --log-level info
```
```shell
            for retry in $(seq 0 $MAX_STARTUP_RETRIES); do
                if port_is_free $port; then
                    break
                fi
                CONFLICT_COUNT=$((CONFLICT_COUNT + 1))
                port=$((port + PORT_INCREMENT))
                echo "[$_H] Port conflict on worker $i, trying port $port"
            done
```

Port conflict retry exhaustion will spawn a worker on an occupied port. If all MAX_STARTUP_RETRIES iterations fail to find a free port, the loop exits anyway and line 434 spawns the worker with an occupied port, causing it to fail. Add a check after the loop:

```shell
if ! port_is_free $port; then
    echo "[$_H] ERROR: Worker $i could not find a free port after $MAX_STARTUP_RETRIES retries"
    cleanup
    exit 1
fi
```
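The port_is_free helper referenced in this loop can be approximated in Python by attempting to bind the candidate port. This is a hedged sketch; the actual script checks listening sockets with ss -tln, so behavior may differ in edge cases such as TIME_WAIT:

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to the given TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # SO_REUSEADDR lets the bind succeed for ports lingering in TIME_WAIT,
        # but it still fails for ports with an active listener.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True   # bind succeeded: the port was free
        except OSError:
            return False  # already in use
```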
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@dockerfiles/Dockerfile.sandbox`:
- Around line 34-36: Update the RUN apt-get install invocation to avoid
installing recommended packages and drop the unused net-tools package: modify
the RUN command that performs apt-get update && apt-get install -y nginx curl
git net-tools bzip2 build-essential libseccomp-dev && ... so it adds the
--no-install-recommends flag to apt-get install and removes net-tools from the
package list; keep the existing cleanup (rm -rf /var/lib/apt/lists/*
/var/cache/apt/archives/*) unchanged.
In `@dockerfiles/sandbox/start-with-nginx.sh`:
- Around line 134-154: The bare redirection in start_worker_fast (">
/var/log/worker${i}.log") should be replaced with a portable no-op truncation;
change that line to use the shell builtin no-op with redirection (e.g., ": >
/var/log/worker${i}.log") so the log file is truncated safely and ShellCheck
SC2188 is resolved while keeping the rest of start_worker_fast (gunicorn
invocation, redirects, and echo "$!:$WORKER_PORT") unchanged.
In `@nemo_skills/dataset/icpc/__init__.py`:
- Around line 25-29: The SANDBOX_ENV_VARS change replaced uWSGI's lazy-scaling
UWSGI_CHEAPER with a fixed "GUNICORN_WORKERS=1024", which forces 1024 processes
at startup; update the SANDBOX_ENV_VARS constant to avoid spawning all workers
immediately by either (a) restoring a uWSGI lazy-scaling env var (e.g. set/keep
UWSGI_CHEAPER instead of GUNICORN_WORKERS) or (b) switch Gunicorn to a
threaded/greenlet worker model and lower process count: replace
"GUNICORN_WORKERS=1024" with a conservative process count (e.g. NUM_WORKERS
small), add "GUNICORN_WORKER_CLASS=gthread" or "GUNICORN_WORKER_CLASS=gevent"
and add a "GUNICORN_THREADS=<desired_thread_count>" env var to maintain
concurrency without pre-forking 1024 processes; apply the same change in
nemo_skills/dataset/ioi/__init__.py and ensure code is compatible with the
chosen worker_class before merging.
In `@tests/test_sandbox_network_blocking.py`:
- Around line 41-42: The test currently calls get_free_port twice with
strategy="random" for variables port and worker_base_port which can collide;
call get_free_port the second time with exclude containing the first port (use
get_free_port(..., exclude=[port])) so worker_base_port is chosen distinct from
port (refer to variables port and worker_base_port and the get_free_port
function's exclude parameter).
🧹 Nitpick comments (2)

requirements/code_execution.txt (1)

16-19: Infrastructure dependencies mixed with LLM-facing packages. The file header (line 16) says these are "requirements for code execution - put all packages that LLM should have access to in here," but flask and gunicorn are infrastructure/server dependencies, not packages the LLM needs. Consider either updating the comment to reflect the dual purpose, or splitting server dependencies into a separate requirements file so the lock-file compilation source is clearer.

dockerfiles/sandbox/start-with-nginx.sh (1)

421-437: Unused loop variable retry (ShellCheck SC2034). The variable retry on line 425 is only used to control the iteration count and is never referenced in the loop body. This is harmless but flagged by ShellCheck. You can suppress it with a leading-underscore convention or just use _.

Proposed fix:

```diff
-    for retry in $(seq 0 $MAX_STARTUP_RETRIES); do
+    for _retry in $(seq 0 $MAX_STARTUP_RETRIES); do
```
```diff
 RUN apt-get update && \
-    apt-get install -y curl git net-tools bzip2 build-essential libseccomp-dev && \
-    ARCH="${TARGETARCH:-$(dpkg --print-architecture)}" && \
+    apt-get install -y nginx curl git net-tools bzip2 build-essential libseccomp-dev && \
     rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if net-tools or its commands (netstat, ifconfig, arp, route) are referenced anywhere
echo "=== Searching for net-tools commands usage ==="
rg -i '\b(netstat|ifconfig|arp|route)\b' --type=sh --type=py --type=bash 2>/dev/null || echo "No matches found"
echo ""
echo "=== Searching for net-tools package reference ==="
rg 'net-tools' 2>/dev/null || echo "No matches found"
echo ""
echo "=== Checking the dockerfile for net-tools and checking startup script ==="
cd dockerfiles && ls -la | head -20

Repository: NVIDIA-NeMo/Skills
Length of output: 1140
🏁 Script executed:
#!/bin/bash
# Read the Dockerfile to see the full context and startup script reference
echo "=== Reading Dockerfile.sandbox ==="
cat dockerfiles/Dockerfile.sandbox | head -160
echo ""
echo "=== Looking for the start-with-nginx.sh script ==="
find . -name "start-with-nginx.sh" -o -name "start.sh" | head -10

Repository: NVIDIA-NeMo/Skills
Length of output: 7511
🏁 Script executed:
#!/bin/bash
# Check what's in the startup scripts to see if net-tools commands are used
echo "=== Searching for startup script files ==="
find . -type f -name "*start*" | grep -E "\.(sh|py)$" | head -20

Repository: NVIDIA-NeMo/Skills
Length of output: 340
🏁 Script executed:
#!/bin/bash
# Check the startup script for command usage
echo "=== Checking start-with-nginx.sh for command usage ==="
cat dockerfiles/sandbox/start-with-nginx.sh

Repository: NVIDIA-NeMo/Skills
Length of output: 25914
🏁 Script executed:
#!/bin/bash
# Verify iproute2 is included in python:3.10-slim by checking Debian package info
echo "=== Checking if ss (from iproute2) is typically available in Debian slim images ==="
echo "iproute2 is part of the essential packages in Debian and is included in slim images"
echo "Python:3.10-slim is based on Debian 12 (bookworm), which includes iproute2 by default"
echo ""
echo "=== Verification Summary ==="
echo "1. ss command usage found: YES (port_is_free function uses 'ss -tln')"
echo "2. net-tools commands found: NO (no netstat, ifconfig, arp, route usage)"
echo "3. net-tools package reference: ONLY in the apt-get install line (line 35)"
echo "4. iproute2 in slim images: YES (included by default in Debian slim)"

Repository: NVIDIA-NeMo/Skills
Length of output: 632
Add --no-install-recommends and remove unused net-tools package.
The apt-get install on line 35 pulls in unnecessary recommended packages (Trivy DS-0029) and includes net-tools, which is not used. The scripts use ss (from iproute2, pre-installed in python:3.10-slim) rather than tools from net-tools.
Proposed fix:

```diff
 RUN apt-get update && \
-    apt-get install -y nginx curl git net-tools bzip2 build-essential libseccomp-dev && \
+    apt-get install -y --no-install-recommends nginx curl git bzip2 build-essential libseccomp-dev && \
     rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*
```
🧰 Tools
🪛 Trivy (0.69.1)
[error] 34-36: 'apt-get' missing '--no-install-recommends'
'--no-install-recommends' flag is missed: 'apt-get update && apt-get install -y nginx curl git net-tools bzip2 build-essential libseccomp-dev && rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*'
Rule: DS-0029
(IaC/Dockerfile)
🤖 Prompt for AI Agents
In `@dockerfiles/Dockerfile.sandbox` around lines 34 - 36, Update the RUN apt-get
install invocation to avoid installing recommended packages and drop the unused
net-tools package: modify the RUN command that performs apt-get update &&
apt-get install -y nginx curl git net-tools bzip2 build-essential libseccomp-dev
&& ... so it adds the --no-install-recommends flag to apt-get install and
removes net-tools from the package list; keep the existing cleanup (rm -rf
/var/lib/apt/lists/* /var/cache/apt/archives/*) unchanged.
```diff
 start_worker_fast() {
     local i=$1
     local WORKER_PORT=$2

-    cat > /tmp/worker${i}_uwsgi.ini << EOF
-[uwsgi]
-module = main
-callable = app
-processes = ${UWSGI_PROCESSES}
-http-socket = 0.0.0.0:${WORKER_PORT}
-vacuum = true
-master = true
-die-on-term = true
-memory-report = true
-listen = 100
-http-timeout = 300
-socket-timeout = 300
-disable-logging = false
-log-date = true
-log-prefix = [worker${i}]
-logto = /var/log/worker${i}.log
-EOF
-
-    if [ -n "$UWSGI_CHEAPER" ]; then
-        echo "cheaper = ${UWSGI_CHEAPER}" >> /tmp/worker${i}_uwsgi.ini
-    fi
-
     > /var/log/worker${i}.log
-    ( cd /app && env WORKER_NUM=$i uwsgi --ini /tmp/worker${i}_uwsgi.ini ) &
+    # Redirect stdout/stderr to the log file so the $() subshell that calls
+    # this function doesn't block waiting for the pipe to close.
+    # (uWSGI's --logto closed stdout implicitly; gunicorn does not.)
+    ( cd /app && env WORKER_NUM=$i \
+        gunicorn main:app \
+        --workers ${GUNICORN_WORKERS} \
+        --bind 0.0.0.0:${WORKER_PORT} \
+        --timeout 300 \
+        --graceful-timeout 30 \
+        --backlog 100 \
+        --access-logfile /var/log/worker${i}.log \
+        --error-logfile /var/log/worker${i}.log \
+        --log-level info \
+    ) >> /var/log/worker${i}.log 2>&1 &
     echo "$!:$WORKER_PORT"
 }
```
ShellCheck SC2188: redirection without a command on Line 138.
> /var/log/worker${i}.log is a bare redirection. While it works in bash for truncating, it's flagged by ShellCheck and not portable. Use : > (the no-op builtin) instead.
Proposed fix:

```diff
-    > /var/log/worker${i}.log
+    : > /var/log/worker${i}.log
```
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 138-138: This redirection doesn't have a command. Move to its command (or use 'true' as no-op).
(SC2188)
🤖 Prompt for AI Agents
In `@dockerfiles/sandbox/start-with-nginx.sh` around lines 134 - 154, The bare
redirection in start_worker_fast ("> /var/log/worker${i}.log") should be
replaced with a portable no-op truncation; change that line to use the shell
builtin no-op with redirection (e.g., ": > /var/log/worker${i}.log") so the log
file is truncated safely and ShellCheck SC2188 is resolved while keeping the
rest of start_worker_fast (gunicorn invocation, redirects, and echo
"$!:$WORKER_PORT") unchanged.
```diff
 SANDBOX_ENV_VARS = [
-    "UWSGI_PROCESSES=1024",
-    "UWSGI_CPU_AFFINITY=8",
-    "UWSGI_CHEAPER=1023",
+    "GUNICORN_WORKERS=1024",
     "NUM_WORKERS=1",
     "STATEFUL_SANDBOX=0",
 ]
```
🧩 Analysis chain
🏁 Script executed:
# Check the current code in icpc/__init__.py
cat -n nemo_skills/dataset/icpc/__init__.py | head -40

Repository: NVIDIA-NeMo/Skills
Length of output: 1339
🏁 Script executed:
# Check the equivalent file in ioi/__init__.py
cat -n nemo_skills/dataset/ioi/__init__.py | head -40

Repository: NVIDIA-NeMo/Skills
Length of output: 1208
🌐 Web query:
Gunicorn lazy worker spawning vs immediate spawn all workers at startup
💡 Result:
Gunicorn does not have a “lazy / on-demand worker spawning” mode in the sense of only creating workers when traffic arrives. Gunicorn uses a pre-fork worker model: a master (arbiter) process starts and manages a fixed pool of worker processes, and those workers handle requests. This model is explicitly “pre-fork” (workers exist before requests are handled). [1]
What can vary: when the application code is loaded
Gunicorn gives you two practical “startup behaviors,” but both still spawn the workers at startup:
- Default (no --preload): Gunicorn starts the master, spawns the configured workers, and each worker loads/imports your app itself (so app code is not shared between workers). Benefit: easier code reload behavior (restart workers to pick up code changes). [2]
- Preload mode (--preload / preload_app=True): Gunicorn loads the application code in the master first, then forks workers so they can share read-only memory pages via copy-on-write (often lower RAM and faster boot). [2]

"Spawn more workers later" (not traffic-driven)

Gunicorn can change the worker count at runtime via signals:

- TTIN increases the worker count by 1
- TTOU decreases the worker count by 1 [3]

But this is an operator-driven scaling mechanism, not automatic "lazy spawn on demand."
Sources: Gunicorn design (pre-fork model) [1], preload_app/--preload setting docs [2], Gunicorn signal handling (TTIN/TTOU) [3].
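The TTIN/TTOU mechanism described above amounts to sending standard Unix signals to the Gunicorn master process. A minimal sketch, assuming you already know the master's PID (the helper name is hypothetical):

```python
import os
import signal


def scale_workers(master_pid: int, delta: int) -> None:
    """Ask a Gunicorn master to add (delta > 0) or remove (delta < 0) workers.

    Gunicorn documents SIGTTIN as "increment worker count by one" and SIGTTOU
    as "decrement worker count by one", so we send one signal per worker.
    """
    sig = signal.SIGTTIN if delta > 0 else signal.SIGTTOU
    for _ in range(abs(delta)):
        os.kill(master_pid, sig)
```

An external autoscaler could watch queue depth or CPU load and call this, but as the result above notes, Gunicorn itself never does this automatically.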
🌐 Web query:
Gunicorn worker class alternatives to sync worker for lazy spawning
💡 Result:
Gunicorn doesn’t “lazy spawn” workers on demand—its core architecture is a pre-fork worker model where the master (arbiter) maintains a worker pool at the configured size. (gunicorn.org)
So there isn’t a worker class that changes that fundamental behavior; the practical alternative is to reduce process count and get concurrency inside each worker.
Worker-class alternatives to sync (to avoid needing many pre-forked processes)
- gthread (threaded workers): run fewer worker processes and add a thread pool per worker (--threads). Also supports keep-alive (unlike sync). (docs.gunicorn.org)
  Example: gunicorn myapp:app -k gthread --workers 2 --threads 8
- gevent (greenlets / async I/O): good for I/O-bound workloads and many concurrent connections; uses --worker-connections. (gunicorn.org)
  Example: gunicorn myapp:app -k gevent --workers 2 --worker-connections 1000
- asgi (native Gunicorn ASGI worker, Gunicorn ≥ 24.0.0, beta): for async frameworks; provides keep-alive + WebSockets. (gunicorn.org)
  Example: gunicorn myasgi:app --worker-class asgi --workers 2
- eventlet: listed but deprecated in current Gunicorn docs; prefer gevent. (gunicorn.org)
- tornado: primarily for Tornado apps (WSGI under it isn't generally recommended). (docs.gunicorn.org)
If your goal is “spin up capacity only when load arrives”
Gunicorn itself won’t do that automatically, but you can change worker count at runtime via signals (e.g., add workers with TTIN, remove with TTOU). (gunicorn.org)
If you tell me whether you’re serving WSGI vs ASGI, and whether the bottleneck is CPU-bound vs I/O-bound, I can suggest a concrete configuration (workers/threads/worker-connections/timeouts) that best approximates “lazy capacity” without over-forking.
Citations:
- 1: https://gunicorn.org/design/?utm_source=openai
- 2: https://docs.gunicorn.org/en/20.0.4/settings.html?utm_source=openai
- 3: https://gunicorn.org/design/?utm_source=openai
- 4: https://gunicorn.org/asgi/?utm_source=openai
- 5: https://gunicorn.org/design/?utm_source=openai
- 6: https://docs.gunicorn.org/en/21.1.0/design.html?utm_source=openai
- 7: https://gunicorn.org/design/?utm_source=openai
Loss of lazy worker spawning (uWSGI cheaper subsystem).
The configuration changed from uWSGI's UWSGI_CHEAPER=1023 (which spawned 1 worker and scaled to 1024 on demand) to Gunicorn's --workers 1024 (all 1024 workers spawned immediately at startup). Gunicorn does not support lazy/on-demand worker spawning; its pre-fork model always spawns the full worker pool at boot, causing significantly higher startup memory pressure for ICPC/IOI benchmarks.
The --preload flag reduces per-worker memory via copy-on-write but does not delay worker spawning. Alternative worker classes (gthread, gevent) reduce process count by adding threading/greenlet concurrency per worker, but require ensuring the code is compatible. Consider whether the current configuration is acceptable or if an alternative worker class would be appropriate.
This same issue applies to nemo_skills/dataset/ioi/__init__.py.
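As one concrete illustration of the threaded alternative (option b), a gunicorn.conf.py along these lines would keep per-host concurrency while pre-forking far fewer processes. The worker and thread counts below are placeholder assumptions, not tuned values for the ICPC/IOI workloads:

```python
# gunicorn.conf.py -- hypothetical sketch of a threaded-worker configuration
# that keeps concurrency high without pre-forking 1024 processes.
import os

# A small, fixed pool of processes instead of GUNICORN_WORKERS=1024.
workers = int(os.environ.get("GUNICORN_WORKERS", "8"))
# gthread gives each process a thread pool; total concurrency = workers * threads.
worker_class = os.environ.get("GUNICORN_WORKER_CLASS", "gthread")
threads = int(os.environ.get("GUNICORN_THREADS", "32"))
# Timeouts mirror the flags currently passed on the command line.
timeout = 300
graceful_timeout = 30
backlog = 100
```

Gunicorn loads this file via `gunicorn -c gunicorn.conf.py main:app`; whether gthread is safe here depends on the sandbox app being thread-safe, which would need verification before merging.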
🤖 Prompt for AI Agents
In `@nemo_skills/dataset/icpc/__init__.py` around lines 25 - 29, The
SANDBOX_ENV_VARS change replaced uWSGI's lazy-scaling UWSGI_CHEAPER with a fixed
"GUNICORN_WORKERS=1024", which forces 1024 processes at startup; update the
SANDBOX_ENV_VARS constant to avoid spawning all workers immediately by either
(a) restoring a uWSGI lazy-scaling env var (e.g. set/keep UWSGI_CHEAPER instead
of GUNICORN_WORKERS) or (b) switch Gunicorn to a threaded/greenlet worker model
and lower process count: replace "GUNICORN_WORKERS=1024" with a conservative
process count (e.g. NUM_WORKERS small), add "GUNICORN_WORKER_CLASS=gthread" or
"GUNICORN_WORKER_CLASS=gevent" and add a
"GUNICORN_THREADS=<desired_thread_count>" env var to maintain concurrency
without pre-forking 1024 processes; apply the same change in
nemo_skills/dataset/ioi/__init__.py and ensure code is compatible with the
chosen worker_class before merging.
```python
port = get_free_port(strategy="random")
worker_base_port = get_free_port(strategy="random")
```
Minor: potential port collision between port and worker_base_port.
Both calls use strategy="random" independently. get_free_port already supports an exclude parameter to avoid this.
Proposed fix:

```diff
 port = get_free_port(strategy="random")
-worker_base_port = get_free_port(strategy="random")
+worker_base_port = get_free_port(exclude=[port], strategy="random")
```
🤖 Prompt for AI Agents
In `@tests/test_sandbox_network_blocking.py` around lines 41 - 42, The test
currently calls get_free_port twice with strategy="random" for variables port
and worker_base_port which can collide; call get_free_port the second time with
exclude containing the first port (use get_free_port(..., exclude=[port])) so
worker_base_port is chosen distinct from port (refer to variables port and
worker_base_port and the get_free_port function's exclude parameter).
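To illustrate the intent of the exclude parameter, here is a simplified stand-in for get_free_port. It is a sketch under the assumption that the real helper behaves similarly; the project's actual implementation and its strategy parameter may differ:

```python
import socket
from typing import Optional


def get_free_port(strategy: str = "random", exclude: Optional[list] = None) -> int:
    """Ask the OS for a free TCP port, skipping any ports listed in `exclude`."""
    excluded = set(exclude or [])
    for _ in range(100):  # bounded retries in case the OS returns an excluded port
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
            port = s.getsockname()[1]
        if port not in excluded:
            return port
    raise RuntimeError("could not find a free port outside the exclude set")
```

With exclude=[port] for the second call, the two ports in the test are guaranteed distinct by construction rather than by chance.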
Recreating from origin branch to fix CI auth
Summary

- Replace tiangolo/uwsgi-nginx-flask base image with python:3.10-slim + nginx + gunicorn
- Remove uwsgi configuration from start-with-nginx.sh
- Fix SIGINT handler in shell_worker for gunicorn compatibility
- Update icpc/ioi dataset env vars to use GUNICORN_WORKERS
- sys.executable usage

Depends on #1231 (lockfiles). Split from #1226.