fix(studio): harden sandbox security for terminal and python tools#4827
Conversation
The existing command blocklist used naive str.split() which is trivially bypassable via quoting, full paths, nested shells, variable expansion, and cross-tool pivoting through Python os.system/subprocess. Fixes #4818. Changes: - Replace str.split() blocklist with shlex.split() + os.path.basename() tokenization and regex scanning at shell command boundaries - Add sanitized subprocess environment (_build_safe_env) that strips credentials (HF_TOKEN, WANDB_API_KEY, GH_TOKEN, AWS_*, etc.) and restricts PATH to /usr/local/bin:/usr/bin:/bin - Add PR_SET_NO_NEW_PRIVS via prctl on Linux so sudo/su/pkexec fail at the kernel level regardless of how they are invoked - Add RLIMIT_NPROC (256) and RLIMIT_FSIZE (100MB) to prevent fork bombs and disk filling attacks - Extend AST safety checker to detect os.system(), os.popen(), subprocess.run/Popen/call/check_output, os.exec*, os.spawn* calls containing blocked commands or dynamic (non-literal) arguments - Add cross-platform support: cmd.exe on Windows, bash on Unix; CREATE_NO_WINDOW flag on Windows, preexec_fn on Unix - Expand blocklist from 7 to 14 commands: add su, chown, passwd, mount, umount, fdisk, kill, killall, pkill - Apply all layers to both _bash_exec and _python_exec Zero measurable performance overhead -- shlex parsing and a single prctl syscall per subprocess fork.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Code Review
This pull request significantly strengthens the security of the code execution sandbox by implementing robust command filtering via shlex and regex, stripping sensitive environment variables, and applying kernel-level privilege restrictions and resource limits. It also introduces AST-based detection of dangerous os and subprocess functions to prevent shell escapes. Feedback suggests expanding the blocked function list to include os.posix_spawn variants and using dynamic library discovery for libc to improve portability across different Linux distributions.
I am having trouble creating individual review comments. Click here to see my feedback.
studio/backend/core/inference/tools.py (553-562)
The _SHELL_EXEC_FUNCS list is missing os.posix_spawn and os.posix_spawnp, which can also be used to execute external commands and should be included in the security blocklist for consistency.
studio/backend/core/inference/tools.py (115)
Hardcoding the path to 'libc.so.6' is fragile and may fail on systems where libc is named differently (e.g., musl-based systems like Alpine Linux). Consider using ctypes.util.find_library('c') to locate the standard C library dynamically.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: df7d0a5746
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| item.get("description", "") for item in info.get("shell_escapes", []) | ||
| ] | ||
| all_reasons = [r for r in reasons + shell_reasons if r] | ||
| if all_reasons: | ||
| return ( |
There was a problem hiding this comment.
Include exception-catching findings in unsafe code rejection
_check_signal_escape_patterns() still marks loop-level except Exception patterns as unsafe, but _check_code_safety() now builds all_reasons from only signal_tampering and shell_escapes. When only exception_catching is present, all_reasons is empty and the function returns None, so code that was previously rejected is now allowed (e.g., infinite-loop patterns guarded by broad exception handlers).
Useful? React with 👍 / 👎.
| if isinstance(func, ast.Attribute): | ||
| if isinstance(func.value, ast.Name): | ||
| if func.value.id in self.os_aliases: | ||
| shell_func = f"os.{func.attr}" | ||
| elif func.value.id in self.subprocess_aliases: |
There was a problem hiding this comment.
Handle subprocess functions imported with from-import
Shell-escape detection only resolves calls where node.func is an attribute (subprocess.run), so from subprocess import run followed by run([...]) is not checked at all. This bypasses the new Python-side blocked-command filter and still allows dangerous commands through bare-name calls.
Useful? React with 👍 / 👎.
| for s in strs: | ||
| base = os.path.basename(s).lower() | ||
| if base in _BLOCKED_COMMANDS: | ||
| found.add(base) |
There was a problem hiding this comment.
Tokenize list elements before blocked-command matching
In _check_args_for_blocked, list arguments are validated by os.path.basename() equality only, so strings like "rm -rf x" inside a list are not parsed and are treated as safe. A call such as subprocess.run(["bash", "-c", "rm -rf x"]) therefore evades detection even though it executes a blocked command via a nested shell.
Useful? React with 👍 / 👎.
…rocess substitution - Include exception_catching reasons in _check_code_safety so bare except-in-loop timeout evasion is actually blocked (was computed in _check_signal_escape_patterns but never read by the caller) - Remove base.split() inner loop that caused false positives on quoted text arguments containing blocked words (e.g. echo "kill this process") - Add targeted nested shell detection for bash/sh/zsh -c arguments instead, which catches bash -c 'sudo whoami' without false positives - Add <() process substitution to the regex character class so diff <(rm -rf /path) is also caught - Fix error message to say "unsafe patterns" instead of specifically mentioning signal manipulation when other categories trigger
for more information, see https://pre-commit.ci
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request significantly enhances the security of the code execution sandbox by introducing robust blocked command detection using shlex and regex, environment variable stripping, and resource limits via pre-execution hooks. It also extends the AST-based safety checker to flag dangerous os and subprocess calls. Feedback focuses on addressing critical bypass vectors, including the need to inspect keyword arguments in the AST visitor, ensuring recursive literal checks for list elements, and updating regex patterns to account for absolute or relative paths to blocked binaries.
| blocked_in_args = _check_args_for_blocked(node.args) | ||
| if blocked_in_args: | ||
| shell_escapes.append( | ||
| { | ||
| "type": "shell_escape", | ||
| "line": node.lineno, | ||
| "description": ( | ||
| f"{shell_func}() invokes blocked command(s): " | ||
| f"{', '.join(sorted(blocked_in_args))}" | ||
| ), | ||
| } | ||
| ) | ||
| else: | ||
| has_non_literal = any( | ||
| _extract_string_from_node(a) is None | ||
| and not isinstance(a, ast.List) | ||
| for a in node.args | ||
| ) |
There was a problem hiding this comment.
The safety check for dynamic arguments and blocked commands in os/subprocess calls has two significant bypass vectors: 1. Keyword Argument Bypass: The check only inspects positional arguments (node.args), ignoring keyword arguments. 2. Dynamic List Bypass: The has_non_literal check explicitly ignores ast.List nodes, but does not verify that the contents of the list are literals.
if shell_func and shell_func in _SHELL_EXEC_FUNCS:
all_args = list(node.args) + [kw.value for kw in node.keywords]
blocked_in_args = _check_args_for_blocked(all_args)
if blocked_in_args:
shell_escapes.append(
{
"type": "shell_escape",
"line": node.lineno,
"description": (
f"{shell_func}() invokes blocked command(s): "
f"{', '.join(sorted(blocked_in_args))}"
),
}
)
else:
def _is_literal(n):
if _extract_string_from_node(n) is not None:
return True
if isinstance(n, ast.List):
return all(_extract_string_from_node(e) is not None for e in n.elts)
return False
has_non_literal = any(not _is_literal(a) for a in all_args)
if has_non_literal:
shell_escapes.append(
{
"type": "shell_escape_dynamic",
"line": node.lineno,
"description": (
f"{shell_func}() called with non-literal argument "
f"(potential shell escape)"
),
}
)| # (semicolons, pipes, &&, ||, backticks, $(), <(), newlines) | ||
| lowered = command.lower() | ||
| for word in _BLOCKED_COMMANDS: | ||
| pattern = rf"(?:^|[;&|`\n]\s*|[$]\(\s*|<\(\s*){re.escape(word)}\b" |
There was a problem hiding this comment.
The regex pattern for detecting blocked commands at shell boundaries does not account for absolute or relative paths (e.g., ls; /usr/bin/sudo). This allows users to bypass the blocklist by specifying the full path to a binary when using shell metacharacters.
| pattern = rf"(?:^|[;&|`\n]\s*|[$]\(\s*|<\(\s*){re.escape(word)}\b" | |
| pattern = rf"(?:^|[;&|`\n]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*[/\\])?{re.escape(word)}\b" |
| for word in tokens[j].lower().split(): | ||
| if word in _BLOCKED_COMMANDS: | ||
| blocked.add(word) |
There was a problem hiding this comment.
The nested shell detection logic uses a simple split() and direct membership test, which fails to catch blocked commands invoked via paths (e.g., bash -c "/bin/rm -rf /"). It should use os.path.basename to normalize the command name before checking against the blocklist.
| for word in tokens[j].lower().split(): | |
| if word in _BLOCKED_COMMANDS: | |
| blocked.add(word) | |
| for word in tokens[j].lower().split(): | |
| base = os.path.basename(word) | |
| if base in _BLOCKED_COMMANDS: | |
| blocked.add(base) |
| base = os.path.basename(s).lower() | ||
| if base in _BLOCKED_COMMANDS: | ||
| found.add(base) |
There was a problem hiding this comment.
The logic for checking list elements in subprocess calls is less robust than the one used for string arguments. It only checks the basename of each element, missing blocked commands embedded in shell-like strings (e.g., subprocess.run(["bash", "-c", "sudo whoami"])). Using _find_blocked_commands for list elements would provide consistent protection across both positional and list-based tool invocations.
| base = os.path.basename(s).lower() | |
| if base in _BLOCKED_COMMANDS: | |
| found.add(base) | |
| found |= _find_blocked_commands(s) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 06531267b4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| shell_func = f"subprocess.{func.attr}" | ||
|
|
||
| if shell_func and shell_func in _SHELL_EXEC_FUNCS: | ||
| blocked_in_args = _check_args_for_blocked(node.args) |
There was a problem hiding this comment.
Inspect keyword arguments for subprocess command strings
The shell-escape checker only analyzes node.args, so subprocess calls that provide the command via keyword arguments bypass safety checks entirely. In this path, subprocess.run(args="rm -rf /tmp/x", shell=True) (or args=cmd) produces no blocked_in_args and no dynamic-arg warning, allowing blocked commands through _check_code_safety in the Python tool.
Useful? React with 👍 / 👎.
| has_non_literal = any( | ||
| _extract_string_from_node(a) is None | ||
| and not isinstance(a, ast.List) | ||
| for a in node.args |
There was a problem hiding this comment.
Flag non-literal list elements as dynamic command input
The dynamic-argument check explicitly excludes list arguments, but list inspection only reads constant string elements. As a result, a call like cmd = "rm"; subprocess.run([cmd, "-rf", target]) is treated as safe: the variable command element is neither token-checked nor marked dynamic, so blocked executables can still be launched through the Python sandbox.
Useful? React with 👍 / 👎.
…ning - Regex now matches blocked commands after optional path prefix at shell boundaries (catches ls; /usr/bin/sudo and similar) - Nested shell detection uses os.path.basename so bash -c "/bin/rm" is caught - AST checker now inspects keyword arguments (not just positional) so subprocess.run(args="sudo ...", shell=True) is detected - List elements in subprocess calls are now checked via _find_blocked_commands for consistency (catches subprocess.run(["bash", "-c", "rm -rf /"])) - Dynamic argument check uses _is_safe_literal that validates list contents are all string literals
for more information, see https://pre-commit.ci
|
/gemini review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 32c7d34d3c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| shell_func = f"subprocess.{func.attr}" | ||
|
|
||
| if shell_func and shell_func in _SHELL_EXEC_FUNCS: | ||
| all_call_args = list(node.args) + [kw.value for kw in node.keywords] |
There was a problem hiding this comment.
Exclude non-command keyword values from dynamic checks
The shell-escape analyzer appends every keyword value to all_call_args, so control flags (for example check=True, text=True, capture_output=True) are treated as command inputs. Those boolean nodes fail _is_safe_literal, which makes has_non_literal true and rejects otherwise safe literal calls like subprocess.run(['echo', 'hi'], check=True) as shell_escape_dynamic. This is a functional regression that blocks common subprocess usage in the Python tool.
Useful? React with 👍 / 👎.
| base = os.path.basename(token).lower() | ||
| if base in _BLOCKED_COMMANDS: |
There was a problem hiding this comment.
Match blocked commands only at executable positions
_find_blocked_commands currently checks os.path.basename(token) for every token, not just command positions, so benign arguments whose basename matches a blocked command are denied. For example, commands like echo /tmp/rm or cat /data/kill are flagged even though they do not invoke rm/kill. Because this helper is reused across terminal and Python checks, it creates broad false positives and unnecessary command rejection.
Useful? React with 👍 / 👎.
| elif node.module == "os": | ||
| self.os_aliases.add("os") |
There was a problem hiding this comment.
Track imported os callables in ImportFrom analysis
The ImportFrom handler for os only records the module name and does not capture callable aliases imported directly. Since visit_Call shell detection only handles attribute calls, code like from os import system followed by system('sudo whoami') is not recognized as a shell escape and passes _check_code_safety. This leaves a straightforward bypass for blocked-command execution via direct os imports.
Useful? React with 👍 / 👎.
bash -c 'script' arg0 arg1 -- only tokens[i+1] is the script body; subsequent tokens are $0, $1 positional parameters passed to the script and are not executed as shell commands. Scanning all remaining tokens caused false positives.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request enhances the security of the sandbox environment by implementing more robust command-injection detection and restricting subprocess execution. Key changes include the introduction of a comprehensive blocklist for dangerous commands, improved shell-escape detection using shlex and regex, and the application of stricter environment and privilege-dropping configurations for subprocesses. My review highlights several bypass opportunities: the Windows-specific tokenization fails to strip quotes, nested shell command detection is currently unreliable, the AST checker misses direct imports of dangerous functions, and the regex pattern for shell boundaries is incomplete. I recommend addressing these security gaps to ensure the sandbox remains effective.
| tokens = ( | ||
| shlex.split(command) | ||
| if sys.platform != "win32" | ||
| else shlex.split(command, posix = False) |
There was a problem hiding this comment.
On Windows, shlex.split(posix=False) preserves quotes around tokens. For example, "sudo" becomes ['"sudo"']. This causes the subsequent os.path.basename(token).lower() in _BLOCKED_COMMANDS check to fail because "sudo" is not in the blocklist. You should strip quotes from the token before checking it against the blocklist.
| else shlex.split(command, posix = False) | |
| else shlex.split(command, posix = False) | |
| ) | |
| except ValueError: | |
| tokens = command.split() | |
| for token in tokens: | |
| # Strip quotes for Windows (posix=False) and malformed POSIX tokens | |
| token = token.strip("\"'") | |
| base = os.path.basename(token).lower() |
| if i + 1 < len(tokens): | ||
| for word in tokens[i + 1].lower().split(): | ||
| base = os.path.basename(word) | ||
| if base in _BLOCKED_COMMANDS: |
There was a problem hiding this comment.
Using split() on the nested shell command string is unreliable as it doesn't handle shell metacharacters or complex quoting. For example, bash -c "sudo;ls" would result in ['sudo;ls'], which bypasses the blocklist check. Instead, you should recursively call _find_blocked_commands on the nested command string to leverage the full suite of detection logic (shlex + regex).
# The next token(s) are the shell command string
for j in range(i + 1, len(tokens)):
blocked |= _find_blocked_commands(tokens[j])| shell_func = None | ||
| if isinstance(func, ast.Attribute): | ||
| if isinstance(func.value, ast.Name): | ||
| if func.value.id in self.os_aliases: | ||
| shell_func = f"os.{func.attr}" | ||
| elif func.value.id in self.subprocess_aliases: |
There was a problem hiding this comment.
The AST checker currently only detects dangerous functions when called as attributes (e.g., os.system()). It fails to detect them if they are imported directly (e.g., from os import system; system()). You should track imported names in visit_ImportFrom and check ast.Name nodes in visit_Call to close this bypass.
| shell_func = f"os.{func.attr}" | ||
| elif func.value.id in self.subprocess_aliases: | ||
| shell_func = f"subprocess.{func.attr}" | ||
|
|
| lowered = command.lower() | ||
| for word in _BLOCKED_COMMANDS: | ||
| pattern = ( | ||
| rf"(?:^|[;&|`\n]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?{re.escape(word)}\b" |
There was a problem hiding this comment.
The regex prefix group misses several common shell command boundaries such as quotes (for commands inside strings), parentheses (subshells), and braces. For example, bash -c "sudo ls" might bypass this regex because sudo is preceded by a double quote. Consider adding quotes and other shell metacharacters to the prefix set.
pattern = (
rf"(?:^|[;\u0026|`\n\s\"'({!]\s*|[$]\(\s*|\u003c\(\s*)(?:[\w./\\-]*/)?{re.escape(word)}\b"
)(sudo whoami) was not caught because ( was not in the regex character class for shell command boundaries. Add ( to the set alongside ;, &, |, backtick, newline.
- Track from-imports of dangerous functions (from os import system, from subprocess import run as r, etc.) via shell_exec_aliases dict so bare-name calls are detected by the AST checker - Include the active Python interpreter and virtualenv directories in the sanitized PATH so pip, uv, and Studio packages remain accessible in the sandbox - Add Windows-specific blocked commands (rmdir, takeown, icacls, runas, powershell, pwsh) only on win32 platform - Add os.posix_spawn and os.posix_spawnp to _SHELL_EXEC_FUNCS - Handle tuple literals same as list literals in AST argument inspection (both _extract_strings_from_list and _is_safe_literal)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1befb8fb17
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # Check from-import aliases: from os import system; system(...) | ||
| shell_func = self.shell_exec_aliases.get(func.id) | ||
|
|
There was a problem hiding this comment.
Detect assigned aliases to os/subprocess exec callables
visit_Call resolves bare-name calls only via self.shell_exec_aliases, which is populated from import statements, so aliases created by assignment are missed. For example, runner = subprocess.run followed by runner(['rm','-rf','/tmp/x']) is treated as safe and bypasses the blocked-command check in _check_code_safety, allowing dangerous commands through the Python tool. Please also track assignment-derived aliases (or conservatively treat such bare calls as unsafe).
Useful? React with 👍 / 👎.
| if sys.platform != "win32": | ||
| popen_kwargs["preexec_fn"] = _sandbox_preexec |
There was a problem hiding this comment.
Avoid preexec_fn in threaded tool execution paths
This now injects _sandbox_preexec via preexec_fn, but Python’s preexec_fn path is unsafe in multi-threaded parents and can deadlock before exec when other threads hold runtime/import locks. The backend runs with background threads, so this can cause intermittent hangs for tool calls rather than clean execution/timeout behavior. Consider moving the sandbox setup to a non-Python launcher or another mechanism that does not execute Python in preexec_fn.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request significantly strengthens the security of the code execution sandbox by introducing multi-layered protections. It implements a more sophisticated command blocklist using shlex and regex, a minimal environment configuration to prevent credential leakage, and a pre-execution hook to enforce resource limits and prevent privilege escalation. Additionally, the AST-based safety check is expanded to detect dangerous os and subprocess calls. Review feedback correctly identifies a vulnerability in the AST visitor where direct function imports (e.g., from os import system) could bypass detection and suggests using recursion to more effectively scan nested shell commands.
| s = _extract_string_from_node(arg) | ||
| if s is not None: | ||
| found |= _find_blocked_commands(s) | ||
| strs = _extract_strings_from_list(arg) |
There was a problem hiding this comment.
The current AST analysis for shell escapes has a critical vulnerability: it doesn't handle functions imported directly with from os import system. An attacker could use from os import system; system('sudo ...') to bypass the sandbox, as the visitor only checks for attribute access like os.system(...).
Additionally, the logic here is incorrect. It assumes that any from os import ... statement makes os a valid alias for the module, which is not true and can lead to false positives (e.g., for from os import path).
To fix this, the SignalEscapeVisitor needs to be updated to correctly track imported functions and their aliases.
-
In
SignalEscapeVisitor.__init__, add a dictionary to track aliased functions:self.shell_func_aliases = {}
-
Replace the current logic in
visit_ImportFromforosandsubprocesswith the following to correctly populate the alias dictionary:elif node.module in ("os", "subprocess"): for alias in node.names: full_name = f"{node.module}.{alias.name}" if full_name in _SHELL_EXEC_FUNCS: # Store the alias for the dangerous function self.shell_func_aliases[alias.asname or alias.name] = full_name
-
Update
visit_Callto check for these aliased functions when the called function is anast.Namenode:# In visit_Call, around line 758 shell_func = None if isinstance(func, ast.Attribute): if isinstance(func.value, ast.Name): if func.value.id in self.os_aliases: shell_func = f"os.{func.attr}" elif func.value.id in self.subprocess_aliases: shell_func = f"subprocess.{func.attr}" elif isinstance(func, ast.Name): # Check for directly imported functions like `from os import system` shell_func = self.shell_func_aliases.get(func.id) if shell_func and shell_func in _SHELL_EXEC_FUNCS: # ... rest of the logic is fine
| blocked.add(word) | ||
|
|
||
| # 3. Check for nested shell invocations (bash -c 'sudo whoami'). | ||
| # When a token is an argument to a shell -c flag, scan its words |
There was a problem hiding this comment.
The use of str.split() to parse the nested command string reintroduces a vulnerability that shlex.split() is intended to prevent throughout the rest of the function. An attacker could bypass this check with a quoted command, for example: bash -c '"sudo" whoami'. The .split() call would produce ['"sudo"', 'whoami'], and os.path.basename('"sudo"') would not match sudo in the blocklist.
A more robust approach would be to recursively call _find_blocked_commands on the nested command string. This ensures all the same parsing logic (shlex, regex, etc.) is applied consistently.
# Recursively check the nested command string to handle
# quoted commands and other shell metacharacters.
blocked.update(_find_blocked_commands(tokens[i + 1]))…anning - Only inspect command-carrying keyword arguments (args, command, executable, path, file) in the AST checker, not control flags like check=True, text=True, capture_output=True which are booleans and were incorrectly flagged as non-literal dynamic arguments - Replace split() in nested shell detection with recursive call to _find_blocked_commands so that quoted commands (bash -c '"sudo" whoami') and semicolons (bash -c "sudo;ls") within nested shells are properly detected through the full shlex + regex pipeline
for more information, see https://pre-commit.ci
|
/gemini review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 03249794e3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| token == "-c" | ||
| and i > 0 | ||
| and os.path.basename(tokens[i - 1]).lower() in _SHELLS |
There was a problem hiding this comment.
Scan Windows
/c payloads for blocked commands
The nested-shell branch only recurses when the flag token is -c, but Windows commands are executed via _get_shell_cmd() as cmd /c .... That means inputs like cmd /c "powershell -Command rm ..." never have their inner command string re-scanned, so blocked executables can pass through when they are inside the /c payload. This is a sandbox bypass specifically on Windows paths.
Useful? React with 👍 / 👎.
| base = os.path.basename(token).lower() | ||
| if base in _BLOCKED_COMMANDS: |
There was a problem hiding this comment.
Unquote Windows tokens before blocked-command matching
In the Windows tokenization path (shlex.split(..., posix=False)), quoted executables keep their quote characters, so os.path.basename(token) on a token like "powershell" does not equal powershell. Because the regex path also expects command-boundary forms without a leading quote, commands such as "powershell" -Command ... evade the blocklist while still executing correctly in cmd.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request significantly hardens the sandbox environment for executing code and shell commands by implementing robust command blocking, environment sanitization, and privilege dropping. The changes introduce regex-based command detection, a secure environment builder, and pre-execution hooks to limit system resources and prevent privilege escalation. I have identified two critical areas for improvement: the use of preexec_fn in subprocess.Popen is known to be unsafe in multi-threaded applications due to potential deadlocks, and the hardcoded path to libc.so.6 should be replaced with a dynamic lookup using ctypes.util.find_library('c') to ensure cross-distribution compatibility.
| env = safe_env, | ||
| ) | ||
| if sys.platform != "win32": | ||
| popen_kwargs["preexec_fn"] = _sandbox_preexec |
There was a problem hiding this comment.
The preexec_fn argument in subprocess.Popen is not thread-safe and can cause deadlocks in multi-threaded applications if the child process attempts to acquire locks held by the parent. Since this is a server environment, consider using start_new_session=True or preexec_fn with extreme caution, or preferably, move the privilege dropping logic to a wrapper script executed by the subprocess.
| try: | ||
| import ctypes | ||
|
|
||
| libc = ctypes.CDLL("libc.so.6", use_errno = True) |
Addresses two Gemini review findings:
1. preexec_fn thread safety: _sandbox_preexec previously imported ctypes
and resource inside the function body, which runs between fork() and
exec() in the child process. In a multi-threaded server, this could
deadlock if the import machinery locks were held by another thread at
fork time. Now all imports and the libc handle are resolved once at
module load time, so _sandbox_preexec only calls C-level functions
(prctl, setrlimit) with no Python import activity.
2. Hardcoded libc.so.6 path: replaced with ctypes.util.find_library("c")
which works on glibc (libc.so.6), musl (libc.musl-*.so.1), and other
Linux distributions where libc has a different soname.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request significantly enhances the security of the sandbox environment by implementing robust command blocking, environment sanitization, and privilege restriction for subprocesses. It introduces a more sophisticated detection mechanism for blocked commands using shlex and regex, adds a safe environment builder to strip sensitive credentials, and applies pre-execution hooks to limit system resources and prevent privilege escalation. Additionally, it extends the AST-based static analysis to detect potential shell escapes in Python code. My feedback suggests optimizing the regex-based command detection by compiling a single pattern, simplifying the path deduplication logic using dict.fromkeys, and moving the _CMD_KWARGS constant to a higher scope to avoid redundant redefinitions.
| lowered = command.lower() | ||
| for word in _BLOCKED_COMMANDS: | ||
| pattern = ( | ||
| rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?{re.escape(word)}\b" | ||
| ) | ||
| if re.search(pattern, lowered): | ||
| blocked.add(word) |
There was a problem hiding this comment.
This loop iterates through each blocked command, compiling and running a regex for each one. For better performance, you can combine all blocked words into a single regex pattern. This avoids repeated regex compilation and searching within the loop.
| lowered = command.lower() | |
| for word in _BLOCKED_COMMANDS: | |
| pattern = ( | |
| rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?{re.escape(word)}\b" | |
| ) | |
| if re.search(pattern, lowered): | |
| blocked.add(word) | |
| lowered = command.lower() | |
| if _BLOCKED_COMMANDS: | |
| blocked_words_pattern = "|".join(re.escape(w) for w in _BLOCKED_COMMANDS) | |
| pattern = rf"(?:^|[;&|\u0060\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?({blocked_words_pattern})\b" | |
| blocked.update(re.findall(pattern, lowered)) |
| seen = set() | ||
| deduped = [] | ||
| for p in path_entries: | ||
| if p and p not in seen: | ||
| seen.add(p) | ||
| deduped.append(p) |
There was a problem hiding this comment.
This pattern for deduplicating a list while preserving order can be expressed more concisely using dict.fromkeys(), which is order-preserving since Python 3.7. This improves code readability.
| seen = set() | |
| deduped = [] | |
| for p in path_entries: | |
| if p and p not in seen: | |
| seen.add(p) | |
| deduped.append(p) | |
| deduped = list(dict.fromkeys(p for p in path_entries if p)) |
| if shell_func and shell_func in _SHELL_EXEC_FUNCS: | ||
| # Only inspect keyword args that carry command content, not | ||
| # control flags like check=True, text=True, capture_output=True. | ||
| _CMD_KWARGS = {"args", "command", "executable", "path", "file"} |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e2deca7a8c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| token == "-c" | ||
| and i > 0 | ||
| and os.path.basename(tokens[i - 1]).lower() in _SHELLS |
There was a problem hiding this comment.
Handle combined shell options in nested command scanning
The nested-shell recursion only triggers when the option token is exactly -c and the immediately previous token is the shell name, so commands like bash -lc 'rm -rf /tmp/x' or bash --login -c 'sudo ...' are treated as safe and pass the blocklist. This bypasses the new nested-shell protection in both terminal execution (_bash_exec uses _find_blocked_commands directly) and Python static checks (_check_args_for_blocked also relies on _find_blocked_commands), allowing blocked commands to execute via common shell flag combinations.
Useful? React with 👍 / 👎.
…nt hoisting - Combine per-word regex loop into a single re.findall with alternation pattern, avoiding repeated regex compilation and searching - Replace manual dedup loop with dict.fromkeys for PATH entries - Hoist _CMD_KWARGS frozenset out of visit_Call to avoid recreating it on every AST node visit
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request significantly enhances the security of the code execution sandbox by introducing robust command blocking, environment sanitization, and resource limits. Key additions include a more sophisticated blocked command detector using shlex and regex, a minimal environment builder, and AST-based analysis to prevent shell escapes in Python code. Feedback was provided to improve the nested shell detection logic by making it cross-platform, specifically addressing Windows-specific shell invocations like 'cmd /c' which were previously overlooked.
| _SHELLS = {"bash", "sh", "zsh", "dash", "ksh", "csh", "tcsh", "fish"} | ||
| for i, token in enumerate(tokens): | ||
| if ( | ||
| token == "-c" | ||
| and i > 0 | ||
| and os.path.basename(tokens[i - 1]).lower() in _SHELLS | ||
| ): |
There was a problem hiding this comment.
The nested shell detection logic is specific to Unix-like shells and the -c flag. It will not detect nested commands on Windows that use cmd /c, which is a potential security gap as cmd is not in the blocklist.
To make this check cross-platform, you should use platform-specific shell names and command flags.
| _SHELLS = {"bash", "sh", "zsh", "dash", "ksh", "csh", "tcsh", "fish"} | |
| for i, token in enumerate(tokens): | |
| if ( | |
| token == "-c" | |
| and i > 0 | |
| and os.path.basename(tokens[i - 1]).lower() in _SHELLS | |
| ): | |
| if sys.platform == "win32": | |
| _SHELLS, c_flag = {"cmd"}, "/c" | |
| else: | |
| _SHELLS, c_flag = {"bash", "sh", "zsh", "dash", "ksh", "csh", "tcsh", "fish"}, "-c" | |
| for i, token in enumerate(tokens): | |
| # On Windows, flags can be case-insensitive. | |
| is_c_flag = token.lower() == c_flag if sys.platform == "win32" else token == c_flag | |
| if ( | |
| is_c_flag | |
| and i > 0 | |
| and os.path.basename(tokens[i - 1]).lower() in _SHELLS | |
| ): |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 57ca43792c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| lowered = command.lower() | ||
| if _BLOCKED_COMMANDS: | ||
| words_alt = "|".join(re.escape(w) for w in sorted(_BLOCKED_COMMANDS)) | ||
| pattern = rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?({words_alt})\b" |
There was a problem hiding this comment.
Block commands reconstructed from command substitutions
The blocklist scan only matches blocked executables when they appear literally in the command string (or immediately after $(), but bash lets callers synthesize the executable name from substitution output. For example, $(printf rm) -rf /tmp/x executes rm while _find_blocked_commands returns no hit, so _bash_exec allows the command. This is a practical bypass for destructive commands in Unix terminal execution.
Useful? React with 👍 / 👎.
The nested shell scan only checked for Unix shells (bash -c, sh -c, etc). Add cmd /c and cmd.exe /c detection so that Windows nested shell invocations are also recursively scanned for blocked commands. The token scan already catches blocked commands at any position, so this is defense-in-depth for consistency across platforms.
for more information, see https://pre-commit.ci
|
/gemini review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d029e2f674
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| for token in tokens: | ||
| base = os.path.basename(token).lower() | ||
| if base in _BLOCKED_COMMANDS: |
There was a problem hiding this comment.
Block variable-expanded commands in shell checks
_find_blocked_commands only matches literal command tokens/regex words and never accounts for shell variable expansion, so blocked executables can be reconstructed at runtime and evade the filter. For example, cmd=rm; $cmd -rf /tmp/x produces no hit here, yet _bash_exec will execute rm; the same helper is also reused by the Python safety checker, so this bypass applies there too.
Useful? React with 👍 / 👎.
| if isinstance(func, ast.Attribute): | ||
| if isinstance(func.value, ast.Name): | ||
| if func.value.id in self.os_aliases: | ||
| shell_func = f"os.{func.attr}" | ||
| elif func.value.id in self.subprocess_aliases: |
There was a problem hiding this comment.
Flag indirect os/subprocess call targets as unsafe
The AST call resolution only recognizes attribute calls rooted at a simple name (e.g., os.system) or tracked bare-name aliases, so indirect forms such as getattr(os, 'system')(...) and importlib.import_module('subprocess').run(...) are skipped entirely. In those cases _check_code_safety returns None, which allows blocked commands to execute through the Python tool despite this hardening pass.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request enhances the security of the execution sandbox by introducing a robust command blocklist, a minimal environment builder, and a pre-execution hook to drop privileges and set resource limits. It also extends the AST-based safety checker to identify dangerous shell invocations. Reviewer feedback suggests improving the command detection logic by including Python interpreters in nested shell checks, refining token matching to avoid false positives, fixing Windows path compatibility in regex patterns, and pre-compiling regexes for better performance.
| _SHELLS = {"bash", "sh", "zsh", "dash", "ksh", "csh", "tcsh", "fish"} | ||
| _SHELLS_WIN = {"cmd", "cmd.exe"} |
There was a problem hiding this comment.
The nested shell detection logic currently only covers standard Unix shells and cmd.exe. However, a common bypass for terminal blocklists is to use python -c or perl -e to execute arbitrary code (e.g., python -c 'import os; os.system("sudo ...")'). Since the terminal tool allows running the Python interpreter, these should be included in the recursion list to ensure the nested command string is also scanned.
| _SHELLS = {"bash", "sh", "zsh", "dash", "ksh", "csh", "tcsh", "fish"} | |
| _SHELLS_WIN = {"cmd", "cmd.exe"} | |
| _SHELLS = {"bash", "sh", "zsh", "dash", "ksh", "csh", "tcsh", "fish", "python", "python3"} | |
| _SHELLS_WIN = {"cmd", "cmd.exe", "python.exe"} |
| for token in tokens: | ||
| base = os.path.basename(token).lower() | ||
| if base in _BLOCKED_COMMANDS: | ||
| blocked.add(base) |
There was a problem hiding this comment.
The current logic in Step 1 iterates over all tokens and blocks the command if any token's basename matches a blocked command. This leads to false positives where a blocked word is used as an argument or within a string, e.g., echo sudo or git commit -m "fixed sudo bug". The blocklist should ideally only apply to tokens in command positions. Since Step 2 (regex scanning) already attempts to identify command boundaries, consider making Step 1 more selective (e.g., only checking the first token) or relying more on the regex/tokenization boundaries.
| words_alt = "|".join(re.escape(w) for w in sorted(_BLOCKED_COMMANDS)) | ||
| pattern = rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?({words_alt})\b" |
There was a problem hiding this comment.
The regex pattern for detecting blocked commands at shell boundaries only accounts for forward slashes in path prefixes. On Windows, commands may be invoked with backslash separators (e.g., C:\Windows\System32\sudo). The character class [\w./\-]* correctly includes the backslash, but the trailing / in the non-capturing group (?:[\w./\-]*/)? prevents it from matching paths that use backslashes as the final separator before the command name.
| words_alt = "|".join(re.escape(w) for w in sorted(_BLOCKED_COMMANDS)) | |
| pattern = rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?({words_alt})\b" | |
| words_alt = "|".join(re.escape(w) for w in sorted(_BLOCKED_COMMANDS)) | |
| pattern = rf"(?:^|[;&|\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*[\\/])?({words_alt})\b" |
| if _BLOCKED_COMMANDS: | ||
| words_alt = "|".join(re.escape(w) for w in sorted(_BLOCKED_COMMANDS)) | ||
| pattern = rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?({words_alt})\b" | ||
| blocked.update(re.findall(pattern, lowered)) |
There was a problem hiding this comment.
… -c) The nested shell scan only matched token == "-c" with the immediately preceding token being a shell name. This missed: - Combined flags: bash -lc 'rm ...' (-lc ends with c, is a valid combined flag meaning -l -c) - Interleaved flags: bash --login -c 'sudo ...' (--login sits between bash and -c) Now matches any short flag ending in 'c' (e.g. -lc, -xc, -ic) and walks backwards past intermediate flags to find the shell binary.
for more information, see https://pre-commit.ci
|
/gemini review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 539dc648db
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if prev.startswith("-") or prev.startswith("/"): | ||
| continue # skip flags |
There was a problem hiding this comment.
Recognize absolute shell paths in nested -c scanning
In _find_blocked_commands, the backward scan skips any token starting with / as if it were a flag, which causes absolute shell executables like /bin/bash and /bin/sh to be ignored. As a result, commands such as /bin/bash -c "rm -rf /tmp/x" are not recursively scanned and pass the blocklist, creating a direct bypass in both terminal execution and Python argument checks that rely on this helper.
Useful? React with 👍 / 👎.
| fq = f"{node.module}.{alias.name}" | ||
| if fq in _SHELL_EXEC_FUNCS: | ||
| self.shell_exec_aliases[alias.asname or alias.name] = fq |
There was a problem hiding this comment.
Reject wildcard os/subprocess imports in safety analysis
visit_ImportFrom only records explicit imported names, so wildcard imports are treated as safe even when they expose dangerous callables. With from subprocess import * (or from os import *), later calls like run([...])/system(...) are unresolved and _check_code_safety returns None, which bypasses the new shell-escape guard for blocked commands.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request strengthens the execution sandbox by introducing environment sanitization, resource limits, and enhanced command blocking using shlex and regex. It also adds an AST-based safety checker to detect dangerous os and subprocess calls in Python code. Review feedback identifies several security improvements, including accounting for Windows drive letters in regex patterns, tracking local variable assignments in the AST checker to prevent aliasing bypasses, and including the Python interpreter in the recursive shell scanning to prevent command execution pivots.
| lowered = command.lower() | ||
| if _BLOCKED_COMMANDS: | ||
| words_alt = "|".join(re.escape(w) for w in sorted(_BLOCKED_COMMANDS)) | ||
| pattern = rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?({words_alt})\b" |
There was a problem hiding this comment.
The regex pattern for detecting blocked commands with path prefixes does not account for Windows drive letters (e.g., C:). This could allow a bypass on Windows if a blocked command is invoked using an absolute path starting with a drive letter in a multi-command string (e.g., ls & C:\Windows\System32\runas.exe).
| pattern = rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-]*/)?({words_alt})\b" | |
| pattern = rf"(?:^|[;&|`\n(]\s*|[$]\(\s*|<\(\s*)(?:[\w./\\-:]*/|[a-zA-Z]:\\)?({words_alt})\b" |
| # Check from-import aliases: from os import system; system(...) | ||
| shell_func = self.shell_exec_aliases.get(func.id) | ||
|
|
||
| if shell_func and shell_func in _SHELL_EXEC_FUNCS: |
There was a problem hiding this comment.
The AST safety checker for Python code execution does not track local variable assignments. An attacker could bypass the check by assigning a dangerous function to a local variable before calling it (e.g., s = os.system; s('sudo ...')). While PR_SET_NO_NEW_PRIVS provides a strong kernel-level defense on Linux, this bypass remains a risk on other platforms like macOS where such kernel-level restrictions are not applied in this PR.
| tokens = set(command.lower().split()) | ||
| blocked = tokens & _BASH_BLOCKED_WORDS | ||
| # Block dangerous commands (shlex + regex based) | ||
| blocked = _find_blocked_commands(command) |
There was a problem hiding this comment.
The terminal tool's blocklist can be bypassed by pivoting through the Python interpreter. Since python is not in the _SHELLS list for recursive scanning, an LLM could execute a blocked command via python -c "import os; os.system('sudo ...')". Although the subprocess environment is sanitized and restricted by PR_SET_NO_NEW_PRIVS on Linux, this bypass allows executing commands that the blocklist specifically intends to prevent, especially on non-Linux platforms.
Addresses three high-consensus findings from 20-reviewer pass: 1. /bin/bash -c 'sudo whoami' bypassed nested shell scan because the backwards flag-skip logic treated paths starting with / as flags. Now only skips tokens starting with - as Unix flags; on Windows only skips short /X flags (not /bin/bash style paths). [9/20] 2. RLIMIT_NPROC=256 caused subprocess.run to fail with EAGAIN because Linux enforces NPROC per real UID, not per process tree. Removed RLIMIT_NPROC entirely; RLIMIT_FSIZE and PR_SET_NO_NEW_PRIVS remain as the primary resource and privilege controls. [5/20] 3. AST checker rejected safe dynamic subprocess usage like cmd=["git","status"]; subprocess.run(cmd) as shell_escape_dynamic. Now only flags dynamic args for shell-string functions (os.system, os.popen, subprocess.getoutput, etc.) or when shell=True is explicitly set. List-based subprocess calls with shell=False (the default) do not pass through a shell and are not flagged. [12/20]
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e35e8a0990
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| shell_kwarg_true = any( | ||
| kw.arg == "shell" | ||
| and isinstance(kw.value, ast.Constant) | ||
| and kw.value.value is True | ||
| for kw in node.keywords |
There was a problem hiding this comment.
Treat non-literal shell flags as unsafe
The dynamic-shell guard only activates when shell is a literal True, so calls like s = True; subprocess.run(cmd, shell=s) skip the dynamic-argument check entirely. In that path, _check_code_safety returns None even when cmd contains blocked commands (for example rm -f ...), and _python_exec will run them. This creates a straightforward bypass whenever shell is passed via a variable or expression.
Useful? React with 👍 / 👎.
| cmd_kw_values = [ | ||
| kw.value | ||
| for kw in node.keywords | ||
| if kw.arg in _CMD_KWARGS or kw.arg is None # **kwargs | ||
| ] |
There was a problem hiding this comment.
Expand kwargs dictionaries before command safety checks
When subprocess/os calls use **kwargs, the analyzer stores the whole kwargs expression as one opaque node instead of inspecting the contained keys/values, and _check_args_for_blocked only understands string/list literals. That means subprocess.run(**{"args": "rm -f x.txt", "shell": True}) is treated as safe and executes blocked commands, because neither the command string nor shell=True is actually extracted from the expanded kwargs payload.
Useful? React with 👍 / 👎.
…ction Gemini review found that Windows absolute paths (C:\Windows\System32\ shutdown.exe) and executable extensions (.exe, .com, .bat, .cmd) were not handled: - Token scan now strips .exe/.com/.bat/.cmd extensions before checking the blocklist, so sudo.exe matches sudo, shutdown.bat matches shutdown - Regex pattern now includes optional Windows drive letter prefix ([a-zA-Z]:[/\\]) and optional executable extension suffix, so commands after shell metacharacters with full Windows paths are also caught
|
/gemini review |
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ac7cb80864
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if shell_func in _STRING_SHELL_FUNCS or shell_kwarg_true: | ||
|
|
There was a problem hiding this comment.
Reject dynamic command paths in os.exec/spawn calls**
Dynamic-argument rejection is currently gated behind shell_func in _STRING_SHELL_FUNCS or shell_kwarg_true, which excludes direct-exec APIs like os.execvp, os.execvpe, and os.spawn*. As a result, blocked executables can bypass _check_code_safety when provided via variables (for example cmd='rm'; os.execvp(cmd, [cmd, '-f', 'victim.txt'])), and the Python tool then executes them. Since these functions directly launch binaries without shell=True, this condition leaves a practical sandbox bypass for the new blocklist logic.
Useful? React with 👍 / 👎.
…ion false positive
Addresses three findings from second 20-reviewer pass:
1. **kwargs dict expansion (9/20): subprocess.run(**{"args": "rm ...",
"shell": True}) bypassed the AST checker because **kwargs were
treated as opaque. Now expands literal dict **kwargs to inspect
their keys, and flags opaque **kwargs (variable dicts) as unsafe.
2. Non-literal shell= values (7/20): shell=variable was treated as
shell=False (safe). Now any shell= value that is not literally
False is treated as potentially True (conservative default).
3. except Exception false positive (1/20): except Exception in a loop
was flagged as timeout evasion, but Exception does not catch
SystemExit or KeyboardInterrupt which are used for timeout
enforcement. Narrowed to only flag except BaseException and
except TimeoutError in loops.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 83de70d651
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if _libc is not None: | ||
| try: | ||
| # PR_SET_NO_NEW_PRIVS = 38, arg2 = 1 (enable) | ||
| _libc.prctl(38, 1, 0, 0, 0) |
There was a problem hiding this comment.
Check prctl result before assuming no-new-privs is active
_sandbox_preexec() calls prctl through ctypes, but syscall failures return -1 and set errno rather than raising OSError; the current try/except therefore treats denied/unsupported PR_SET_NO_NEW_PRIVS calls as success. In environments with older kernels or restrictive seccomp profiles, this silently disables the intended kernel-level privilege guard while execution continues normally, weakening the hardening this commit adds.
Useful? React with 👍 / 👎.
| env = { | ||
| "PATH": os.pathsep.join(deduped), | ||
| "HOME": workdir, | ||
| "TMPDIR": workdir, |
There was a problem hiding this comment.
Set Windows temp variables in sandboxed child environment
The sanitized environment only sets TMPDIR, but Windows tooling (including Python's tempfile behavior) relies on TEMP/TMP, so sandboxed subprocesses on Windows will write temporary files to the global system temp directory instead of the per-session workdir. This is a sandbox-isolation regression specific to Windows runs and can leak artifacts across sessions.
Useful? React with 👍 / 👎.
…nslothai#4827) * fix(studio): harden sandbox security for terminal and python tools The existing command blocklist used naive str.split() which is trivially bypassable via quoting, full paths, nested shells, variable expansion, and cross-tool pivoting through Python os.system/subprocess. Fixes unslothai#4818. Changes: - Replace str.split() blocklist with shlex.split() + os.path.basename() tokenization and regex scanning at shell command boundaries - Add sanitized subprocess environment (_build_safe_env) that strips credentials (HF_TOKEN, WANDB_API_KEY, GH_TOKEN, AWS_*, etc.) and restricts PATH to /usr/local/bin:/usr/bin:/bin - Add PR_SET_NO_NEW_PRIVS via prctl on Linux so sudo/su/pkexec fail at the kernel level regardless of how they are invoked - Add RLIMIT_NPROC (256) and RLIMIT_FSIZE (100MB) to prevent fork bombs and disk filling attacks - Extend AST safety checker to detect os.system(), os.popen(), subprocess.run/Popen/call/check_output, os.exec*, os.spawn* calls containing blocked commands or dynamic (non-literal) arguments - Add cross-platform support: cmd.exe on Windows, bash on Unix; CREATE_NO_WINDOW flag on Windows, preexec_fn on Unix - Expand blocklist from 7 to 14 commands: add su, chown, passwd, mount, umount, fdisk, kill, killall, pkill - Apply all layers to both _bash_exec and _python_exec Zero measurable performance overhead -- shlex parsing and a single prctl syscall per subprocess fork. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix review findings: exception_catching dead code, false positives, process substitution - Include exception_catching reasons in _check_code_safety so bare except-in-loop timeout evasion is actually blocked (was computed in _check_signal_escape_patterns but never read by the caller) - Remove base.split() inner loop that caused false positives on quoted text arguments containing blocked words (e.g. echo "kill this process") - Add targeted nested shell detection for bash/sh/zsh -c arguments instead, which catches bash -c 'sudo whoami' without false positives - Add <() process substitution to the regex character class so diff <(rm -rf /path) is also caught - Fix error message to say "unsafe patterns" instead of specifically mentioning signal manipulation when other categories trigger * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address review feedback: regex paths, keyword args, list element scanning - Regex now matches blocked commands after optional path prefix at shell boundaries (catches ls; /usr/bin/sudo and similar) - Nested shell detection uses os.path.basename so bash -c "/bin/rm" is caught - AST checker now inspects keyword arguments (not just positional) so subprocess.run(args="sudo ...", shell=True) is detected - List elements in subprocess calls are now checked via _find_blocked_commands for consistency (catches subprocess.run(["bash", "-c", "rm -rf /"])) - Dynamic argument check uses _is_safe_literal that validates list contents are all string literals * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix nested shell scan to only check the script body, not positional args bash -c 'script' arg0 arg1 -- only tokens[i+1] is the script body; subsequent tokens are $0, $1 positional parameters passed to the script and are not executed as shell commands. Scanning all remaining tokens caused false positives. * Add subshell parentheses to regex command boundary detection (sudo whoami) was not caught because ( was not in the regex character class for shell command boundaries. Add ( to the set alongside ;, &, |, backtick, newline. * Address high-priority review findings from 7 parallel reviewers - Track from-imports of dangerous functions (from os import system, from subprocess import run as r, etc.) via shell_exec_aliases dict so bare-name calls are detected by the AST checker - Include the active Python interpreter and virtualenv directories in the sanitized PATH so pip, uv, and Studio packages remain accessible in the sandbox - Add Windows-specific blocked commands (rmdir, takeown, icacls, runas, powershell, pwsh) only on win32 platform - Add os.posix_spawn and os.posix_spawnp to _SHELL_EXEC_FUNCS - Handle tuple literals same as list literals in AST argument inspection (both _extract_strings_from_list and _is_safe_literal) * Fix false positive on check=True kwargs and recursive nested shell scanning - Only inspect command-carrying keyword arguments (args, command, executable, path, file) in the AST checker, not control flags like check=True, text=True, capture_output=True which are booleans and were incorrectly flagged as non-literal dynamic arguments - Replace split() in nested shell detection with recursive call to _find_blocked_commands so that quoted commands (bash -c '"sudo" whoami') and semicolons (bash -c "sudo;ls") within nested shells are properly detected through the full shlex + regex pipeline * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Move preexec_fn imports to module level and use find_library for libc Addresses two Gemini review findings: 1. preexec_fn thread safety: _sandbox_preexec previously imported ctypes and resource inside the function body, which runs between fork() and exec() in the child process. In a multi-threaded server, this could deadlock if the import machinery locks were held by another thread at fork time. Now all imports and the libc handle are resolved once at module load time, so _sandbox_preexec only calls C-level functions (prctl, setrlimit) with no Python import activity. 2. Hardcoded libc.so.6 path: replaced with ctypes.util.find_library("c") which works on glibc (libc.so.6), musl (libc.musl-*.so.1), and other Linux distributions where libc has a different soname. * Apply Gemini style suggestions: combined regex, dict.fromkeys, constant hoisting - Combine per-word regex loop into a single re.findall with alternation pattern, avoiding repeated regex compilation and searching - Replace manual dedup loop with dict.fromkeys for PATH entries - Hoist _CMD_KWARGS frozenset out of visit_Call to avoid recreating it on every AST node visit * Add cmd /c nested shell detection for Windows parity The nested shell scan only checked for Unix shells (bash -c, sh -c, etc). Add cmd /c and cmd.exe /c detection so that Windows nested shell invocations are also recursively scanned for blocked commands. The token scan already catches blocked commands at any position, so this is defense-in-depth for consistency across platforms. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Handle combined shell flags (-lc, -xc) and interleaved flags (--login -c) The nested shell scan only matched token == "-c" with the immediately preceding token being a shell name. This missed: - Combined flags: bash -lc 'rm ...' (-lc ends with c, is a valid combined flag meaning -l -c) - Interleaved flags: bash --login -c 'sudo ...' (--login sits between bash and -c) Now matches any short flag ending in 'c' (e.g. -lc, -xc, -ic) and walks backwards past intermediate flags to find the shell binary. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix /bin/bash bypass, remove RLIMIT_NPROC, reduce AST false positives Addresses three high-consensus findings from 20-reviewer pass: 1. /bin/bash -c 'sudo whoami' bypassed nested shell scan because the backwards flag-skip logic treated paths starting with / as flags. Now only skips tokens starting with - as Unix flags; on Windows only skips short /X flags (not /bin/bash style paths). [9/20] 2. RLIMIT_NPROC=256 caused subprocess.run to fail with EAGAIN because Linux enforces NPROC per real UID, not per process tree. Removed RLIMIT_NPROC entirely; RLIMIT_FSIZE and PR_SET_NO_NEW_PRIVS remain as the primary resource and privilege controls. [5/20] 3. AST checker rejected safe dynamic subprocess usage like cmd=["git","status"]; subprocess.run(cmd) as shell_escape_dynamic. Now only flags dynamic args for shell-string functions (os.system, os.popen, subprocess.getoutput, etc.) or when shell=True is explicitly set. List-based subprocess calls with shell=False (the default) do not pass through a shell and are not flagged. [12/20] * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Handle Windows drive letter paths and .exe extensions in command detection Gemini review found that Windows absolute paths (C:\Windows\System32\ shutdown.exe) and executable extensions (.exe, .com, .bat, .cmd) were not handled: - Token scan now strips .exe/.com/.bat/.cmd extensions before checking the blocklist, so sudo.exe matches sudo, shutdown.bat matches shutdown - Regex pattern now includes optional Windows drive letter prefix ([a-zA-Z]:[/\\]) and optional executable extension suffix, so commands after shell metacharacters with full Windows paths are also caught * Handle **kwargs dict expansion, non-literal shell=, and except Exception false positive Addresses three findings from second 20-reviewer pass: 1. **kwargs dict expansion (9/20): subprocess.run(**{"args": "rm ...", "shell": True}) bypassed the AST checker because **kwargs were treated as opaque. Now expands literal dict **kwargs to inspect their keys, and flags opaque **kwargs (variable dicts) as unsafe. 2. Non-literal shell= values (7/20): shell=variable was treated as shell=False (safe). Now any shell= value that is not literally False is treated as potentially True (conservative default). 3. except Exception false positive (1/20): except Exception in a loop was flagged as timeout evasion, but Exception does not catch SystemExit or KeyboardInterrupt which are used for timeout enforcement. Narrowed to only flag except BaseException and except TimeoutError in loops. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Why the existing sandbox-hardening code is edited (FA3 rationale): - `_sandbox_preexec` (tools.py:241) and its `_libc.prctl(PR_SET_NO_NEW_PRIVS)` block (tools.py:257-260) were introduced by unslothai#4827 / unslothai#5375 to harden tool subprocesses. That contract is preserved: the original function is renamed to `_sandbox_preexec_impl(apply_no_new_privs)` and a same-named `_sandbox_preexec()` wrapper calls it with `apply_no_new_privs=True`, so every direct caller still sets NNP. The new `_sandbox_preexec_for_bwrap()` skips NNP only on the Linux+bwrap path where bwrap itself reapplies NNP to the inner payload (setuid-helper bwrap installs cannot acquire helper privileges if NNP is set before execve). Net result: the inner sandboxed program still runs with NNP=1. - The `if sys.platform != "win32": popen_kwargs["preexec_fn"] = ...` / `else: popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW` blocks in `_python_exec` (tools.py:1615-1618) and `_bash_exec` (tools.py:1709-1712) come from unslothai#4827. They are reorganised, not removed: the non-Windows path still sets `preexec_fn` and the Windows path still sets `CREATE_NO_WINDOW`. The only behavioural delta is the bwrap branch (Linux + sandbox_available()) selecting `_sandbox_preexec_for_bwrap` per above. - `_workdirs[key] = workdir` (tools.py:348) from unslothai#4416 is replaced with `_workdirs[key] = os.path.realpath(workdir)`. Same caching behaviour; the realpath normalization is required because the bwrap argv binds `os.path.realpath(workdir)` while `tempfile.mkstemp(dir=workdir)` builds tmp_path from the non-realpath value, so on hosts where HOME or a parent is a symlink the script path passed via inner_argv is not reachable inside the sandbox namespace. studio/backend/core/inference/sandbox.py - Narrow Linux /etc bind from the whole directory to a runtime allowlist (hosts, resolv.conf, nsswitch.conf, localtime, ld.so.cache, ld.so.conf and ld.so.conf.d, ssl, ca-certificates, pki) so sandboxed tools cannot read sshd_config, machine-id, cloud-init data, etc. - Narrow Seatbelt /private/etc the same way; replace the broad subpath with literal entries for hosts, resolv.conf, nsswitch.conf, localtime, protocols, services and subpaths for ssl and ca-certificates. - Constrain Seatbelt allow process-exec to the same subpaths already used in file-map-executable so sandboxed Python cannot exec arbitrary system binaries such as osascript. - Remove allow file-read and file-write on literal /dev/tty from the Seatbelt profile; tool output is captured via PIPE and there is no legitimate path that needs to bypass it. - Gate site.getusersitepackages() behind UNSLOTH_STUDIO_SANDBOX_ALLOW_USER_SITE=1 so the sandbox no longer binds the real ~/.local/lib subtree by default (HOME is already rewritten to workdir for the child). - Resolve _BWRAP_PROBE_BIN via shutil.which at module load with the /usr/bin/true fallback, so the probe works on Alpine and BusyBox layouts that place coreutils under /bin instead of /usr/bin. - Replace the unreachable _linux_bwrap_path-or-bwrap fallback with an assertion that names the sandbox_available contract; same for the unreachable RuntimeError in build_sandbox_argv. studio/backend/run.py - Run sandbox_available on a background thread with a 2s bounded wait. On hardened kernels (unprivileged_userns_clone=0) the probe hangs for its full 5s timeout; running it inline delayed the access banner. The probe still runs (and caches) eagerly; if it has not reported back within 2s the banner is printed first and the unavailable notice is suppressed for this call.
Summary
Fixes #4818. The terminal and python tool sandboxing in Studio could be bypassed to run privileged commands like sudo and chmod (as reported and verified by the issue author).
The root cause is the command blocklist using naive str.split(), which is trivially bypassed via quoting, full paths, nested shells, shell metacharacters, subshells, process substitution, and cross-tool pivoting through Python os.system/subprocess.
This PR hardens sandbox security with multiple independent defense layers, all cross-platform (Linux, macOS, Windows) and with zero measurable performance overhead.
Changes
Better command tokenization -- shlex.split + os.path.basename + regex scanning at shell command boundaries (subshells, process substitution, optional path prefixes). Targeted nested shell detection for bash/sh/zsh -c arguments.
Sanitized subprocess environment -- build_safe_env strips all credentials (HF_TOKEN, WANDB_API_KEY, GH_TOKEN, AWS, LD_PRELOAD, DYLD_) while preserving the active Python interpreter and virtualenv directories in PATH.
OS-level privilege restriction -- PR_SET_NO_NEW_PRIVS via prctl on Linux makes sudo/su/pkexec fail at the kernel level. RLIMIT_NPROC (256) prevents fork bombs. RLIMIT_FSIZE (100MB) prevents disk filling.
Extended AST safety checker -- Detects os.system, os.popen, subprocess.run/Popen/call, os.exec*, os.spawn*, os.posix_spawn* with blocked commands or dynamic args. Tracks aliased imports (import os as myos) and from-imports (from subprocess import run as r). Inspects keyword arguments. Handles list and tuple literals. Includes exception_catching detection for timeout evasion.
Cross-platform shell support -- cmd /c on Windows, bash -c on Unix. CREATE_NO_WINDOW on Windows. Windows-specific blocked commands (rmdir, takeown, icacls, runas, powershell, pwsh).
Expanded blocklist -- From 7 to 14+ commands.
Test plan