23 changes: 18 additions & 5 deletions docs/design/tools.md
Original file line number Diff line number Diff line change
Expand Up @@ -176,11 +176,24 @@ Strategy selection via `sandboxing.docker.lifecycle.strategy` in `SandboxingConf
The sidecar container shares the sandbox container's lifetime (created and destroyed
together, since they share a network namespace).

> **Status**: The lifecycle protocol, config, factory, and three strategy
> implementations are complete. Integration into `DockerSandbox.execute()` is
> in progress; the `owner_id` parameter is accepted and the config field is
> wired, but the Docker backend does not yet dispatch to the lifecycle strategy.
> Until wired, all executions use the current per-call ephemeral behaviour.
The configured default is `per-agent` (the `strategy` field default in
`SandboxLifecycleConfig`); the table above is authoritative. The strategy is
constructed at boot (`workers/runtime_builder`) with the application clock and
injected into `DockerSandbox` via the sandbox factory. Each tool call runs as
a `docker exec` inside a long-lived idle container (`tail -f /dev/null`
entrypoint) the strategy acquires; per-agent and per-task reuse the container
across calls while per-call destroys it immediately after the single exec. The
lifecycle owner is resolved from an explicit `owner_id`, else the structlog
correlation context (`agent_id` for per-agent, `task_id` for per-task). The
per-call degradation below is a per-invocation safety fallback, not a change
of the configured default: when a reuse strategy cannot derive an owner for a
given call, that single call degrades to ephemeral per-call behaviour while
the configured strategy stays in force for calls that can resolve an owner.
`AgentEngineExecutionService` releases the owner at the
task boundary (per-task destroys immediately; per-agent starts the grace
timer so a subsequent task for the same agent within the window re-acquires
the warm container); `DockerSandbox.cleanup()` destroys all strategy-owned
containers via `cleanup_all()`. Containers carry the `synthorg.managed=true`
label so the reconciliation pass reclaims any container orphaned by an unclean exit.
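
The owner resolution and per-call degradation described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `resolve_owner`, `REUSE_CONTEXT_KEYS`, and the plain-dict `context` argument are hypothetical stand-ins for whatever `DockerSandbox` and the structlog correlation context actually expose, and the handling of an explicit `owner_id` under `per-call` is an assumption.

```python
from __future__ import annotations

from typing import Mapping

# Hypothetical mapping: which correlation-context key supplies the owner
# for each reuse strategy (agent_id for per-agent, task_id for per-task).
REUSE_CONTEXT_KEYS = {"per-agent": "agent_id", "per-task": "task_id"}


def resolve_owner(
    strategy: str,
    explicit_owner_id: str | None,
    context: Mapping[str, str],
) -> tuple[str, str | None]:
    """Return (effective_strategy, owner) for a single tool call.

    An explicit owner_id wins; otherwise the owner is read from the
    correlation context. If a reuse strategy cannot derive an owner,
    only this call degrades to per-call; the configured strategy is
    not modified.
    """
    key = REUSE_CONTEXT_KEYS.get(strategy)
    if key is None:
        # Assumption: per-call ignores any owner, since nothing is reused.
        return "per-call", None
    owner = explicit_owner_id or context.get(key)
    if owner is None:
        return "per-call", None  # per-invocation safety fallback
    return strategy, owner
```

Because the function returns the effective strategy per call rather than mutating configuration, a later call that can resolve an owner still gets the configured reuse behaviour.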

## Git Clone SSRF Prevention

Expand Down
1 change: 1 addition & 0 deletions scripts/_ghost_wiring_manifest.txt
Expand Up @@ -32,3 +32,4 @@ ENFORCED build_autonomy_change_strategy #1957 -- built at boot in app_builders,
PENDING BaselineStore #1959 -- construct at boot (window from budget.baseline_window_size)
PENDING CoordinationMetricsCollector #1959 -- construct at boot, thread into execution
ENFORCED IntakeEngine #1961 -- wired at boot via client/runtime_builder.build_client_simulation_runtime
ENFORCED create_lifecycle_strategy #1965 -- constructed at boot in workers/runtime_builder._build_tool_registry and injected into the Docker sandbox backend
38 changes: 36 additions & 2 deletions scripts/git-hooks/_run-hook.sh
Expand Up @@ -39,6 +39,40 @@ shift

ROOT="$(git rev-parse --show-toplevel)"
export UV_FROZEN=1
exec uv run --frozen --project "$ROOT" python -m pre_commit hook-impl \

# Durable full-output log. A bare `git commit`/`git push` routes the
# entire pre-commit/pre-push stream (the affected-pytest dot output
# alone runs tens of KB) through whatever invoked git; terminals and
# tool output caps truncate that, and the actual failure -- the pytest
# summary or mypy error, which lands at the very END -- scrolls off, so
# the run looks like it produced no diagnostic. Teeing every byte to a
# file in the git dir means the complete output is ALWAYS recoverable
# regardless of any caller's truncation, and on failure the short
# failing tail is re-emitted to stderr so even a clipped terminal shows
# the actionable signal. The log lives under the git dir (per-worktree
# via --git-path), never the working tree, so it cannot dirty
# pre-commit's "files were modified by this hook" check.
LOG_DIR="$(git rev-parse --git-path synthorg-hooks)"
mkdir -p "$LOG_DIR"
LOG="$LOG_DIR/${hook_type}-last.log"

set +e
uv run --frozen --project "$ROOT" python -m pre_commit hook-impl \
--config=.pre-commit-config.yaml --hook-type="$hook_type" \
--hook-dir "$ROOT/scripts/git-hooks" -- "$@"
--hook-dir "$ROOT/scripts/git-hooks" -- "$@" 2>&1 | tee "$LOG"
status=${PIPESTATUS[0]}
set -e

if [ "$status" -ne 0 ]; then
{
echo
echo "=================================================================="
echo "git ${hook_type} hook FAILED (exit ${status})."
echo "Full untruncated output: ${LOG}"
echo "--- failing tail (last 60 lines; read the full log above if needed) ---"
tail -n 60 "$LOG"
echo "=================================================================="
} >&2
fi

exit "$status"
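
For readers more at home in Python than Bash, the tee-and-tail pattern the hook uses can be sketched as below. This is an illustrative equivalent only, not part of the PR: `run_logged` and its parameters are hypothetical names, and the hook itself relies on `tee` plus `PIPESTATUS[0]` rather than anything like this.

```python
from __future__ import annotations

import subprocess
import sys
from collections import deque
from pathlib import Path


def run_logged(cmd: list[str], log_path: Path, tail_lines: int = 60) -> int:
    """Run cmd, mirror its combined output to stdout and a log file,
    and re-emit the last tail_lines lines to stderr on failure."""
    tail: deque[str] = deque(maxlen=tail_lines)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", encoding="utf-8") as log, subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as 2>&1 in the hook
        text=True,
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            sys.stdout.write(line)  # live stream, may be truncated by callers
            log.write(line)  # durable copy, always complete
            tail.append(line)
    # Leaving the Popen context waits for the child, so returncode is set.
    status = proc.returncode
    if status != 0:
        print(f"hook FAILED (exit {status}); full output: {log_path}", file=sys.stderr)
        sys.stderr.writelines(tail)
    return status
```

The child's exit status is returned rather than the status of the logging step, which is the same reason the shell version reads `PIPESTATUS[0]` instead of `$?` after the pipe into `tee`.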
25 changes: 19 additions & 6 deletions scripts/run_affected_tests.py
Expand Up @@ -83,10 +83,23 @@ class _GitError(Exception):
"""Raised when a required git command fails."""


def _git(*args: str) -> str:
"""Run a git command and return stripped stdout.

Raises ``_GitError`` on non-zero exit so callers fail closed.
def _git(*args: str, strip: bool = True) -> str:
"""Run a git command and return its stdout.

Args:
args: Git argv tokens.
strip: When ``True`` (default) the whole stdout blob is
``str.strip()``-ed for convenience. Callers parsing
``--porcelain`` output MUST pass ``strip=False``: porcelain
v1 status codes are two columns and the first column is a
space for worktree-only modifications (`` M path``).
Stripping the blob eats that leading space on the first
line, shifting that line's fixed-index slices by one (so
``scripts/...`` parses as ``cripts/...`` and the subsequent
``git restore`` fails on a bogus pathspec).

Raises:
_GitError: On non-zero exit so callers fail closed.
"""
result = subprocess.run(
["git", *args],
Expand All @@ -98,7 +111,7 @@ def _git(*args: str) -> str:
if result.returncode != 0:
msg = f"git {' '.join(args)} failed: {result.stderr.strip()}"
raise _GitError(msg)
return result.stdout.strip()
return result.stdout.strip() if strip else result.stdout


def _merge_base() -> str:
Expand Down Expand Up @@ -839,7 +852,7 @@ def _tracked_dirty_paths() -> set[str]:
this guard's job. Renames (``R``) carry ``orig -> new``; both sides
are recorded so a hook-induced rename is fully reconciled.
"""
porcelain = _git("status", "--porcelain")
porcelain = _git("status", "--porcelain", strip=False)
paths: set[str] = set()
for line in porcelain.splitlines():
if not line or line.startswith("??"):
Expand Down
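
The failure mode that motivates `strip=False` is easy to reproduce without git. The sketch below is a simplified stand-in for the parsing loop, not the real `_tracked_dirty_paths` (it ignores renames, and `parse_porcelain_paths` is a hypothetical name); it assumes porcelain v1 lines of the form `XY path`, where the path starts at index 3.

```python
from __future__ import annotations


def parse_porcelain_paths(porcelain: str) -> list[str]:
    """Extract paths from porcelain v1 output using a fixed-index slice."""
    paths: list[str] = []
    for line in porcelain.splitlines():
        if not line or line.startswith("??"):
            continue  # skip blanks and untracked files
        paths.append(line[3:])  # two status columns plus one separator space
    return paths


# Worktree-only modifications: the first status column is a space.
raw = " M scripts/run_affected_tests.py\n M src/app.py\n"

intact = parse_porcelain_paths(raw)
# -> ['scripts/run_affected_tests.py', 'src/app.py']

stripped = parse_porcelain_paths(raw.strip())
# -> ['cripts/run_affected_tests.py', 'src/app.py']
```

Only the first line is damaged, because `str.strip()` removes leading whitespace from the blob as a whole; later lines keep their leading space. That makes the bug intermittent in practice: it appears only when the first reported entry is a worktree-only modification.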
2 changes: 2 additions & 0 deletions src/synthorg/observability/events/docker.py
Expand Up @@ -14,3 +14,5 @@
DOCKER_CLEANUP: Final[str] = "docker.cleanup"
DOCKER_HEALTH_CHECK: Final[str] = "docker.health_check"
DOCKER_DAEMON_UNAVAILABLE: Final[str] = "docker.daemon.unavailable"
DOCKER_EXEC_STREAM_CLOSE_FAILED: Final[str] = "docker.exec.stream_close_failed"
DOCKER_EXEC_INSPECT_FAILED: Final[str] = "docker.exec.inspect_failed"
7 changes: 7 additions & 0 deletions src/synthorg/observability/events/sandbox.py
Expand Up @@ -43,3 +43,10 @@
SANDBOX_LIFECYCLE_CLEANUP: Final[str] = "sandbox.lifecycle.cleanup"
SANDBOX_LIFECYCLE_IDLE_EXPIRED: Final[str] = "sandbox.lifecycle.idle.expired"
SANDBOX_LIFECYCLE_DESTROY_FAILED: Final[str] = "sandbox.lifecycle.destroy_failed"
SANDBOX_LIFECYCLE_DISPATCH: Final[str] = "sandbox.lifecycle.dispatch"
SANDBOX_LIFECYCLE_OWNER_DEGRADED: Final[str] = "sandbox.lifecycle.owner.degraded"
SANDBOX_CONTAINER_TRACK_FAILED: Final[str] = "sandbox.container.track.failed"
SANDBOX_CONTAINER_UNTRACK_FAILED: Final[str] = "sandbox.container.untrack.failed"
SANDBOX_CONTAINER_LOGS_COLLECT_FAILED: Final[str] = (
"sandbox.container.logs.collect.failed"
)
6 changes: 6 additions & 0 deletions src/synthorg/observability/events/workers.py
Expand Up @@ -80,3 +80,9 @@
"workers.execution_service.no_provider"
)
WORKERS_EXECUTION_SERVICE_FAILED: Final[str] = "workers.execution_service.failed"
WORKERS_EXECUTION_SERVICE_SANDBOX_RELEASED: Final[str] = (
"workers.execution_service.sandbox_released"
)
WORKERS_EXECUTION_SERVICE_SANDBOX_RELEASE_FAILED: Final[str] = (
"workers.execution_service.sandbox_release_failed"
)