2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -133,7 +133,7 @@ src/synthorg/
subscribers/ # Concrete settings subscribers (ProviderSettingsSubscriber — rebuilds ModelRouter on strategy change, MemorySettingsSubscriber — advisory logging for memory config)
security/ # SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
templates/ # Pre-built company templates, personality presets, and builder
-tools/ # Tool registry, built-in tools (file_system/, git, sandbox/, code_runner), git clone SSRF prevention (git_url_validator), MCP bridge (mcp/), role-based access, approval tool (request_human_approval), tool factory (build_default_tools, build_default_tools_from_config)
+tools/ # Tool registry, built-in tools (file_system/, git, sandbox/, code_runner), git clone SSRF prevention (git_url_validator), MCP bridge (mcp/), role-based access, approval tool (request_human_approval), tool factory (build_default_tools, build_default_tools_from_config), sandbox factory (sandbox/factory.py: build_sandbox_backends, resolve_sandbox_for_category, cleanup_sandbox_backends -- per-category backend selection from SandboxingConfig)

web/ # Vue 3 + PrimeVue + Tailwind CSS dashboard
src/
2 changes: 1 addition & 1 deletion docs/architecture/tech-stack.md
@@ -111,7 +111,7 @@ These conventions are used throughout the codebase. For full details on each, se
| **Parallel tool execution** | Adopted | `asyncio.TaskGroup` in `ToolInvoker.invoke_all` with optional `max_concurrency` semaphore and structured error collection. |
| **Parallel agent execution** | Adopted | `ParallelExecutor` with `TaskGroup` + `Semaphore` concurrency limits, `ResourceLock` for exclusive file-path claims, progress tracking, and shutdown awareness. |
| **Tool permission checking** | Adopted | Category-level gating based on `ToolAccessLevel`. Priority-based resolution: denied list, allowed list, level categories, then deny. |
-| **Tool sandboxing** | Adopted | Layered: in-process path validation for file system tools, `SubprocessSandbox` for git tools, `DockerSandbox` planned for code execution. |
+| **Tool sandboxing** | Adopted | Layered: in-process path validation for file system tools, `SubprocessSandbox` for git tools, `DockerSandbox` for code execution. Per-category backend selection via `SandboxingConfig` and sandbox factory. |
| **Crash recovery** | Adopted | Pluggable `RecoveryStrategy` protocol. Current: `FailAndReassignStrategy`. Planned: `CheckpointStrategy` for per-turn state persistence. |
| **Personality compatibility** | Adopted | Weighted composite scoring: 60% Big Five similarity, 20% collaboration alignment, 20% conflict approach. |
| **Agent behavior testing** | Planned | Scripted `FakeProvider` for unit tests; behavioral outcome assertions for integration tests. |
7 changes: 7 additions & 0 deletions docs/design/operations.md
@@ -463,6 +463,13 @@ isolation for high-risk tools.
network_policy: "deny-all" # default deny, allowlist per tool
```

Per-category backend selection is implemented in `tools/sandbox/factory.py` via three functions:
`build_sandbox_backends` (instantiates only the backends referenced by config),
`resolve_sandbox_for_category` (looks up the correct backend for a `ToolCategory`), and
`cleanup_sandbox_backends` (parallel cleanup with error isolation). The tool factory
(`build_default_tools_from_config`) wires `VERSION_CONTROL` category; other categories will
be wired as their tool builders are added.
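
As a rough illustration of the build / resolve / cleanup lifecycle described above, the sketch below uses stand-in classes. It is not the code in `tools/sandbox/factory.py`: `FakeBackend`, the plain-string category names, and the omission of the `workspace` argument are assumptions made to keep the example self-contained. `asyncio.gather(..., return_exceptions=True)` is one plausible way to get parallel cleanup with error isolation.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SandboxingConfig:
    """Stand-in config: one default backend plus per-category overrides."""

    default_backend: str = "subprocess"
    overrides: dict[str, str] = field(default_factory=dict)

    def backend_for_category(self, category: str) -> str:
        # Fall back to the default when the category has no override.
        return self.overrides.get(category, self.default_backend)


class FakeBackend:
    """Hypothetical backend that records whether cleanup ran."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.fail_cleanup = False
        self.cleaned = False

    async def cleanup(self) -> None:
        if self.fail_cleanup:
            msg = f"{self.name} cleanup failed"
            raise RuntimeError(msg)
        self.cleaned = True


def build_sandbox_backends(*, config: SandboxingConfig) -> dict[str, FakeBackend]:
    # Instantiate only the backend names the config actually references.
    names = {config.default_backend, *config.overrides.values()}
    return {name: FakeBackend(name) for name in sorted(names)}


def resolve_sandbox_for_category(
    *,
    config: SandboxingConfig,
    backends: dict[str, FakeBackend],
    category: str,
) -> FakeBackend:
    # Raises KeyError when the configured name is missing from the mapping.
    return backends[config.backend_for_category(category)]


async def cleanup_sandbox_backends(
    *,
    backends: dict[str, FakeBackend],
) -> list[BaseException]:
    # Run every cleanup concurrently; one failure does not cancel the others.
    results = await asyncio.gather(
        *(backend.cleanup() for backend in backends.values()),
        return_exceptions=True,
    )
    return [r for r in results if isinstance(r, BaseException)]
```

With an override such as `{"code_execution": "docker"}`, the sketch builds two backends, resolves `version_control` to the default, and returns cleanup failures to the caller instead of raising on the first one.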

Docker is optional -- only required when code execution, terminal, web, or database tools are
enabled. File system and git tools work out of the box with subprocess isolation. This keeps
the local-first experience lightweight while providing strong isolation where it matters.
6 changes: 6 additions & 0 deletions src/synthorg/observability/events/sandbox.py
@@ -14,3 +14,9 @@
SANDBOX_HEALTH_CHECK: Final[str] = "sandbox.health_check"
SANDBOX_KILL_FAILED: Final[str] = "sandbox.kill.failed"
SANDBOX_KILL_FALLBACK: Final[str] = "sandbox.kill.fallback"
SANDBOX_FACTORY_BUILT: Final[str] = "sandbox.factory.built"
SANDBOX_FACTORY_BUILD_FAILED: Final[str] = "sandbox.factory.built.failed"
SANDBOX_FACTORY_RESOLVE: Final[str] = "sandbox.factory.resolve"
SANDBOX_FACTORY_RESOLVE_FAILED: Final[str] = "sandbox.factory.resolve.failed"
SANDBOX_FACTORY_CLEANUP: Final[str] = "sandbox.factory.cleanup"
SANDBOX_FACTORY_CLEANUP_FAILED: Final[str] = "sandbox.factory.cleanup.failed"
68 changes: 63 additions & 5 deletions src/synthorg/tools/factory.py
@@ -1,4 +1,4 @@
-"""Tool factory instantiate built-in workspace tools with config-driven parameters.
+"""Tool factory -- instantiate built-in workspace tools with config-driven parameters.

Provides ``build_default_tools`` (core factory) and
``build_default_tools_from_config`` (convenience wrapper that
@@ -9,6 +9,7 @@

from typing import TYPE_CHECKING

from synthorg.core.enums import ToolCategory
from synthorg.observability import get_logger
from synthorg.observability.events.tool import (
TOOL_FACTORY_BUILT,
@@ -30,8 +31,13 @@
GitLogTool,
GitStatusTool,
)
from synthorg.tools.sandbox.factory import (
build_sandbox_backends,
resolve_sandbox_for_category,
)

 if TYPE_CHECKING:
+    from collections.abc import Mapping
     from pathlib import Path

     from synthorg.config.schema import RootConfig
@@ -127,35 +133,87 @@ def build_default_tools(
     return result


+def _resolve_vc_sandbox(
+    *,
+    config: RootConfig,
+    sandbox_backends: Mapping[str, SandboxBackend] | None,
+    workspace: Path,
+) -> SandboxBackend:
+    """Resolve the sandbox backend for the VERSION_CONTROL category.
+
+    Builds backends from config when *sandbox_backends* is ``None``.
+    """
+    if sandbox_backends is None:
+        sandbox_backends = build_sandbox_backends(
+            config=config.sandboxing,
+            workspace=workspace,
+        )
+    return resolve_sandbox_for_category(
+        config=config.sandboxing,
+        backends=sandbox_backends,
+        category=ToolCategory.VERSION_CONTROL,
+    )
Comment on lines +146 to +155

⚠️ Potential issue | 🟠 Major

Avoid instantiating unrelated backends in the VERSION_CONTROL-only path.

When sandbox_backends is None, Line 147 builds all configured backends (default + every override), but this code path only consumes ToolCategory.VERSION_CONTROL. That can eagerly construct unused backends (notably Docker), adding avoidable startup failures/dependencies for git-only tool builds.

♻️ Proposed fix
 def _resolve_vc_sandbox(
     *,
     config: RootConfig,
     sandbox_backends: Mapping[str, SandboxBackend] | None,
     workspace: Path,
 ) -> SandboxBackend:
@@
-    if sandbox_backends is None:
-        sandbox_backends = build_sandbox_backends(
-            config=config.sandboxing,
-            workspace=workspace,
-        )
+    if sandbox_backends is None:
+        vc_backend_name = config.sandboxing.backend_for_category(
+            ToolCategory.VERSION_CONTROL.value,
+        )
+        vc_only_config = config.sandboxing.model_copy(
+            update={
+                "default_backend": vc_backend_name,
+                "overrides": {},
+            },
+        )
+        sandbox_backends = build_sandbox_backends(
+            config=vc_only_config,
+            workspace=workspace,
+        )

As per coding guidelines, "If implementation deviates from the design spec, alert the user and explain why — user decides whether to proceed or update the spec."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/factory.py` around lines 146 - 155, The current code
eagerly calls build_sandbox_backends when sandbox_backends is None, which
constructs all configured backends (including Docker) even though only
ToolCategory.VERSION_CONTROL is needed; change the flow to only build or resolve
the backends required for VERSION_CONTROL: either pass sandbox_backends=None
into resolve_sandbox_for_category so it lazily constructs only the requested
backend, or update/build a new helper (e.g., build_sandbox_backends_for_category
or extend build_sandbox_backends with a categories/categories_allowed parameter)
and call that with category=ToolCategory.VERSION_CONTROL (keeping
config=config.sandboxing and workspace=workspace) before calling
resolve_sandbox_for_category; update references to sandbox_backends,
build_sandbox_backends, resolve_sandbox_for_category, and
ToolCategory.VERSION_CONTROL accordingly.
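
One way to picture the second option in the prompt above (a category-scoped builder) is the sketch below. It is illustrative only: `build_backend_for_category`, `CONSTRUCTORS`, and the recording constructors are hypothetical names, categories are plain strings, and backends are represented by strings rather than real `SandboxBackend` instances.

```python
from collections.abc import Callable, Mapping

# Records which constructors actually ran, so eager construction is visible.
constructed: list[str] = []


def _recording_ctor(name: str) -> Callable[[], str]:
    """Return a hypothetical backend constructor that logs its own use."""

    def ctor() -> str:
        constructed.append(name)
        return f"{name}-backend"

    return ctor


# Hypothetical registry: backend name -> zero-argument constructor.
CONSTRUCTORS: Mapping[str, Callable[[], str]] = {
    "subprocess": _recording_ctor("subprocess"),
    "docker": _recording_ctor("docker"),
}


def build_backend_for_category(
    *,
    default_backend: str,
    overrides: Mapping[str, str],
    category: str,
) -> dict[str, str]:
    """Construct only the backend that *category* resolves to."""
    name = overrides.get(category, default_backend)
    return {name: CONSTRUCTORS[name]()}


# Docker is configured for code execution, but a git-only build never touches it.
backends = build_backend_for_category(
    default_backend="subprocess",
    overrides={"code_execution": "docker"},
    category="version_control",
)
```

The returned mapping has the same name-to-instance shape that `resolve_sandbox_for_category` consumes in the diff, so a helper along these lines could feed it without changing the resolver.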



 def build_default_tools_from_config(
     *,
     workspace: Path,
     config: RootConfig,
     sandbox: SandboxBackend | None = None,
+    sandbox_backends: Mapping[str, SandboxBackend] | None = None,
 ) -> tuple[BaseTool, ...]:
     """Build default tools using parameters from a ``RootConfig``.

     Convenience wrapper that extracts ``config.git_clone`` and
-    delegates to :func:`build_default_tools`.
+    ``config.sandboxing`` to resolve per-category sandbox backends.

+    Currently wires the ``VERSION_CONTROL`` category (git tools).
+    Other categories (e.g. ``CODE_EXECUTION``) will be wired as
+    their respective tool builders are added to the factory.
+
+    Sandbox resolution priority:
+    1. Explicit *sandbox* -- backward-compat single backend for all tools.
+    2. Explicit *sandbox_backends* -- per-category resolution via config.
+    3. Neither -- auto-build backends from ``config.sandboxing``.
+
     Args:
         workspace: Absolute path to the agent workspace root.
         config: Validated root configuration.
-        sandbox: Optional sandbox backend for subprocess
-            isolation (passed to git tools).
+        sandbox: Optional single sandbox backend (overrides per-category
+            resolution when provided).
+        sandbox_backends: Pre-built mapping of backend name to instance.
+            When provided, per-category resolution uses this map
+            instead of auto-building backends.

     Returns:
         Sorted tuple of ``BaseTool`` instances.

     Raises:
         ValueError: If *workspace* is not an absolute path.
+        KeyError: If per-category sandbox resolution finds a backend
+            name not present in the built or provided backends mapping.
     """
     logger.debug(
         TOOL_FACTORY_CONFIG_ENTRY,
         source="config",
     )
+
+    if sandbox is not None:
+        # Explicit single backend -- backward compat
+        return build_default_tools(
+            workspace=workspace,
+            git_clone_policy=config.git_clone,
+            sandbox=sandbox,
+        )
+
+    vc_sandbox = _resolve_vc_sandbox(
+        config=config,
+        sandbox_backends=sandbox_backends,
+        workspace=workspace,
+    )
+
     return build_default_tools(
         workspace=workspace,
         git_clone_policy=config.git_clone,
-        sandbox=sandbox,
+        sandbox=vc_sandbox,
     )
8 changes: 8 additions & 0 deletions src/synthorg/tools/sandbox/__init__.py
@@ -4,6 +4,11 @@
from .docker_config import DockerSandboxConfig
from .docker_sandbox import DockerSandbox
from .errors import SandboxError, SandboxStartError, SandboxTimeoutError
from .factory import (
build_sandbox_backends,
cleanup_sandbox_backends,
resolve_sandbox_for_category,
)
from .protocol import SandboxBackend
from .result import SandboxResult
from .sandboxing_config import SandboxingConfig
@@ -20,4 +25,7 @@
"SandboxingConfig",
"SubprocessSandbox",
"SubprocessSandboxConfig",
"build_sandbox_backends",
"cleanup_sandbox_backends",
"resolve_sandbox_for_category",
]
5 changes: 2 additions & 3 deletions src/synthorg/tools/sandbox/docker_sandbox.py
@@ -13,6 +13,7 @@
import aiodocker
import aiodocker.containers

from synthorg.core.types import NotBlankStr
from synthorg.observability import get_logger
from synthorg.observability.events.docker import (
DOCKER_CLEANUP,
@@ -35,8 +36,6 @@
 if TYPE_CHECKING:
     from collections.abc import Mapping

-    from synthorg.core.types import NotBlankStr
-
logger = get_logger(__name__)

_DEFAULT_CONFIG = DockerSandboxConfig()
@@ -645,4 +644,4 @@ async def health_check(self) -> bool:

     def get_backend_type(self) -> NotBlankStr:
         """Return ``'docker'``."""
-        return "docker"
+        return NotBlankStr("docker")
Comment on lines 645 to +647

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== NotBlankStr definition =="
rg -n -C2 '^\s*NotBlankStr\s*=' src/synthorg/core/types.py

echo
echo "== get_backend_type implementations =="
rg -n -C2 '\bdef get_backend_type\s*\(' src/synthorg/tools/sandbox

echo
echo "== Constructor-style NotBlankStr usage =="
rg -n '\bNotBlankStr\s*\(' src/synthorg

Repository: Aureliolo/synthorg

Length of output: 9129


Return a plain string instead of calling NotBlankStr as a constructor.

NotBlankStr is a type alias (Annotated[str, ...]), not a callable type. While NotBlankStr("docker") won't fail at runtime, it's incorrect usage. Instead, return the plain string:

def get_backend_type(self) -> NotBlankStr:
    """Return ``'docker'``."""
    return "docker"  # type: ignore[return-value]

Or use cast for clarity:

from typing import cast
return cast(NotBlankStr, "docker")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_sandbox.py` around lines 645 - 647, The
get_backend_type function currently constructs NotBlankStr("docker") incorrectly
(NotBlankStr is a type alias/Annotated[str,...]); change it to return a plain
string instead — replace NotBlankStr("docker") with "docker" and annotate the
return to satisfy typing (e.g. add a type ignore comment like `# type:
ignore[return-value]` or use typing.cast(NotBlankStr, "docker")) so
get_backend_type returns a plain str while preserving the NotBlankStr return
annotation.
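
To make the runtime behaviour discussed above concrete, here is a small sketch. It assumes `NotBlankStr` is an `Annotated[str, ...]` alias; the definition below is a stand-in with string metadata, not the project's actual one in `synthorg.core.types`.

```python
from typing import Annotated, cast

# Stand-in alias (assumption): the real NotBlankStr carries validation metadata.
NotBlankStr = Annotated[str, "non-blank"]


def get_backend_type() -> NotBlankStr:
    """Return the backend name without calling the alias."""
    # cast() is a no-op at runtime; it only informs the type checker.
    return cast(NotBlankStr, "docker")


# Calling the alias delegates to its origin type, so this is just str("docker").
called = NotBlankStr("docker")
```

Both forms yield an ordinary `str`, and calling the alias performs no validation. Because PEP 593 tells tools to treat `Annotated[T, ...]` as `T` when they do not understand the metadata, a plain `return "docker"` should normally type-check as well, in which case neither the `cast` nor a `# type: ignore` comment is strictly needed.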
