Merged
Changes from 3 commits
4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -56,7 +56,7 @@ src/ai_company/
providers/ # LLM provider abstraction (LiteLLM adapter)
security/ # SecOps agent, approval gates, audit
templates/ # Pre-built company templates, personality presets, and builder
-tools/ # Tool registry, built-in tools (file_system/, git, sandbox/), MCP integration, role-based access
+tools/ # Tool registry, built-in tools (file_system/, git, sandbox/, code_runner), MCP bridge (mcp/), role-based access
```

## Shell Usage
@@ -83,7 +83,7 @@ src/ai_company/
- **Every module** with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
- **Variable name**: always `logger` (not `_logger`, not `log`)
-- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
+- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
- **Structured kwargs**: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
- **All error paths** must log at WARNING or ERROR with context before raising
- **All state transitions** must log at INFO
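
The logging rules in this hunk can be illustrated with a small, self-contained sketch. `RecordingLogger` is a hypothetical stand-in for whatever `get_logger(__name__)` returns (it is not the project's logger), and `call_provider` and `PROVIDER_CALL_FAILED` are invented for illustration. Only the name `PROVIDER_CALL_START` comes from the rules; its string value here is assumed.

```python
from typing import Any, Final

# Assumed values. In the project these would be imported from
# ai_company.observability.events.provider rather than redefined.
PROVIDER_CALL_START: Final[str] = "provider.call.start"
PROVIDER_CALL_FAILED: Final[str] = "provider.call.failed"  # hypothetical name


class RecordingLogger:
    """Hypothetical stand-in for the object returned by get_logger()."""

    def __init__(self) -> None:
        self.records: list[tuple[str, str, dict[str, Any]]] = []

    def info(self, event: str, **kwargs: Any) -> None:
        self.records.append(("info", event, kwargs))

    def error(self, event: str, **kwargs: Any) -> None:
        self.records.append(("error", event, kwargs))


logger = RecordingLogger()  # the variable is always named `logger`


def call_provider(model: str, prompt: str) -> str:
    # Event constant first, context as structured kwargs (no format strings).
    logger.info(PROVIDER_CALL_START, model=model, prompt_chars=len(prompt))
    if not prompt:
        # Error paths log with context before raising.
        logger.error(PROVIDER_CALL_FAILED, model=model, reason="empty prompt")
        raise ValueError("prompt must not be empty")
    return f"{model}: ok"
```

Keeping the event name as a constant and the context as keyword arguments means log consumers can filter on a stable key instead of parsing message strings.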
26 changes: 21 additions & 5 deletions DESIGN_SPEC.md
@@ -2101,7 +2101,7 @@ Tool execution requires safety boundaries proportional to the risk of each tool
| Backend | Isolation | Latency | Dependencies | Status |
|---------|-----------|---------|--------------|--------|
| `SubprocessSandbox` | Process-level: env filtering (allowlist + denylist), restricted PATH (configurable via `extra_safe_path_prefixes`), workspace-scoped cwd, timeout + process-group kill, library injection var blocking, explicit transport cleanup on Windows | ~ms | None | **Implemented** |
-| `DockerSandbox` | Container-level: ephemeral container, mounted workspace, no network, resource limits (CPU/memory/time) | ~1-2s cold start | Docker | Planned |
+| `DockerSandbox` | Container-level: ephemeral container, mounted workspace, no network, resource limits (CPU/memory/time) | ~1-2s cold start | Docker | **Implemented** |
| `K8sSandbox` | Pod-level: per-agent containers, namespace isolation, resource quotas, network policies | ~2-5s | Kubernetes | Future |

#### Default Layered Configuration
@@ -2728,7 +2728,8 @@ Circular inheritance is detected via chain tracking and raises `TemplateInherita
| **Web UI** | Vue 3 + Vite | Modern, fast, good ecosystem. Simpler than React for dashboards |
| **Real-time** | WebSocket (Litestar channels plugin) | Built-in pub/sub broadcasting, per-channel history, backpressure management. Real-time agent activity, task updates, chat feed |
| **Containerization** | Docker + Docker Compose | Isolated code execution, reproducible environments |
-| **Tool Integration** | MCP (Model Context Protocol) | Industry standard for LLM-to-tool integration |
+| **Docker API** | aiodocker | Async-native Docker API client for `DockerSandbox` backend |
+| **Tool Integration** | MCP SDK (`mcp`) | Industry standard for LLM-to-tool integration |
| **Agent Comms** | A2A Protocol compatible | Future-proof inter-agent communication |
| **Config Format** | YAML + Pydantic validation | Human-readable config with strict validation |
| **CLI** | TBD (future, if needed) | Thin wrapper around the REST API for terminal use. May not be needed — interactive Scalar docs at `/docs/api` and `curl`/`httpie` may suffice |
@@ -2960,7 +2961,10 @@ ai-company/
│ │ │ ├── task_routing.py # TASK_ROUTING_* constants
│ │ │ ├── template.py # TEMPLATE_* constants
│ │ │ ├── tool.py # TOOL_* constants
-│ │ │ └── workspace.py # WORKSPACE_* constants
+│ │ │ ├── workspace.py # WORKSPACE_* constants
+│ │ │ ├── code_runner.py # CODE_RUNNER_* constants
+│ │ │ ├── docker.py # DOCKER_* constants
+│ │ │ └── mcp.py # MCP_* constants
│ │ ├── processors.py # Log processors
│ │ ├── setup.py # Logging setup
│ │ └── sinks.py # Log output backends
@@ -3015,9 +3019,21 @@ ai-company/
│ │ ├── _git_base.py # Base class for git tools (workspace, subprocess, sandbox integration)
│ │ ├── _process_cleanup.py # Subprocess transport cleanup utility (Windows ResourceWarning prevention)
│ │ ├── git_tools.py # Git operations — 6 built-in tools (sandbox-aware)
-│ │ ├── code_runner.py # Code execution (M7)
+│ │ ├── docker_config.py # Docker sandbox configuration
+│ │ ├── docker_sandbox.py # DockerSandbox backend (aiodocker)
+│ │ ├── sandboxing_config.py # Top-level sandboxing config (backend selection)
+│ │ ├── code_runner.py # Code execution tool
│ │ ├── web_tools.py # HTTP, search (M7)
-│ │ └── mcp_bridge.py # MCP server integration (M7)
+│ │ └── mcp/ # MCP bridge subpackage
+│ │ ├── __init__.py # Package exports
+│ │ ├── bridge_tool.py # McpBridgeTool (BaseTool integration)
+│ │ ├── cache.py # Tool schema caching

Copilot AI Mar 10, 2026

The DESIGN_SPEC.md describes cache.py as "Tool schema caching" but the actual module provides MCP tool result caching (TTL+LRU), not schema caching. The comment should read something like "MCP result cache (TTL + LRU)".

+│ │ ├── client.py # MCP client wrapper
+│ │ ├── config.py # MCP server/bridge config models
+│ │ ├── errors.py # MCP error hierarchy
+│ │ ├── factory.py # McpBridgeTool factory

Copilot AI Mar 10, 2026

DESIGN_SPEC.md spells the class McpBridgeTool (mixed-case "Mcp" prefix) in the directory tree comments (lines 3029 and 3034), but the actual class is named MCPBridgeTool. These comments in the spec are incorrect and should be updated to MCPBridgeTool.

+│ │ ├── models.py # MCP domain models
+│ │ └── result_mapper.py # MCP result → ToolExecutionResult mapping
Comment thread
coderabbitai[bot] marked this conversation as resolved.
Outdated
│ ├── security/ # Security & approval (M7, stubs only)
│ │ ├── approval.py # Approval workflow gates (M7) — domain model is in core/approval.py
│ │ ├── secops_agent.py # Security operations agent (M7)
6 changes: 3 additions & 3 deletions README.md
@@ -15,7 +15,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents
- **Company Config + Core Models** - Strong Pydantic validation, immutable config models, runtime state models
- **Provider Layer** - LiteLLM-based provider abstraction with routing, retry, and rate limiting
- **Budget Tracking** - Cost records, summaries, and coordination analytics models
-- **Tool System** - File system tools, git tools, sandbox abstraction, permission gating
+- **Tool System** - File system tools, git tools, sandbox abstraction (subprocess + Docker), code runner, MCP bridge, permission gating
Comment thread
coderabbitai[bot] marked this conversation as resolved.
- **Single-Agent Engine (M3)** - ReAct/Plan-Execute loops, fail-and-reassign recovery, graceful shutdown
- **Multi-Agent Core (M4)** - Message bus, delegation with loop prevention, conflict resolution, meeting protocols
- **Task Intelligence (M4)** - Task decomposition, routing, assignment strategies, workspace isolation via git worktrees
@@ -38,7 +38,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents

## Status

-**M7: Security & HR** next (M0–M6 all done). See [DESIGN_SPEC.md](DESIGN_SPEC.md) for the full high-level specification.
+**M7: Security & HR** in progress (M0–M6 all done). See [DESIGN_SPEC.md](DESIGN_SPEC.md) for the full high-level specification.

## Tech Stack

@@ -47,7 +47,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents
- **LiteLLM** for multi-provider LLM abstraction
- **structlog** for structured logging and observability
- **Mem0** for agent memory (initial backend; custom stack future — see [ADR-001](docs/decisions/ADR-001-memory-layer.md))
-- **MCP** for tool integration (planned)
+- **MCP** for tool integration
- **Vue 3** for web dashboard (planned)
- **SQLite** (aiosqlite) → PostgreSQL for operational data persistence

19 changes: 19 additions & 0 deletions docker/sandbox/Dockerfile
@@ -0,0 +1,19 @@
FROM node:22-slim AS node-base

FROM python:3.14-slim

COPY --from=node-base /usr/local/bin/node /usr/local/bin/node
COPY --from=node-base /usr/local/lib/node_modules /usr/local/lib/node_modules
RUN ln -s /usr/local/lib/node_modules/npm/bin/npm-cli.js /usr/local/bin/npm

RUN apt-get update && apt-get install -y --no-install-recommends git \
&& apt-get clean && rm -rf /var/lib/apt/lists/*

Comment on lines +9 to +11

Copilot AI Mar 10, 2026

The Dockerfile does not set ReadOnly mode or drop capabilities at the image layer — those are applied at runtime by DockerSandbox._build_container_config. This is correct design. However, the Dockerfile installs git (line 9) and copies the full Node.js installation including npm. Because ReadonlyRootfs: True is applied at runtime, the sandbox container cannot write to the filesystem — but git and npm may try to write to /home, /tmp, or other paths that are blocked by the read-only rootfs. If agents need to run git or npm inside the sandbox, they will fail silently or with confusing errors. Consider documenting this constraint, or mounting a writable /tmp via tmpfs in the container config to allow tools that need temporary write access.
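
The suggestion in this comment can be sketched as a container-create payload. The keys under `HostConfig` (`ReadonlyRootfs`, `Tmpfs`, `Binds`, `NetworkMode`, `CapDrop`, `Memory`, `NanoCpus`) are Docker Engine API fields. The helper name `build_container_config`, the `HOME=/tmp` setting, and every default value are assumptions for illustration and are not taken from `DockerSandbox._build_container_config`.

```python
from typing import Any


def build_container_config(
    image: str,
    workspace: str,
    *,
    memory_bytes: int = 512 * 1024 * 1024,
    cpus: float = 1.0,
    tmpfs_size: str = "64m",
) -> dict[str, Any]:
    """Build a Docker Engine API create payload (hypothetical helper)."""
    return {
        "Image": image,
        "WorkingDir": "/workspace",
        # Assumption: pointing HOME at the tmpfs gives git and npm a
        # writable place for config and cache files.
        "Env": ["HOME=/tmp"],
        "HostConfig": {
            # Root filesystem stays read-only, as the comment describes.
            "ReadonlyRootfs": True,
            # Writable scratch space that lives only as long as the container.
            "Tmpfs": {"/tmp": f"rw,noexec,nosuid,size={tmpfs_size}"},
            # Workspace is the only host path mounted into the container.
            "Binds": [f"{workspace}:/workspace:rw"],
            "NetworkMode": "none",
            "CapDrop": ["ALL"],
            "Memory": memory_bytes,
            "NanoCpus": int(cpus * 1_000_000_000),
        },
    }
```

With a payload like this, tools that need temporary files can write under `/tmp` while the image contents remain immutable.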

RUN mkdir -p /workspace \
&& useradd --uid 65534 --gid 65534 --no-create-home --shell /usr/sbin/nologin sandbox
Comment thread
greptile-apps[bot] marked this conversation as resolved.
Outdated

WORKDIR /workspace

USER sandbox

CMD ["bash"]
10 changes: 10 additions & 0 deletions pyproject.toml
@@ -13,11 +13,13 @@ classifiers = [
"Typing :: Typed",
]
dependencies = [
+"aiodocker==0.26.0",
"aiosqlite==0.21.0",
"jinja2==3.1.6",
"jsonschema==4.26.0",
"litellm==1.82.0",
"litestar[standard,structlog,pydantic,brotli,prometheus]==2.21.1",
+"mcp==1.26.0",
"pydantic==2.12.5",
"pyyaml==6.0.3",
"structlog==25.5.0",
@@ -157,10 +159,18 @@ ignore_missing_imports = true
module = "jsonschema.*"
ignore_missing_imports = true

+[[tool.mypy.overrides]]
+module = "aiodocker.*"
+ignore_missing_imports = true
+
[[tool.mypy.overrides]]
module = "aiosqlite.*"
ignore_missing_imports = true

+[[tool.mypy.overrides]]
+module = "mcp.*"
+ignore_missing_imports = true
+
[[tool.mypy.overrides]]
module = "litestar.*"
ignore_missing_imports = true
2 changes: 2 additions & 0 deletions src/ai_company/config/defaults.py
@@ -35,4 +35,6 @@ def default_config_dict() -> dict[str, Any]:
"cost_tiers": {},
"org_memory": {},
"api": {},
+"sandboxing": {},
+"mcp": {},
}
12 changes: 12 additions & 0 deletions src/ai_company/config/schema.py
@@ -26,6 +26,8 @@
from ai_company.observability.config import LogConfig # noqa: TC001
from ai_company.observability.events.config import CONFIG_VALIDATION_FAILED
from ai_company.persistence.config import PersistenceConfig
+from ai_company.tools.mcp.config import MCPConfig
+from ai_company.tools.sandbox.sandboxing_config import SandboxingConfig

logger = get_logger(__name__)

@@ -487,6 +489,8 @@ class RootConfig(BaseModel):
cost_tiers: Cost tier definitions.
org_memory: Organizational memory configuration.
api: API server configuration.
+sandboxing: Sandboxing backend configuration.
+mcp: MCP bridge configuration.
"""

model_config = ConfigDict(frozen=True)
@@ -574,6 +578,14 @@ class RootConfig(BaseModel):
default_factory=ApiConfig,
description="API server configuration",
)
+sandboxing: SandboxingConfig = Field(
+default_factory=SandboxingConfig,
+description="Sandboxing backend configuration",
+)
+mcp: MCPConfig = Field(
+default_factory=MCPConfig,
+description="MCP bridge configuration",
+)

@model_validator(mode="after")
def _validate_unique_agent_names(self) -> Self:
8 changes: 8 additions & 0 deletions src/ai_company/observability/events/code_runner.py
@@ -0,0 +1,8 @@
"""Code runner tool event constants."""

from typing import Final

CODE_RUNNER_EXECUTE_START: Final[str] = "code_runner.execute.start"
CODE_RUNNER_EXECUTE_SUCCESS: Final[str] = "code_runner.execute.success"
CODE_RUNNER_EXECUTE_FAILED: Final[str] = "code_runner.execute.failed"
CODE_RUNNER_INVALID_LANGUAGE: Final[str] = "code_runner.invalid_language"
16 changes: 16 additions & 0 deletions src/ai_company/observability/events/docker.py
@@ -0,0 +1,16 @@
"""Docker sandbox event constants."""

from typing import Final

DOCKER_EXECUTE_START: Final[str] = "docker.execute.start"
DOCKER_EXECUTE_SUCCESS: Final[str] = "docker.execute.success"
DOCKER_EXECUTE_FAILED: Final[str] = "docker.execute.failed"
DOCKER_EXECUTE_TIMEOUT: Final[str] = "docker.execute.timeout"
DOCKER_CONTAINER_CREATED: Final[str] = "docker.container.created"
DOCKER_CONTAINER_STOPPED: Final[str] = "docker.container.stopped"
DOCKER_CONTAINER_REMOVED: Final[str] = "docker.container.removed"
DOCKER_CONTAINER_STOP_FAILED: Final[str] = "docker.container.stop_failed"
DOCKER_CONTAINER_REMOVE_FAILED: Final[str] = "docker.container.remove_failed"
DOCKER_CLEANUP: Final[str] = "docker.cleanup"
DOCKER_HEALTH_CHECK: Final[str] = "docker.health_check"
DOCKER_DAEMON_UNAVAILABLE: Final[str] = "docker.daemon.unavailable"
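
The event names in this file suggest a lifecycle in which a timeout and a cleanup failure are reported separately from the execution result. The sketch below is one way such a flow might look using only stdlib `asyncio`. `FakeContainer`, `run_with_timeout`, and the `log` helper are hypothetical and do not reflect the aiodocker API or the actual `DockerSandbox` code; the four event strings are copied from the constants in this file.

```python
import asyncio
from typing import Any, Optional

# Values copied from events/docker.py.
DOCKER_EXECUTE_SUCCESS = "docker.execute.success"
DOCKER_EXECUTE_TIMEOUT = "docker.execute.timeout"
DOCKER_CONTAINER_REMOVED = "docker.container.removed"
DOCKER_CONTAINER_REMOVE_FAILED = "docker.container.remove_failed"

# Stand-in for a structured logger: a list of (event, kwargs) pairs.
events: list[tuple[str, dict[str, Any]]] = []


def log(event: str, **kwargs: Any) -> None:
    events.append((event, kwargs))


class FakeContainer:
    """Hypothetical container handle used only for this sketch."""

    def __init__(self, run_seconds: float) -> None:
        self.run_seconds = run_seconds

    async def wait(self) -> int:
        await asyncio.sleep(self.run_seconds)
        return 0

    async def remove(self) -> None:
        return None


async def run_with_timeout(container: FakeContainer, timeout: float) -> Optional[int]:
    try:
        exit_code = await asyncio.wait_for(container.wait(), timeout=timeout)
        log(DOCKER_EXECUTE_SUCCESS, exit_code=exit_code)
        return exit_code
    except asyncio.TimeoutError:
        log(DOCKER_EXECUTE_TIMEOUT, timeout=timeout)
        return None
    finally:
        # Cleanup failures are logged rather than raised, so they never
        # mask the execution result or the timeout.
        try:
            await container.remove()
            log(DOCKER_CONTAINER_REMOVED)
        except Exception as exc:
            log(DOCKER_CONTAINER_REMOVE_FAILED, error=str(exc))
```

Running cleanup in `finally` keeps the container from outliving the call on both the success and the timeout path.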
27 changes: 27 additions & 0 deletions src/ai_company/observability/events/mcp.py
@@ -0,0 +1,27 @@
"""MCP bridge event constants."""

from typing import Final

MCP_CLIENT_CONNECTING: Final[str] = "mcp.client.connecting"
MCP_CLIENT_CONNECTED: Final[str] = "mcp.client.connected"
MCP_CLIENT_DISCONNECTED: Final[str] = "mcp.client.disconnected"
MCP_CLIENT_RECONNECTING: Final[str] = "mcp.client.reconnecting"
MCP_CLIENT_CONNECTION_FAILED: Final[str] = "mcp.client.connection_failed"
MCP_DISCOVERY_START: Final[str] = "mcp.discovery.start"
MCP_DISCOVERY_COMPLETE: Final[str] = "mcp.discovery.complete"
MCP_DISCOVERY_FAILED: Final[str] = "mcp.discovery.failed"
MCP_DISCOVERY_FILTERED: Final[str] = "mcp.discovery.filtered"
MCP_INVOKE_START: Final[str] = "mcp.invoke.start"
MCP_INVOKE_SUCCESS: Final[str] = "mcp.invoke.success"
MCP_INVOKE_FAILED: Final[str] = "mcp.invoke.failed"
MCP_INVOKE_TIMEOUT: Final[str] = "mcp.invoke.timeout"
MCP_RESULT_MAPPED: Final[str] = "mcp.result.mapped"
MCP_RESULT_ATTACHMENT: Final[str] = "mcp.result.attachment"
MCP_CACHE_HIT: Final[str] = "mcp.cache.hit"
MCP_CACHE_MISS: Final[str] = "mcp.cache.miss"
MCP_CACHE_EVICT: Final[str] = "mcp.cache.evict"
MCP_CONFIG_VALIDATION_FAILED: Final[str] = "mcp.config.validation_failed"
MCP_CLIENT_DISCONNECT_FAILED: Final[str] = "mcp.client.disconnect_failed"
MCP_FACTORY_START: Final[str] = "mcp.factory.start"
MCP_FACTORY_COMPLETE: Final[str] = "mcp.factory.complete"
MCP_FACTORY_SERVER_SKIPPED: Final[str] = "mcp.factory.server_skipped"
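
The `MCP_CACHE_HIT`, `MCP_CACHE_MISS`, and `MCP_CACHE_EVICT` constants, together with the review note that `cache.py` is a TTL + LRU result cache, point at a familiar structure. The sketch below shows that general technique with `collections.OrderedDict`. The class `TtlLruCache`, its method names, and the injectable `clock` are assumptions for illustration and are not the project's `cache.py`.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional

# Values copied from events/mcp.py.
MCP_CACHE_HIT = "mcp.cache.hit"
MCP_CACHE_MISS = "mcp.cache.miss"
MCP_CACHE_EVICT = "mcp.cache.evict"


class TtlLruCache:
    """Hypothetical TTL + LRU cache; not the project's implementation."""

    def __init__(
        self,
        max_size: int,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, value); insertion order tracks recency.
        self._entries: OrderedDict = OrderedDict()
        self.events: list = []  # stand-in for structured log calls

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            self.events.append(MCP_CACHE_MISS)
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped and reported as a miss.
            del self._entries[key]
            self.events.append(MCP_CACHE_MISS)
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        self.events.append(MCP_CACHE_HIT)
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used
            self.events.append(MCP_CACHE_EVICT)
```

Passing the clock in as a parameter makes expiry testable without sleeping.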
11 changes: 11 additions & 0 deletions src/ai_company/tools/__init__.py
@@ -1,6 +1,7 @@
"""Tool system — base abstraction, registry, invoker, permissions, and errors."""

from .base import BaseTool, ToolExecutionResult
+from .code_runner import CodeRunnerTool
from .errors import (
ToolError,
ToolExecutionError,
@@ -30,19 +31,28 @@
from .permissions import ToolPermissionChecker
from .registry import ToolRegistry
from .sandbox import (
+DockerSandbox,
+DockerSandboxConfig,
SandboxBackend,
SandboxError,
+SandboxingConfig,
SandboxResult,
SandboxStartError,
SandboxTimeoutError,
SubprocessSandbox,
SubprocessSandboxConfig,
)

+# MCP types are re-exported from ai_company.tools.mcp to avoid
+# circular imports (config.schema -> tools.mcp -> tools.base).
+
__all__ = [
"BaseFileSystemTool",
"BaseTool",
+"CodeRunnerTool",
"DeleteFileTool",
+"DockerSandbox",
+"DockerSandboxConfig",
"EchoTool",
"EditFileTool",
"GitBranchTool",
Expand All @@ -59,6 +69,7 @@
"SandboxResult",
"SandboxStartError",
"SandboxTimeoutError",
"SandboxingConfig",
"SubprocessSandbox",
"SubprocessSandboxConfig",
"ToolError",