diff --git a/.github/workflows/dependency-review.yml b/.github/workflows/dependency-review.yml index 4ababae0e8..6dfc60af0a 100644 --- a/.github/workflows/dependency-review.yml +++ b/.github/workflows/dependency-review.yml @@ -24,8 +24,11 @@ jobs: fail-on-severity: high # LicenseRef-scancode-free-unknown: aiosqlite 0.21.0 — MIT per classifiers, scancode misdetects # Python-2.0.1: editorconfig 0.17.1 (via jsbeautifier via litestar[standard]) + # MIT-0: cffi 2.0.0 — permissive (MIT variant, no attribution required) + # LicenseRef-scancode-free-unknown: aiosqlite 0.21.0, aiodocker 0.26.0, + # pycparser 3.0, sse-starlette 3.3.2 — MIT per classifiers, scancode misdetects allow-licenses: >- - MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, + MIT, MIT-0, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, MPL-2.0, PSF-2.0, Unlicense, 0BSD, CC0-1.0, Python-2.0, Python-2.0.1, LicenseRef-scancode-free-unknown diff --git a/CLAUDE.md b/CLAUDE.md index c69754574c..b347b8d1e3 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -56,7 +56,7 @@ src/ai_company/ providers/ # LLM provider abstraction (LiteLLM adapter) security/ # SecOps agent, approval gates, audit templates/ # Pre-built company templates, personality presets, and builder - tools/ # Tool registry, built-in tools (file_system/, git, sandbox/), MCP integration, role-based access + tools/ # Tool registry, built-in tools (file_system/, git, sandbox/, code_runner), MCP bridge (mcp/), role-based access ``` ## Shell Usage @@ -83,7 +83,7 @@ src/ai_company/ - **Every module** with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)` - **Never** use `import logging` / `logging.getLogger()` / `print()` in application code - **Variable name**: always `logger` (not `_logger`, not `log`) -- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. 
`PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT` +- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT` - **Structured kwargs**: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)` - **All error paths** must log at WARNING or ERROR with context before raising - **All state transitions** must log at INFO diff --git a/DESIGN_SPEC.md b/DESIGN_SPEC.md index 511d918d78..0f96ac8104 100644 --- a/DESIGN_SPEC.md +++ b/DESIGN_SPEC.md @@ -81,7 +81,7 @@ The MVP validates the core hypothesis: **a single agent can complete a real task > **Implementation snapshot (2026-03-09):** > - **Done:** M0–M6 (tooling, config/core, providers, single-agent engine, multi-agent orchestration, API/CLI surface). Memory layer backend selected ([ADR-001](docs/decisions/ADR-001-memory-layer.md)). Persistence backend (§7.6) completed. 
Memory retrieval pipeline (#41: ranking, token-budget formatting, context injection) complete. Budget enforcement complete (BudgetEnforcer + configurable cost tiers + quota/subscription tracking). CFO cost optimization complete (CostOptimizer: anomaly detection, efficiency analysis, downgrade recommendations, routing optimization, approval decisions; ReportGenerator: multi-dimensional spending reports). Shared org memory (#125: HybridPromptRetrievalBackend, OrgFactStore, access control, factory) complete. Memory consolidation/archival (#48: ConsolidationService, SimpleConsolidationStrategy, RetentionEnforcer, ArchivalStore protocol) complete. -> - **Not started (mostly placeholders):** M7 security + approval system. +> - **In progress:** M7 — Docker sandbox (#50), MCP bridge (#53), code runner implemented. Security + approval system not started. ### 1.5 Configuration Philosophy @@ -2101,7 +2101,7 @@ Tool execution requires safety boundaries proportional to the risk of each tool | Backend | Isolation | Latency | Dependencies | Status | |---------|-----------|---------|--------------|--------| | `SubprocessSandbox` | Process-level: env filtering (allowlist + denylist), restricted PATH (configurable via `extra_safe_path_prefixes`), workspace-scoped cwd, timeout + process-group kill, library injection var blocking, explicit transport cleanup on Windows | ~ms | None | **Implemented** | -| `DockerSandbox` | Container-level: ephemeral container, mounted workspace, no network, resource limits (CPU/memory/time) | ~1-2s cold start | Docker | Planned | +| `DockerSandbox` | Container-level: ephemeral container, mounted workspace, no network, resource limits (CPU/memory/time) | ~1-2s cold start | Docker | **Implemented** | | `K8sSandbox` | Pod-level: per-agent containers, namespace isolation, resource quotas, network policies | ~2-5s | Kubernetes | Future | #### Default Layered Configuration @@ -2130,7 +2130,7 @@ sandboxing: memory_limit: "512m" cpu_limit: "1.0" 
timeout_seconds: 120 - mount_mode: "rw" # rw for workspace dir, nothing else mounted + mount_mode: "ro" # read-only by default; workspace mounted separately auto_remove: true # ephemeral — container removed after execution k8s: # future — per-agent pod isolation namespace: "ai-company-agents" @@ -2151,7 +2151,7 @@ sandboxing: > **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D17, D18):** > -> - **D17 — MCP SDK:** Official `mcp` Python SDK, pinned `>=1.25,<2`. Thin `MCPBridgeTool` adapter layer isolates the rest of the codebase from SDK API changes. Support **stdio** (local/dev) and **Streamable HTTP** (remote/production) transports. Skip deprecated SSE. v2 migration planned — pin range prevents accidental breaking upgrade. +> - **D17 — MCP SDK:** Official `mcp` Python SDK, pinned `==1.26.0`. Thin `MCPBridgeTool` adapter layer isolates the rest of the codebase from SDK API changes. Support **stdio** (local/dev) and **Streamable HTTP** (remote/production) transports. Skip deprecated SSE. v2 migration planned — exact pin prevents accidental breaking upgrade. > - **D18 — MCP Result Mapping:** Adapter in `MCPBridgeTool` keeps `ToolResult` as-is. Mapping: text blocks → concatenate to `content: str`; image/audio → `[image: {mimeType}]` placeholder + base64 in `metadata["attachments"]`; `structuredContent` → `metadata["structured_content"]`; `isError` → `is_error` (1:1). Future: extend `ToolResult` with optional `attachments` when multi-modal LLM tool results are needed. ### 11.1.4 Action Type System @@ -2728,7 +2728,8 @@ Circular inheritance is detected via chain tracking and raises `TemplateInherita | **Web UI** | Vue 3 + Vite | Modern, fast, good ecosystem. Simpler than React for dashboards | | **Real-time** | WebSocket (Litestar channels plugin) | Built-in pub/sub broadcasting, per-channel history, backpressure management. 
Real-time agent activity, task updates, chat feed | | **Containerization** | Docker + Docker Compose | Isolated code execution, reproducible environments | -| **Tool Integration** | MCP (Model Context Protocol) | Industry standard for LLM-to-tool integration | +| **Docker API** | aiodocker | Async-native Docker API client for `DockerSandbox` backend | +| **Tool Integration** | MCP SDK (`mcp`) | Industry standard for LLM-to-tool integration | | **Agent Comms** | A2A Protocol compatible | Future-proof inter-agent communication | | **Config Format** | YAML + Pydantic validation | Human-readable config with strict validation | | **CLI** | TBD (future, if needed) | Thin wrapper around the REST API for terminal use. May not be needed — interactive Scalar docs at `/docs/api` and `curl`/`httpie` may suffice | @@ -2960,7 +2961,10 @@ ai-company/ │ │ │ ├── task_routing.py # TASK_ROUTING_* constants │ │ │ ├── template.py # TEMPLATE_* constants │ │ │ ├── tool.py # TOOL_* constants -│ │ │ └── workspace.py # WORKSPACE_* constants +│ │ │ ├── workspace.py # WORKSPACE_* constants +│ │ │ ├── code_runner.py # CODE_RUNNER_* constants +│ │ │ ├── docker.py # DOCKER_* constants +│ │ │ └── mcp.py # MCP_* constants │ │ ├── processors.py # Log processors │ │ ├── setup.py # Logging setup │ │ └── sinks.py # Log output backends @@ -2996,13 +3000,6 @@ ai-company/ │ │ ├── examples/ # Example tool implementations │ │ │ ├── __init__.py # Package exports │ │ │ └── echo.py # Echo tool (for testing) -│ │ ├── sandbox/ # Sandboxing backends -│ │ │ ├── __init__.py # Package exports -│ │ │ ├── config.py # SubprocessSandboxConfig model -│ │ │ ├── errors.py # SandboxError hierarchy -│ │ │ ├── protocol.py # SandboxBackend protocol -│ │ │ ├── result.py # SandboxResult model -│ │ │ └── subprocess_sandbox.py # SubprocessSandbox (default) │ │ ├── file_system/ # Built-in file system tools │ │ │ ├── __init__.py # Package exports │ │ │ ├── _base_fs_tool.py # BaseFileSystemTool ABC @@ -3015,9 +3012,28 @@ ai-company/ 
│ │ ├── _git_base.py # Base class for git tools (workspace, subprocess, sandbox integration) │ │ ├── _process_cleanup.py # Subprocess transport cleanup utility (Windows ResourceWarning prevention) │ │ ├── git_tools.py # Git operations — 6 built-in tools (sandbox-aware) -│ │ ├── code_runner.py # Code execution (M7) +│ │ ├── code_runner.py # Code execution tool │ │ ├── web_tools.py # HTTP, search (M7) -│ │ └── mcp_bridge.py # MCP server integration (M7) +│ │ ├── sandbox/ # Sandbox backends subpackage +│ │ │ ├── __init__.py # Package exports +│ │ │ ├── config.py # Subprocess sandbox configuration +│ │ │ ├── docker_config.py # Docker sandbox configuration +│ │ │ ├── docker_sandbox.py # DockerSandbox backend (aiodocker) +│ │ │ ├── errors.py # Sandbox error hierarchy +│ │ │ ├── protocol.py # SandboxBackend protocol +│ │ │ ├── result.py # SandboxResult model +│ │ │ ├── sandboxing_config.py # Top-level sandboxing config +│ │ │ └── subprocess_sandbox.py # SubprocessSandbox backend +│ │ └── mcp/ # MCP bridge subpackage +│ │ ├── __init__.py # Package exports +│ │ ├── bridge_tool.py # MCPBridgeTool (BaseTool integration) +│ │ ├── cache.py # MCP result cache (TTL + LRU) +│ │ ├── client.py # MCP client wrapper +│ │ ├── config.py # MCP server/bridge config models +│ │ ├── errors.py # MCP error hierarchy +│ │ ├── factory.py # MCPToolFactory (parallel connect) +│ │ ├── models.py # MCP domain models +│ │ └── result_mapper.py # MCP result → ToolExecutionResult mapping │ ├── security/ # Security & approval (M7, stubs only) │ │ ├── approval.py # Approval workflow gates (M7) — domain model is in core/approval.py │ │ ├── secops_agent.py # Security operations agent (M7) diff --git a/README.md b/README.md index 35b49d88f8..35cc74c4b1 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents - **Company Config + Core Models** - Strong Pydantic validation, immutable config models, runtime state models - 
**Provider Layer** - LiteLLM-based provider abstraction with routing, retry, and rate limiting - **Budget Tracking** - Cost records, summaries, and coordination analytics models -- **Tool System** - File system tools, git tools, sandbox abstraction, permission gating +- **Tool System** - File system tools, git tools, sandbox abstraction (subprocess + Docker), code runner, MCP bridge, permission gating - **Single-Agent Engine (M3)** - ReAct/Plan-Execute loops, fail-and-reassign recovery, graceful shutdown - **Multi-Agent Core (M4)** - Message bus, delegation with loop prevention, conflict resolution, meeting protocols - **Task Intelligence (M4)** - Task decomposition, routing, assignment strategies, workspace isolation via git worktrees @@ -38,7 +38,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents ## Status -**M7: Security & HR** next (M0–M6 all done). See [DESIGN_SPEC.md](DESIGN_SPEC.md) for the full high-level specification. +**M7: Security & HR** in progress (M0–M6 all done). See [DESIGN_SPEC.md](DESIGN_SPEC.md) for the full high-level specification. 
## Tech Stack @@ -47,7 +47,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents - **LiteLLM** for multi-provider LLM abstraction - **structlog** for structured logging and observability - **Mem0** for agent memory (initial backend; custom stack future — see [ADR-001](docs/decisions/ADR-001-memory-layer.md)) -- **MCP** for tool integration (planned) +- **MCP** for tool integration - **Vue 3** for web dashboard (planned) - **SQLite** (aiosqlite) → PostgreSQL for operational data persistence @@ -56,6 +56,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents - **Python 3.14+** - **uv** — package manager ([install](https://docs.astral.sh/uv/getting-started/installation/)) - **Git 2.x+** — required at runtime for built-in git tools (subprocess-based, not a Python binding) +- **Docker** (optional) — required for code execution sandbox and Docker-backed tool isolation. Install [Docker Desktop](https://docs.docker.com/get-docker/) or Docker Engine. File system and git tools work without Docker via subprocess isolation. 
## Getting Started diff --git a/docker/sandbox/Dockerfile b/docker/sandbox/Dockerfile new file mode 100644 index 0000000000..fead8ae4e1 --- /dev/null +++ b/docker/sandbox/Dockerfile @@ -0,0 +1,19 @@ +FROM node:22-slim AS node-base + +FROM python:3.14-slim + +COPY --from=node-base /usr/local/bin/node /usr/local/bin/node +COPY --from=node-base /usr/local/lib/node_modules /usr/local/lib/node_modules +RUN ln -s /usr/local/lib/node_modules/npm/bin/npm-cli.js /usr/local/bin/npm + +RUN apt-get update && apt-get install -y --no-install-recommends git \ + && apt-get clean && rm -rf /var/lib/apt/lists/* + +RUN mkdir -p /workspace \ + && useradd --uid 10001 --no-create-home --shell /usr/sbin/nologin sandbox + +WORKDIR /workspace + +USER sandbox + +CMD ["bash"] diff --git a/pyproject.toml b/pyproject.toml index 9be06ae362..f9d86fad8c 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -13,11 +13,13 @@ classifiers = [ "Typing :: Typed", ] dependencies = [ + "aiodocker==0.26.0", "aiosqlite==0.21.0", "jinja2==3.1.6", "jsonschema==4.26.0", "litellm==1.82.0", "litestar[standard,structlog,pydantic,brotli,prometheus]==2.21.1", + "mcp==1.26.0", "pydantic==2.12.5", "pyyaml==6.0.3", "structlog==25.5.0", @@ -157,10 +159,18 @@ ignore_missing_imports = true module = "jsonschema.*" ignore_missing_imports = true +[[tool.mypy.overrides]] +module = "aiodocker.*" +ignore_missing_imports = true + [[tool.mypy.overrides]] module = "aiosqlite.*" ignore_missing_imports = true +[[tool.mypy.overrides]] +module = "mcp.*" +ignore_missing_imports = true + [[tool.mypy.overrides]] module = "litestar.*" ignore_missing_imports = true diff --git a/src/ai_company/config/defaults.py b/src/ai_company/config/defaults.py index 1639afa699..25e59d55c5 100644 --- a/src/ai_company/config/defaults.py +++ b/src/ai_company/config/defaults.py @@ -35,4 +35,6 @@ def default_config_dict() -> dict[str, Any]: "cost_tiers": {}, "org_memory": {}, "api": {}, + "sandboxing": {}, + "mcp": {}, } diff --git 
a/src/ai_company/config/schema.py b/src/ai_company/config/schema.py index c7f88fb17f..6d8c6baaed 100644 --- a/src/ai_company/config/schema.py +++ b/src/ai_company/config/schema.py @@ -26,6 +26,8 @@ from ai_company.observability.config import LogConfig # noqa: TC001 from ai_company.observability.events.config import CONFIG_VALIDATION_FAILED from ai_company.persistence.config import PersistenceConfig +from ai_company.tools.mcp.config import MCPConfig +from ai_company.tools.sandbox.sandboxing_config import SandboxingConfig logger = get_logger(__name__) @@ -487,6 +489,8 @@ class RootConfig(BaseModel): cost_tiers: Cost tier definitions. org_memory: Organizational memory configuration. api: API server configuration. + sandboxing: Sandboxing backend configuration. + mcp: MCP bridge configuration. """ model_config = ConfigDict(frozen=True) @@ -574,6 +578,14 @@ class RootConfig(BaseModel): default_factory=ApiConfig, description="API server configuration", ) + sandboxing: SandboxingConfig = Field( + default_factory=SandboxingConfig, + description="Sandboxing backend configuration", + ) + mcp: MCPConfig = Field( + default_factory=MCPConfig, + description="MCP bridge configuration", + ) @model_validator(mode="after") def _validate_unique_agent_names(self) -> Self: diff --git a/src/ai_company/observability/events/code_runner.py b/src/ai_company/observability/events/code_runner.py new file mode 100644 index 0000000000..7b0a4cb329 --- /dev/null +++ b/src/ai_company/observability/events/code_runner.py @@ -0,0 +1,8 @@ +"""Code runner tool event constants.""" + +from typing import Final + +CODE_RUNNER_EXECUTE_START: Final[str] = "code_runner.execute.start" +CODE_RUNNER_EXECUTE_SUCCESS: Final[str] = "code_runner.execute.success" +CODE_RUNNER_EXECUTE_FAILED: Final[str] = "code_runner.execute.failed" +CODE_RUNNER_INVALID_LANGUAGE: Final[str] = "code_runner.invalid_language" diff --git a/src/ai_company/observability/events/docker.py b/src/ai_company/observability/events/docker.py new 
file mode 100644 index 0000000000..1ceabdcf7b --- /dev/null +++ b/src/ai_company/observability/events/docker.py @@ -0,0 +1,16 @@ +"""Docker sandbox event constants.""" + +from typing import Final + +DOCKER_EXECUTE_START: Final[str] = "docker.execute.start" +DOCKER_EXECUTE_SUCCESS: Final[str] = "docker.execute.success" +DOCKER_EXECUTE_FAILED: Final[str] = "docker.execute.failed" +DOCKER_EXECUTE_TIMEOUT: Final[str] = "docker.execute.timeout" +DOCKER_CONTAINER_CREATED: Final[str] = "docker.container.created" +DOCKER_CONTAINER_STOPPED: Final[str] = "docker.container.stopped" +DOCKER_CONTAINER_REMOVED: Final[str] = "docker.container.removed" +DOCKER_CONTAINER_STOP_FAILED: Final[str] = "docker.container.stop_failed" +DOCKER_CONTAINER_REMOVE_FAILED: Final[str] = "docker.container.remove_failed" +DOCKER_CLEANUP: Final[str] = "docker.cleanup" +DOCKER_HEALTH_CHECK: Final[str] = "docker.health_check" +DOCKER_DAEMON_UNAVAILABLE: Final[str] = "docker.daemon.unavailable" diff --git a/src/ai_company/observability/events/mcp.py b/src/ai_company/observability/events/mcp.py new file mode 100644 index 0000000000..c70b88f49f --- /dev/null +++ b/src/ai_company/observability/events/mcp.py @@ -0,0 +1,31 @@ +"""MCP bridge event constants.""" + +from typing import Final + +MCP_CLIENT_CONNECTING: Final[str] = "mcp.client.connecting" +MCP_CLIENT_CONNECTED: Final[str] = "mcp.client.connected" +MCP_CLIENT_DISCONNECTED: Final[str] = "mcp.client.disconnected" +MCP_CLIENT_RECONNECTING: Final[str] = "mcp.client.reconnecting" +MCP_CLIENT_CONNECTION_FAILED: Final[str] = "mcp.client.connection_failed" +MCP_DISCOVERY_START: Final[str] = "mcp.discovery.start" +MCP_DISCOVERY_COMPLETE: Final[str] = "mcp.discovery.complete" +MCP_DISCOVERY_FAILED: Final[str] = "mcp.discovery.failed" +MCP_DISCOVERY_FILTERED: Final[str] = "mcp.discovery.filtered" +MCP_INVOKE_START: Final[str] = "mcp.invoke.start" +MCP_INVOKE_SUCCESS: Final[str] = "mcp.invoke.success" +MCP_INVOKE_FAILED: Final[str] = "mcp.invoke.failed" 
+MCP_INVOKE_TIMEOUT: Final[str] = "mcp.invoke.timeout" +MCP_RESULT_MAPPED: Final[str] = "mcp.result.mapped" +MCP_RESULT_UNKNOWN_BLOCK: Final[str] = "mcp.result.unknown_block" +MCP_FACTORY_REUSE_REJECTED: Final[str] = "mcp.factory.reuse_rejected" +MCP_RESULT_ATTACHMENT: Final[str] = "mcp.result.attachment" +MCP_CACHE_HIT: Final[str] = "mcp.cache.hit" +MCP_CACHE_MISS: Final[str] = "mcp.cache.miss" +MCP_CACHE_EVICT: Final[str] = "mcp.cache.evict" +MCP_CONFIG_VALIDATION_FAILED: Final[str] = "mcp.config.validation_failed" +MCP_CLIENT_DISCONNECT_FAILED: Final[str] = "mcp.client.disconnect_failed" +MCP_FACTORY_START: Final[str] = "mcp.factory.start" +MCP_FACTORY_COMPLETE: Final[str] = "mcp.factory.complete" +MCP_FACTORY_SERVER_SKIPPED: Final[str] = "mcp.factory.server_skipped" +MCP_CACHE_STORE_FAILED: Final[str] = "mcp.cache.store_failed" +MCP_FACTORY_CLEANUP: Final[str] = "mcp.factory.cleanup" diff --git a/src/ai_company/tools/__init__.py b/src/ai_company/tools/__init__.py index b71e9bd9c4..08255eb4b3 100644 --- a/src/ai_company/tools/__init__.py +++ b/src/ai_company/tools/__init__.py @@ -1,6 +1,7 @@ """Tool system — base abstraction, registry, invoker, permissions, and errors.""" from .base import BaseTool, ToolExecutionResult +from .code_runner import CodeRunnerTool from .errors import ( ToolError, ToolExecutionError, @@ -30,8 +31,11 @@ from .permissions import ToolPermissionChecker from .registry import ToolRegistry from .sandbox import ( + DockerSandbox, + DockerSandboxConfig, SandboxBackend, SandboxError, + SandboxingConfig, SandboxResult, SandboxStartError, SandboxTimeoutError, @@ -39,10 +43,16 @@ SubprocessSandboxConfig, ) +# MCP types are exported from ai_company.tools.mcp, not re-exported here, +# to avoid circular imports (config.schema -> tools.mcp -> tools.base). 
+ __all__ = [ "BaseFileSystemTool", "BaseTool", + "CodeRunnerTool", "DeleteFileTool", + "DockerSandbox", + "DockerSandboxConfig", "EchoTool", "EditFileTool", "GitBranchTool", @@ -59,6 +69,7 @@ "SandboxResult", "SandboxStartError", "SandboxTimeoutError", + "SandboxingConfig", "SubprocessSandbox", "SubprocessSandboxConfig", "ToolError", diff --git a/src/ai_company/tools/code_runner.py b/src/ai_company/tools/code_runner.py new file mode 100644 index 0000000000..a78deec686 --- /dev/null +++ b/src/ai_company/tools/code_runner.py @@ -0,0 +1,164 @@ +"""Code runner tool — executes code snippets in a sandboxed environment. + +Supports Python, JavaScript, and Bash via configurable sandbox backends. +""" + +from typing import TYPE_CHECKING, Any, Final + +from ai_company.core.enums import ToolCategory +from ai_company.observability import get_logger +from ai_company.observability.events.code_runner import ( + CODE_RUNNER_EXECUTE_FAILED, + CODE_RUNNER_EXECUTE_START, + CODE_RUNNER_EXECUTE_SUCCESS, + CODE_RUNNER_INVALID_LANGUAGE, +) +from ai_company.tools.base import BaseTool, ToolExecutionResult +from ai_company.tools.sandbox.errors import SandboxError + +if TYPE_CHECKING: + from ai_company.tools.sandbox.protocol import SandboxBackend + +logger = get_logger(__name__) + +_LANGUAGE_COMMANDS: Final[dict[str, tuple[str, str]]] = { + "python": ("python3", "-c"), + "javascript": ("node", "-e"), + "bash": ("bash", "-c"), +} + +_PARAMETERS_SCHEMA: Final[dict[str, Any]] = { + "type": "object", + "properties": { + "code": { + "type": "string", + "description": "Source code to execute", + }, + "language": { + "type": "string", + "enum": ["python", "javascript", "bash"], + "description": "Programming language of the code", + }, + "timeout": { + "type": "number", + "description": "Optional timeout in seconds", + "minimum": 0, + "maximum": 600, + }, + }, + "required": ["code", "language"], + "additionalProperties": False, +} + + +class CodeRunnerTool(BaseTool): + """Executes code snippets in 
a sandboxed environment. + + Supports Python, JavaScript, and Bash. Delegates execution to + a ``SandboxBackend`` for isolation and resource control. + """ + + def __init__(self, *, sandbox: SandboxBackend) -> None: + """Initialize the code runner tool. + + Args: + sandbox: Sandbox backend for isolated code execution. + """ + super().__init__( + name="code_runner", + description=( + "Executes code snippets in Python, JavaScript, " + "or Bash within a sandboxed environment" + ), + category=ToolCategory.CODE_EXECUTION, + parameters_schema=dict(_PARAMETERS_SCHEMA), + ) + self._sandbox = sandbox + + async def execute( + self, + *, + arguments: dict[str, Any], + ) -> ToolExecutionResult: + """Execute a code snippet in the sandbox. + + Args: + arguments: Must contain ``code`` (str), ``language`` (str), + and optionally ``timeout`` (float). + + Returns: + A ``ToolExecutionResult`` with execution output. + """ + code: str = arguments["code"] + language: str = arguments["language"] + timeout: float | None = arguments.get("timeout") + + if language not in _LANGUAGE_COMMANDS: + logger.warning( + CODE_RUNNER_INVALID_LANGUAGE, + language=language, + ) + return ToolExecutionResult( + content=f"Unsupported language: {language!r}. 
" + f"Supported: {sorted(_LANGUAGE_COMMANDS)}", + is_error=True, + ) + + command, flag = _LANGUAGE_COMMANDS[language] + + logger.debug( + CODE_RUNNER_EXECUTE_START, + language=language, + timeout=timeout, + code_length=len(code), + ) + + try: + result = await self._sandbox.execute( + command=command, + args=(flag, code), + timeout=timeout, + ) + except SandboxError as exc: + logger.warning( + CODE_RUNNER_EXECUTE_FAILED, + language=language, + error=str(exc), + ) + return ToolExecutionResult( + content=f"Sandbox error: {exc}", + is_error=True, + metadata={"language": language}, + ) + + if result.success: + logger.debug( + CODE_RUNNER_EXECUTE_SUCCESS, + language=language, + ) + return ToolExecutionResult( + content=result.stdout or "(no output)", + metadata={ + "returncode": result.returncode, + "language": language, + }, + ) + + logger.warning( + CODE_RUNNER_EXECUTE_FAILED, + language=language, + returncode=result.returncode, + timed_out=result.timed_out, + ) + error_msg = result.stderr or result.stdout or "Execution failed" + if result.timed_out: + error_msg = f"Execution timed out. {error_msg}" + return ToolExecutionResult( + content=error_msg, + is_error=True, + metadata={ + "returncode": result.returncode, + "timed_out": result.timed_out, + "language": language, + }, + ) diff --git a/src/ai_company/tools/mcp/__init__.py b/src/ai_company/tools/mcp/__init__.py new file mode 100644 index 0000000000..ce4917d3a3 --- /dev/null +++ b/src/ai_company/tools/mcp/__init__.py @@ -0,0 +1,61 @@ +"""MCP bridge — connects external MCP servers as internal tools. + +Re-exports from submodules use lazy ``__getattr__`` to avoid circular +imports. Config models and errors are imported eagerly since they have +no dependency on the tool base classes. 
+""" + +from .config import MCPConfig, MCPServerConfig +from .errors import ( + MCPConnectionError, + MCPDiscoveryError, + MCPError, + MCPInvocationError, + MCPTimeoutError, +) +from .models import MCPRawResult, MCPToolInfo + +__all__ = [ + "MCPBridgeTool", + "MCPClient", + "MCPConfig", + "MCPConnectionError", + "MCPDiscoveryError", + "MCPError", + "MCPInvocationError", + "MCPRawResult", + "MCPResultCache", + "MCPServerConfig", + "MCPTimeoutError", + "MCPToolFactory", + "MCPToolInfo", + "map_call_tool_result", +] + +# Lazy imports for types that depend on tools.base / MCP SDK +# to break the circular import chain. +_LAZY_IMPORTS: dict[str, tuple[str, str]] = { + "MCPBridgeTool": (".bridge_tool", "MCPBridgeTool"), + "MCPClient": (".client", "MCPClient"), + "MCPResultCache": (".cache", "MCPResultCache"), + "MCPToolFactory": (".factory", "MCPToolFactory"), + "map_call_tool_result": ( + ".result_mapper", + "map_call_tool_result", + ), +} + + +def __getattr__(name: str) -> object: + """Lazily import heavy modules on first access.""" + if name in _LAZY_IMPORTS: + module_path, attr_name = _LAZY_IMPORTS[name] + import importlib # noqa: PLC0415 + + module = importlib.import_module(module_path, __package__) + value = getattr(module, attr_name) + # Cache on the module dict to avoid repeated lookups + globals()[name] = value + return value + msg = f"module {__name__!r} has no attribute {name!r}" + raise AttributeError(msg) diff --git a/src/ai_company/tools/mcp/bridge_tool.py b/src/ai_company/tools/mcp/bridge_tool.py new file mode 100644 index 0000000000..d4bb477297 --- /dev/null +++ b/src/ai_company/tools/mcp/bridge_tool.py @@ -0,0 +1,188 @@ +"""MCP bridge tool — wraps an MCP server tool as a ``BaseTool``. + +Each ``MCPBridgeTool`` instance represents a single tool discovered +from an MCP server, bridging MCP protocol calls into the internal +tool system. 
+""" + +from typing import TYPE_CHECKING, Any + +from ai_company.core.enums import ToolCategory +from ai_company.observability import get_logger +from ai_company.observability.events.mcp import ( + MCP_CACHE_HIT, + MCP_CACHE_MISS, + MCP_CACHE_STORE_FAILED, + MCP_INVOKE_FAILED, + MCP_INVOKE_START, +) +from ai_company.tools.base import BaseTool, ToolExecutionResult +from ai_company.tools.mcp.errors import MCPError +from ai_company.tools.mcp.result_mapper import map_call_tool_result + +if TYPE_CHECKING: + from ai_company.tools.mcp.cache import MCPResultCache + from ai_company.tools.mcp.client import MCPClient + from ai_company.tools.mcp.models import MCPToolInfo + +logger = get_logger(__name__) + + +class MCPBridgeTool(BaseTool): + """Bridge between an MCP server tool and the internal tool system. + + Constructs a ``BaseTool`` whose ``execute`` delegates to an MCP + server via ``MCPClient``. An optional ``MCPResultCache`` avoids + redundant remote calls for identical invocations. + + Args: + tool_info: Discovered MCP tool metadata. + client: Connected MCP client for the server. + cache: Optional result cache. + """ + + def __init__( + self, + *, + tool_info: MCPToolInfo, + client: MCPClient, + cache: MCPResultCache | None = None, + ) -> None: + super().__init__( + name=f"mcp_{tool_info.server_name}_{tool_info.name}", + description=tool_info.description, + parameters_schema=tool_info.input_schema or None, + category=ToolCategory.MCP, + ) + self._client = client + self._tool_info = tool_info + self._cache = cache + + @property + def tool_info(self) -> MCPToolInfo: + """The underlying MCP tool metadata.""" + return self._tool_info + + async def execute( + self, + *, + arguments: dict[str, Any], + ) -> ToolExecutionResult: + """Execute the MCP tool via the client. + + Checks the cache first (if available). On cache miss, + invokes the remote tool and stores the result. + + Args: + arguments: Tool invocation arguments. + + Returns: + Mapped ``ToolExecutionResult``. 
+ """ + cached = self._check_cache(arguments) + if cached is not None: + return cached + + result = await self._invoke(arguments) + self._store_in_cache(arguments, result) + return result + + def _check_cache( + self, + arguments: dict[str, Any], + ) -> ToolExecutionResult | None: + """Look up the cache, returning the result on hit. + + Args: + arguments: Tool invocation arguments. + + Returns: + Cached result or ``None``. + """ + if self._cache is None: + return None + try: + cached = self._cache.get( + self._tool_info.name, + arguments, + ) + except TypeError: + logger.debug( + MCP_CACHE_MISS, + tool_name=self._tool_info.name, + server=self._tool_info.server_name, + reason="unhashable arguments", + ) + return None + if cached is not None: + logger.debug( + MCP_CACHE_HIT, + tool_name=self._tool_info.name, + server=self._tool_info.server_name, + ) + return cached + + async def _invoke( + self, + arguments: dict[str, Any], + ) -> ToolExecutionResult: + """Call the remote MCP tool and map the result. + + Args: + arguments: Tool invocation arguments. + + Returns: + Mapped ``ToolExecutionResult``. + """ + logger.debug( + MCP_INVOKE_START, + tool=self._tool_info.name, + server=self._tool_info.server_name, + ) + try: + raw = await self._client.call_tool( + self._tool_info.name, + arguments, + ) + except MCPError as exc: + logger.warning( + MCP_INVOKE_FAILED, + tool=self._tool_info.name, + server=self._tool_info.server_name, + error=str(exc), + ) + return ToolExecutionResult( + content=str(exc), + is_error=True, + ) + return map_call_tool_result(raw) + + def _store_in_cache( + self, + arguments: dict[str, Any], + result: ToolExecutionResult, + ) -> None: + """Store a successful result in the cache. + + Skips caching for error results (to avoid replaying + transient failures) and unhashable arguments. + + Args: + arguments: Tool invocation arguments. + result: The result to cache. 
+ """ + if self._cache is None or result.is_error: + return + try: + self._cache.put( + self._tool_info.name, + arguments, + result, + ) + except TypeError: + logger.debug( + MCP_CACHE_STORE_FAILED, + tool_name=self._tool_info.name, + server=self._tool_info.server_name, + reason="unhashable arguments", + ) diff --git a/src/ai_company/tools/mcp/cache.py b/src/ai_company/tools/mcp/cache.py new file mode 100644 index 0000000000..e1456e4e02 --- /dev/null +++ b/src/ai_company/tools/mcp/cache.py @@ -0,0 +1,170 @@ +"""MCP result cache with TTL and LRU eviction. + +Provides an in-memory cache for MCP tool invocation results to +reduce redundant calls to external MCP servers. +""" + +import copy +import time +from collections import OrderedDict +from typing import Any + +from ai_company.observability import get_logger +from ai_company.observability.events.mcp import ( + MCP_CACHE_EVICT, + MCP_CACHE_HIT, + MCP_CACHE_MISS, +) +from ai_company.tools.base import ToolExecutionResult # noqa: TC001 + +logger = get_logger(__name__) + + +class MCPResultCache: + """TTL + LRU-bounded cache for MCP tool results. + + Safe for use within a single asyncio event loop, where coroutine + interleaving cannot cause concurrent mutations to the cache dict. + Keys are derived from tool name and arguments. + + Args: + max_size: Maximum number of cached entries. + ttl_seconds: Time-to-live for cache entries in seconds. + """ + + def __init__( + self, + *, + max_size: int = 256, + ttl_seconds: float = 60.0, + ) -> None: + self._max_size = max_size + self._ttl_seconds = ttl_seconds + self._cache: OrderedDict[tuple[str, Any], tuple[float, ToolExecutionResult]] = ( + OrderedDict() + ) + + def get( + self, + tool_name: str, + arguments: dict[str, Any], + ) -> ToolExecutionResult | None: + """Look up a cached result. + + Returns ``None`` on cache miss or TTL expiry. On hit, the + entry is moved to the end of the LRU queue. + + Args: + tool_name: MCP tool name. + arguments: Tool invocation arguments. 
+ + Returns: + Cached ``ToolExecutionResult`` or ``None``. + """ + key = self._make_key(tool_name, arguments) + entry = self._cache.get(key) + if entry is None: + logger.debug(MCP_CACHE_MISS, tool_name=tool_name) + return None + + timestamp, result = entry + if time.monotonic() - timestamp > self._ttl_seconds: + del self._cache[key] + logger.debug( + MCP_CACHE_MISS, + tool_name=tool_name, + reason="expired", + ) + return None + + self._cache.move_to_end(key) + logger.debug(MCP_CACHE_HIT, tool_name=tool_name) + return copy.deepcopy(result) + + def put( + self, + tool_name: str, + arguments: dict[str, Any], + result: ToolExecutionResult, + ) -> None: + """Store a result in the cache. + + If the cache is at capacity, the least recently used entry is + evicted before insertion. + + Args: + tool_name: MCP tool name. + arguments: Tool invocation arguments. + result: The ``ToolExecutionResult`` to cache. + """ + key = self._make_key(tool_name, arguments) + + # Remove existing entry to refresh position + if key in self._cache: + del self._cache[key] + + # Evict least recently used entries if at capacity + while len(self._cache) >= self._max_size > 0: + evicted_key, _ = self._cache.popitem(last=False) + logger.debug( + MCP_CACHE_EVICT, + evicted_tool=evicted_key[0], + ) + + if self._max_size > 0: + self._cache[key] = (time.monotonic(), copy.deepcopy(result)) + + def invalidate( + self, + tool_name: str | None = None, + ) -> None: + """Invalidate cache entries. + + Args: + tool_name: If provided, only invalidate entries for this + tool. If ``None``, clear all entries. + """ + if tool_name is None: + self._cache.clear() + return + + keys_to_remove = [k for k in self._cache if k[0] == tool_name] + for key in keys_to_remove: + del self._cache[key] + + @staticmethod + def _make_key( + tool_name: str, + arguments: dict[str, Any], + ) -> tuple[str, Any]: + """Build a hashable cache key. + + Args: + tool_name: MCP tool name. + arguments: Tool invocation arguments. 
+ + Returns: + A hashable tuple of (tool_name, frozen_arguments). + """ + return (tool_name, _make_hashable(arguments)) + + +def _make_hashable(obj: Any) -> Any: + """Recursively freeze a value into a hashable form. + + Dicts become frozensets of (key, value) tuples, sets become frozensets, + lists and tuples become tuples, and everything else passes through. + + Args: + obj: Value to freeze. + + Returns: + A hashable representation of *obj*. + """ + if isinstance(obj, dict): + return frozenset((k, _make_hashable(v)) for k, v in obj.items()) + if isinstance(obj, list | tuple): + return tuple(_make_hashable(item) for item in obj) + if isinstance(obj, set): + return frozenset(_make_hashable(item) for item in obj) + return obj diff --git a/src/ai_company/tools/mcp/client.py b/src/ai_company/tools/mcp/client.py new file mode 100644 index 0000000000..e966e96a2e --- /dev/null +++ b/src/ai_company/tools/mcp/client.py @@ -0,0 +1,485 @@ +"""MCP client — thin async wrapper over the MCP SDK. + +Manages a single connection to an MCP server and provides +tool discovery and invocation through the MCP protocol. 
+""" + +import asyncio +import copy +from contextlib import AsyncExitStack +from typing import TYPE_CHECKING, Any, Self + +from mcp import ClientSession, StdioServerParameters +from mcp.client.stdio import stdio_client +from mcp.client.streamable_http import streamablehttp_client + +from ai_company.observability import get_logger +from ai_company.observability.events.mcp import ( + MCP_CLIENT_CONNECTED, + MCP_CLIENT_CONNECTING, + MCP_CLIENT_CONNECTION_FAILED, + MCP_CLIENT_DISCONNECT_FAILED, + MCP_CLIENT_DISCONNECTED, + MCP_CLIENT_RECONNECTING, + MCP_DISCOVERY_COMPLETE, + MCP_DISCOVERY_FAILED, + MCP_DISCOVERY_FILTERED, + MCP_DISCOVERY_START, + MCP_INVOKE_FAILED, + MCP_INVOKE_START, + MCP_INVOKE_SUCCESS, + MCP_INVOKE_TIMEOUT, +) +from ai_company.tools.mcp.errors import ( + MCPConnectionError, + MCPDiscoveryError, + MCPInvocationError, + MCPTimeoutError, +) +from ai_company.tools.mcp.models import MCPRawResult, MCPToolInfo + +if TYPE_CHECKING: + from ai_company.tools.mcp.config import MCPServerConfig + +logger = get_logger(__name__) + + +class MCPClient: + """Async client for a single MCP server. + + Wraps the MCP SDK's ``ClientSession`` to provide connection + management, tool discovery, and tool invocation. A lock + serializes all session access to prevent interleaving. + + Args: + config: Server connection configuration. 
+ """ + + def __init__(self, config: MCPServerConfig) -> None: + self._config = config + self._session: ClientSession | None = None + self._exit_stack: AsyncExitStack | None = None + self._lock = asyncio.Lock() + + @property + def config(self) -> MCPServerConfig: + """Server connection configuration (read-only).""" + return self._config + + @property + def is_connected(self) -> bool: + """Whether the client has an active session.""" + return self._session is not None + + @property + def server_name(self) -> str: + """Name of the configured server.""" + return self._config.name + + async def connect(self) -> None: + """Establish a connection to the MCP server. + + Raises: + MCPConnectionError: If the connection fails. + RuntimeError: If already connected. + """ + async with self._lock: + if self._session is not None: + msg = f"Already connected to {self._config.name!r}" + logger.warning( + MCP_CLIENT_CONNECTION_FAILED, + server=self._config.name, + error=msg, + ) + raise RuntimeError(msg) + logger.info( + MCP_CLIENT_CONNECTING, + server=self._config.name, + transport=self._config.transport, + ) + stack = AsyncExitStack() + await stack.__aenter__() + try: + coro = self._connect_with_stack(stack) + session = await asyncio.wait_for( + coro, + timeout=self._config.connect_timeout_seconds, + ) + self._session = session + self._exit_stack = stack + logger.info( + MCP_CLIENT_CONNECTED, + server=self._config.name, + ) + except TimeoutError as exc: + await stack.aclose() + msg = ( + f"Connection to {self._config.name!r} timed out " + f"after {self._config.connect_timeout_seconds}s" + ) + logger.warning( + MCP_CLIENT_CONNECTION_FAILED, + server=self._config.name, + error=msg, + ) + raise MCPConnectionError( + msg, + context={ + "server": self._config.name, + "transport": self._config.transport, + }, + ) from exc + except MCPConnectionError: + await stack.aclose() + raise + except Exception as exc: + await stack.aclose() + logger.exception( + MCP_CLIENT_CONNECTION_FAILED, + 
server=self._config.name, + error=str(exc), + ) + msg = f"Failed to connect to {self._config.name!r}: {exc}" + raise MCPConnectionError( + msg, + context={ + "server": self._config.name, + "transport": self._config.transport, + }, + ) from exc + except BaseException: + # CancelledError, KeyboardInterrupt — still close the stack + await stack.aclose() + raise + + async def _connect_with_stack( + self, + stack: AsyncExitStack, + ) -> ClientSession: + """Connect via the appropriate transport and initialize. + + Args: + stack: Exit stack for resource management. + + Returns: + Connected and initialized ``ClientSession``. + """ + if self._config.transport == "stdio": + session = await self._connect_stdio(stack) + else: + session = await self._connect_http(stack) + await session.initialize() + return session + + async def disconnect(self) -> None: + """Close the connection and release resources.""" + async with self._lock: + if self._exit_stack is not None: + try: + await self._exit_stack.aclose() + except Exception as exc: + logger.warning( + MCP_CLIENT_DISCONNECT_FAILED, + server=self._config.name, + error=str(exc), + ) + else: + logger.info( + MCP_CLIENT_DISCONNECTED, + server=self._config.name, + ) + finally: + self._session = None + self._exit_stack = None + + async def list_tools(self) -> tuple[MCPToolInfo, ...]: + """Discover tools from the connected server. + + Applies ``enabled_tools`` / ``disabled_tools`` filters + from the server configuration. + + Returns: + Filtered tuple of discovered tool metadata. + + Raises: + MCPDiscoveryError: If discovery fails. 
+ """ + async with self._lock: + session = self._require_session() + logger.info( + MCP_DISCOVERY_START, + server=self._config.name, + ) + try: + result = await session.list_tools() + except Exception as exc: + logger.exception( + MCP_DISCOVERY_FAILED, + server=self._config.name, + error=str(exc), + ) + msg = f"Tool discovery failed for {self._config.name!r}: {exc}" + raise MCPDiscoveryError( + msg, + context={"server": self._config.name}, + ) from exc + + tools = tuple( + MCPToolInfo( + name=t.name, + description=t.description or "", + input_schema=(copy.deepcopy(t.inputSchema) if t.inputSchema else {}), + server_name=self._config.name, + ) + for t in result.tools + ) + + filtered = self._apply_filters(tools) + logger.info( + MCP_DISCOVERY_COMPLETE, + server=self._config.name, + total=len(tools), + after_filter=len(filtered), + ) + return filtered + + async def call_tool( + self, + tool_name: str, + arguments: dict[str, Any], + ) -> MCPRawResult: + """Invoke a tool on the connected server. + + Acquires the session lock to respect MCP's sequential + protocol constraint. Applies the configured timeout. + + Args: + tool_name: Name of the tool to invoke. + arguments: Arguments to pass to the tool. + + Returns: + Raw result from the MCP server. + + Raises: + MCPTimeoutError: If the invocation times out. + MCPInvocationError: If the invocation fails. 
+ """ + logger.debug( + MCP_INVOKE_START, + server=self._config.name, + tool=tool_name, + ) + async with self._lock: + session = self._require_session() + try: + result = await asyncio.wait_for( + session.call_tool(tool_name, arguments), + timeout=self._config.timeout_seconds, + ) + except TimeoutError as exc: + logger.warning( + MCP_INVOKE_TIMEOUT, + server=self._config.name, + tool=tool_name, + timeout=self._config.timeout_seconds, + ) + msg = f"Tool {tool_name!r} timed out on {self._config.name!r}" + raise MCPTimeoutError( + msg, + context={ + "server": self._config.name, + "tool": tool_name, + "timeout": self._config.timeout_seconds, + }, + ) from exc + except Exception as exc: + logger.exception( + MCP_INVOKE_FAILED, + server=self._config.name, + tool=tool_name, + error=str(exc), + ) + msg = f"Tool {tool_name!r} failed on {self._config.name!r}: {exc}" + raise MCPInvocationError( + msg, + context={ + "server": self._config.name, + "tool": tool_name, + }, + ) from exc + + logger.info( + MCP_INVOKE_SUCCESS, + server=self._config.name, + tool=tool_name, + ) + return MCPRawResult( + content=tuple(result.content), + is_error=result.isError or False, + structured_content=( + copy.deepcopy(result.structuredContent) + if result.structuredContent is not None + else None + ), + ) + + async def reconnect(self) -> None: + """Disconnect and reconnect to the server. + + Raises: + MCPConnectionError: If the reconnection fails. 
+ """ + logger.info( + MCP_CLIENT_RECONNECTING, + server=self._config.name, + ) + await self.disconnect() + await self.connect() + + async def __aenter__(self) -> Self: + """Enter async context: connect to server.""" + await self.connect() + return self + + async def __aexit__( + self, + exc_type: type[BaseException] | None, + exc_val: BaseException | None, + exc_tb: object, + ) -> None: + """Exit async context: disconnect from server.""" + await self.disconnect() + + # ── Private helpers ────────────────────────────────────────── + + def _require_session(self) -> ClientSession: + """Return the active session or raise. + + Returns: + The active ``ClientSession``. + + Raises: + MCPConnectionError: If not connected. + """ + if self._session is None: + msg = f"Not connected to {self._config.name!r}" + logger.warning( + MCP_CLIENT_CONNECTION_FAILED, + server=self._config.name, + error=msg, + ) + raise MCPConnectionError( + msg, + context={"server": self._config.name}, + ) + return self._session + + async def _connect_stdio( + self, + stack: AsyncExitStack, + ) -> ClientSession: + """Set up a stdio transport connection. + + Args: + stack: Exit stack for resource management. + + Returns: + Connected ``ClientSession`` (not yet initialized). 
+ """ + if self._config.command is None: + msg = f"Server {self._config.name!r}: stdio transport requires 'command'" + logger.warning( + MCP_CLIENT_CONNECTION_FAILED, + server=self._config.name, + error=msg, + ) + raise MCPConnectionError( + msg, + context={"server": self._config.name}, + ) + params = StdioServerParameters( + command=self._config.command, + args=list(self._config.args), + env=(dict(self._config.env) if self._config.env else None), + ) + read_stream, write_stream = await stack.enter_async_context( + stdio_client(params), + ) + return await stack.enter_async_context( + ClientSession(read_stream, write_stream), + ) + + async def _connect_http( + self, + stack: AsyncExitStack, + ) -> ClientSession: + """Set up a streamable HTTP transport connection. + + Args: + stack: Exit stack for resource management. + + Returns: + Connected ``ClientSession`` (not yet initialized). + """ + if self._config.url is None: + msg = f"Server {self._config.name!r}: streamable_http requires 'url'" + logger.warning( + MCP_CLIENT_CONNECTION_FAILED, + server=self._config.name, + error=msg, + ) + raise MCPConnectionError( + msg, + context={"server": self._config.name}, + ) + read_stream, write_stream, _ = await stack.enter_async_context( + streamablehttp_client( + url=self._config.url, + headers=(dict(self._config.headers) if self._config.headers else None), + ), + ) + return await stack.enter_async_context( + ClientSession(read_stream, write_stream), + ) + + def _apply_filters( + self, + tools: tuple[MCPToolInfo, ...], + ) -> tuple[MCPToolInfo, ...]: + """Apply enabled/disabled tool filters. + + Args: + tools: All discovered tools. + + Returns: + Filtered tool tuple. 
+ """ + result = tools + + if self._config.enabled_tools is not None: + allowed = set(self._config.enabled_tools) + before = len(result) + result = tuple(t for t in result if t.name in allowed) + if len(result) < before: + logger.debug( + MCP_DISCOVERY_FILTERED, + server=self._config.name, + filter_type="enabled_tools", + before=before, + after=len(result), + ) + + if self._config.disabled_tools: + blocked = set(self._config.disabled_tools) + before = len(result) + result = tuple(t for t in result if t.name not in blocked) + if len(result) < before: + logger.debug( + MCP_DISCOVERY_FILTERED, + server=self._config.name, + filter_type="disabled_tools", + before=before, + after=len(result), + ) + + return result diff --git a/src/ai_company/tools/mcp/config.py b/src/ai_company/tools/mcp/config.py new file mode 100644 index 0000000000..0bece496db --- /dev/null +++ b/src/ai_company/tools/mcp/config.py @@ -0,0 +1,176 @@ +"""MCP bridge configuration models. + +Defines ``MCPServerConfig`` for individual MCP server connections and +``MCPConfig`` as the top-level container. Both are frozen Pydantic +models following the project's immutability conventions. +""" + +from collections import Counter +from typing import Literal, Self + +from pydantic import BaseModel, ConfigDict, Field, model_validator + +from ai_company.core.types import NotBlankStr # noqa: TC001 +from ai_company.observability import get_logger +from ai_company.observability.events.mcp import ( + MCP_CONFIG_VALIDATION_FAILED, +) + +logger = get_logger(__name__) + + +class MCPServerConfig(BaseModel): + """Configuration for a single MCP server connection. + + Attributes: + name: Unique server identifier. + transport: Transport type (``"stdio"`` or ``"streamable_http"``). + command: Command to launch a stdio server. + args: Command-line arguments for stdio server. + env: Environment variables for stdio server. + url: URL for streamable HTTP server. + headers: HTTP headers for streamable HTTP server. 
+ enabled_tools: Allowlist of tool names (``None`` = all). + disabled_tools: Denylist of tool names. + timeout_seconds: Timeout for tool invocations. + connect_timeout_seconds: Timeout for initial connection. + result_cache_ttl_seconds: TTL for result cache entries. + result_cache_max_size: Maximum result cache entries. + enabled: Whether the server is active. + """ + + model_config = ConfigDict(frozen=True) + + name: NotBlankStr = Field(description="Unique server identifier") + transport: Literal["stdio", "streamable_http"] = Field( + description="Transport type: stdio or streamable_http", + ) + # stdio fields + command: NotBlankStr | None = Field( + default=None, + description="Command to launch a stdio server", + ) + args: tuple[str, ...] = Field( + default=(), + description="Command-line arguments for stdio server", + ) + env: dict[str, str] = Field( + default_factory=dict, + description="Environment variables for stdio server", + ) + # streamable_http fields + url: NotBlankStr | None = Field( + default=None, + description="URL for streamable HTTP server", + ) + headers: dict[str, str] = Field( + default_factory=dict, + description="HTTP headers for streamable HTTP server", + ) + # Common + enabled_tools: tuple[NotBlankStr, ...] | None = Field( + default=None, + description="Allowlist of tool names (None = all)", + ) + disabled_tools: tuple[NotBlankStr, ...] 
= Field( + default=(), + description="Denylist of tool names", + ) + timeout_seconds: float = Field( + default=30.0, + gt=0, + le=600, + description="Timeout for tool invocations in seconds", + ) + connect_timeout_seconds: float = Field( + default=10.0, + gt=0, + le=120, + description="Timeout for initial connection in seconds", + ) + result_cache_ttl_seconds: float = Field( + default=60.0, + ge=0, + description="TTL for result cache entries in seconds", + ) + result_cache_max_size: int = Field( + default=256, + ge=0, + description="Maximum result cache entries", + ) + enabled: bool = Field( + default=True, + description="Whether the server is active", + ) + + @model_validator(mode="after") + def _validate_transport_fields(self) -> Self: + """Validate transport-specific required fields. + + Stdio transport requires ``command``; streamable_http requires + ``url``. + """ + if self.transport == "stdio" and self.command is None: + msg = f"Server {self.name!r}: stdio transport requires 'command'" + logger.warning( + MCP_CONFIG_VALIDATION_FAILED, + server=self.name, + reason=msg, + ) + raise ValueError(msg) + if self.transport == "streamable_http" and self.url is None: + msg = f"Server {self.name!r}: streamable_http transport requires 'url'" + logger.warning( + MCP_CONFIG_VALIDATION_FAILED, + server=self.name, + reason=msg, + ) + raise ValueError(msg) + return self + + @model_validator(mode="after") + def _validate_tool_filters(self) -> Self: + """Ensure enabled_tools and disabled_tools do not overlap.""" + if self.enabled_tools is not None and self.disabled_tools: + overlap = set(self.enabled_tools) & set(self.disabled_tools) + if overlap: + msg = ( + f"Server {self.name!r}: enabled_tools and " + f"disabled_tools overlap: {sorted(overlap)}" + ) + logger.warning( + MCP_CONFIG_VALIDATION_FAILED, + server=self.name, + reason=msg, + ) + raise ValueError(msg) + return self + + +class MCPConfig(BaseModel): + """Top-level MCP bridge configuration. 
+ + Attributes: + servers: Tuple of MCP server configurations. + """ + + model_config = ConfigDict(frozen=True) + + servers: tuple[MCPServerConfig, ...] = Field( + default=(), + description="MCP server configurations", + ) + + @model_validator(mode="after") + def _validate_unique_server_names(self) -> Self: + """Ensure server names are unique.""" + names = [s.name for s in self.servers] + if len(names) != len(set(names)): + dupes = sorted(n for n, c in Counter(names).items() if c > 1) + msg = f"Duplicate MCP server names: {dupes}" + logger.warning( + MCP_CONFIG_VALIDATION_FAILED, + reason=msg, + ) + raise ValueError(msg) + return self diff --git a/src/ai_company/tools/mcp/errors.py b/src/ai_company/tools/mcp/errors.py new file mode 100644 index 0000000000..7d9c3324ae --- /dev/null +++ b/src/ai_company/tools/mcp/errors.py @@ -0,0 +1,27 @@ +"""MCP bridge error hierarchy. + +All MCP errors extend :class:`~ai_company.tools.errors.ToolError` +and carry an immutable context mapping for structured metadata. +""" + +from ai_company.tools.errors import ToolError + + +class MCPError(ToolError): + """Base exception for MCP bridge errors.""" + + +class MCPConnectionError(MCPError): + """Failed to connect to an MCP server.""" + + +class MCPTimeoutError(MCPError): + """MCP operation timed out.""" + + +class MCPDiscoveryError(MCPError): + """Failed to discover tools from an MCP server.""" + + +class MCPInvocationError(MCPError): + """Failed to invoke an MCP tool.""" diff --git a/src/ai_company/tools/mcp/factory.py b/src/ai_company/tools/mcp/factory.py new file mode 100644 index 0000000000..db90614a99 --- /dev/null +++ b/src/ai_company/tools/mcp/factory.py @@ -0,0 +1,220 @@ +"""MCP tool factory — discovers and creates bridge tools. + +Connects to all enabled MCP servers in parallel, discovers their +tools, and wraps each as an ``MCPBridgeTool``. 
+""" + +import asyncio +import contextlib +from typing import TYPE_CHECKING + +from ai_company.observability import get_logger +from ai_company.observability.events.mcp import ( + MCP_CLIENT_DISCONNECT_FAILED, + MCP_FACTORY_CLEANUP, + MCP_FACTORY_COMPLETE, + MCP_FACTORY_REUSE_REJECTED, + MCP_FACTORY_SERVER_SKIPPED, + MCP_FACTORY_START, +) +from ai_company.tools.mcp.bridge_tool import MCPBridgeTool +from ai_company.tools.mcp.cache import MCPResultCache +from ai_company.tools.mcp.client import MCPClient + +if TYPE_CHECKING: + from ai_company.tools.mcp.config import MCPConfig, MCPServerConfig + from ai_company.tools.mcp.models import MCPToolInfo + +logger = get_logger(__name__) + + +class MCPToolFactory: + """Factory that connects to MCP servers and creates bridge tools. + + Manages the lifecycle of MCP clients and creates + ``MCPBridgeTool`` instances for all discovered tools. + + Args: + config: MCP bridge configuration. + """ + + def __init__(self, config: MCPConfig) -> None: + self._config = config + self._clients: list[MCPClient] = [] + self._created = False + + async def create_tools(self) -> tuple[MCPBridgeTool, ...]: + """Connect to all enabled servers and create bridge tools. + + Uses ``asyncio.TaskGroup`` for parallel server connections. + Disabled servers are skipped with a log message. + + Returns: + Tuple of all discovered and wrapped bridge tools. + + Raises: + RuntimeError: If called more than once. + ExceptionGroup: If any server task fails; wraps the underlying + ``MCPConnectionError`` or ``MCPDiscoveryError``. 
+ """ + if self._created: + msg = "create_tools() must not be called more than once" + logger.warning(MCP_FACTORY_REUSE_REJECTED, reason=msg) + raise RuntimeError(msg) + self._created = True + + enabled = [s for s in self._config.servers if s.enabled] + skipped = len(self._config.servers) - len(enabled) + + logger.info( + MCP_FACTORY_START, + total_servers=len(self._config.servers), + enabled_servers=len(enabled), + skipped_servers=skipped, + ) + + for server in self._config.servers: + if not server.enabled: + logger.info( + MCP_FACTORY_SERVER_SKIPPED, + server=server.name, + reason="disabled", + ) + + if not enabled: + logger.info(MCP_FACTORY_COMPLETE, tool_count=0) + return () + + results = await self._connect_all(enabled) + bridge_tools = self._build_bridge_tools(results) + + logger.info(MCP_FACTORY_COMPLETE, tool_count=len(bridge_tools)) + return bridge_tools + + async def _connect_all( + self, + servers: list[MCPServerConfig], + ) -> list[tuple[MCPClient, tuple[MCPToolInfo, ...]]]: + """Connect to servers in parallel and collect results. + + Args: + servers: Enabled server configurations. + + Returns: + List of (client, tools) tuples. 
+ """ + tasks: list[asyncio.Task[tuple[MCPClient, tuple[MCPToolInfo, ...]]]] = [] + try: + async with asyncio.TaskGroup() as tg: + tasks = [ + tg.create_task( + self._connect_and_discover(cfg), + ) + for cfg in servers + ] + except BaseException: + # Clean up any clients that connected before the failure + logger.warning( + MCP_FACTORY_CLEANUP, + reason="partial failure during parallel connect", + ) + for task in tasks: + if task.done() and not task.cancelled(): + exc = task.exception() + if exc is None: + client, _ = task.result() + with contextlib.suppress(Exception): + await client.disconnect() + raise + + results: list[tuple[MCPClient, tuple[MCPToolInfo, ...]]] = [] + for task in tasks: + client, tools = task.result() + self._clients.append(client) + results.append((client, tools)) + return results + + async def shutdown(self) -> None: + """Disconnect all managed MCP clients.""" + try: + for client in self._clients: + try: + await client.disconnect() + except Exception as exc: + logger.warning( + MCP_CLIENT_DISCONNECT_FAILED, + server=client.server_name, + error=f"disconnect failed: {exc}", + ) + finally: + self._clients.clear() + + # ── Private helpers ────────────────────────────────────────── + + @staticmethod + async def _connect_and_discover( + config: MCPServerConfig, + ) -> tuple[MCPClient, tuple[MCPToolInfo, ...]]: + """Connect to a server and discover its tools. + + Disconnects the client if discovery fails after a + successful connection. + + Args: + config: Server configuration. + + Returns: + Tuple of (connected client, discovered tools). + """ + client = MCPClient(config) + await client.connect() + try: + tools = await client.list_tools() + except BaseException: + await client.disconnect() + raise + return (client, tools) + + def _build_bridge_tools( + self, + results: list[tuple[MCPClient, tuple[MCPToolInfo, ...]]], + ) -> tuple[MCPBridgeTool, ...]: + """Create bridge tools from connected clients. 
+ + Args: + results: List of (client, tools) pairs. + + Returns: + Tuple of ``MCPBridgeTool`` instances. + """ + all_tools: list[MCPBridgeTool] = [] + for client, tools in results: + cache = self._make_cache(client) + for tool_info in tools: + bridge = MCPBridgeTool( + tool_info=tool_info, + client=client, + cache=cache, + ) + all_tools.append(bridge) + return tuple(all_tools) + + @staticmethod + def _make_cache( + client: MCPClient, + ) -> MCPResultCache | None: + """Create a result cache if configured. + + Args: + client: Connected MCP client. + + Returns: + ``MCPResultCache`` or ``None`` if disabled. + """ + config = client.config + if config.result_cache_max_size <= 0: + return None + return MCPResultCache( + max_size=config.result_cache_max_size, + ttl_seconds=config.result_cache_ttl_seconds, + ) diff --git a/src/ai_company/tools/mcp/models.py b/src/ai_company/tools/mcp/models.py new file mode 100644 index 0000000000..6c5a8bbf80 --- /dev/null +++ b/src/ai_company/tools/mcp/models.py @@ -0,0 +1,62 @@ +"""MCP bridge internal value objects. + +Defines ``MCPToolInfo`` for discovered tool metadata and +``MCPRawResult`` for raw MCP call results before mapping. +""" + +from typing import Any + +from pydantic import BaseModel, ConfigDict, Field + +from ai_company.core.types import NotBlankStr # noqa: TC001 + + +class MCPToolInfo(BaseModel): + """Discovered tool metadata from an MCP server. + + Attributes: + name: Tool name as reported by the server. + description: Human-readable tool description. + input_schema: JSON Schema for tool parameters. + server_name: Name of the server that hosts this tool. 
+ """ + + model_config = ConfigDict(frozen=True) + + name: NotBlankStr = Field(description="Tool name") + description: str = Field( + default="", + description="Human-readable tool description", + ) + input_schema: dict[str, Any] = Field( + default_factory=dict, + description="JSON Schema for tool parameters", + ) + server_name: NotBlankStr = Field( + description="Name of the hosting MCP server", + ) + + +class MCPRawResult(BaseModel): + """Raw result from an MCP tool call before mapping. + + Attributes: + content: MCP content blocks from the call result. + is_error: Whether the MCP call reported an error. + structured_content: Optional structured content from the result. + """ + + model_config = ConfigDict(frozen=True) + + content: tuple[Any, ...] = Field( + default=(), + description="MCP content blocks", + ) + is_error: bool = Field( + default=False, + description="Whether the MCP call reported an error", + ) + structured_content: dict[str, Any] | None = Field( + default=None, + description="Optional structured content from the result", + ) diff --git a/src/ai_company/tools/mcp/result_mapper.py b/src/ai_company/tools/mcp/result_mapper.py new file mode 100644 index 0000000000..4aa7cdefe7 --- /dev/null +++ b/src/ai_company/tools/mcp/result_mapper.py @@ -0,0 +1,121 @@ +"""MCP result mapping (ADR-002 D18). + +Pure function that maps MCP raw results to the internal +``ToolExecutionResult`` format used throughout the tool system. 
+""" + +from typing import TYPE_CHECKING, Any + +from mcp.types import ( + AudioContent, + EmbeddedResource, + ImageContent, + TextContent, +) + +from ai_company.observability import get_logger +from ai_company.observability.events.mcp import ( + MCP_RESULT_ATTACHMENT, + MCP_RESULT_MAPPED, + MCP_RESULT_UNKNOWN_BLOCK, +) +from ai_company.tools.base import ToolExecutionResult + +if TYPE_CHECKING: + from ai_company.tools.mcp.models import MCPRawResult + +logger = get_logger(__name__) + + +def map_call_tool_result(raw: MCPRawResult) -> ToolExecutionResult: + """Map MCP raw result to ToolExecutionResult (ADR-002 D18). + + Mapping rules: + - TextContent blocks: concatenate into content string. + - ImageContent: ``"[image: {mimeType}]"`` placeholder + + base64 in ``metadata["attachments"]``. + - AudioContent: ``"[audio: {mimeType}]"`` placeholder + + base64 in ``metadata["attachments"]``. + - EmbeddedResource: ``"[resource: {uri}]"`` placeholder. + - structuredContent: ``metadata["structured_content"]``. + - isError: maps 1:1 to ``is_error``. + + Args: + raw: Raw MCP result to map. + + Returns: + Mapped ``ToolExecutionResult``. 
+ """ + parts: list[str] = [] + attachments: list[dict[str, Any]] = [] + + for block in raw.content: + if isinstance(block, TextContent): + parts.append(block.text) + elif isinstance(block, ImageContent): + parts.append(f"[image: {block.mimeType}]") + attachments.append( + { + "type": "image", + "mimeType": block.mimeType, + "data": block.data, + }, + ) + elif isinstance(block, AudioContent): + parts.append(f"[audio: {block.mimeType}]") + attachments.append( + { + "type": "audio", + "mimeType": block.mimeType, + "data": block.data, + }, + ) + elif isinstance(block, EmbeddedResource): + uri = _extract_resource_uri(block) + parts.append(f"[resource: {uri}]") + else: + block_type = type(block).__name__ + logger.warning( + MCP_RESULT_UNKNOWN_BLOCK, + unknown_block_type=block_type, + ) + parts.append(f"[unknown: {block_type}]") + + content = "\n".join(parts) if parts else "" + metadata: dict[str, Any] = {} + + if attachments: + metadata["attachments"] = attachments + logger.debug( + MCP_RESULT_ATTACHMENT, + attachment_count=len(attachments), + ) + + if raw.structured_content is not None: + metadata["structured_content"] = raw.structured_content + + logger.debug( + MCP_RESULT_MAPPED, + block_count=len(raw.content), + has_attachments=bool(attachments), + has_structured=raw.structured_content is not None, + is_error=raw.is_error, + ) + + return ToolExecutionResult( + content=content, + is_error=raw.is_error, + metadata=metadata, + ) + + +def _extract_resource_uri(block: EmbeddedResource) -> str: + """Extract URI string from an EmbeddedResource block. + + Args: + block: The embedded resource block. + + Returns: + The resource URI as a string. 
+ """ + return str(block.resource.uri) diff --git a/src/ai_company/tools/sandbox/__init__.py b/src/ai_company/tools/sandbox/__init__.py index ce98ec968f..1a4067371e 100644 --- a/src/ai_company/tools/sandbox/__init__.py +++ b/src/ai_company/tools/sandbox/__init__.py @@ -1,17 +1,23 @@ -"""Subprocess sandbox for tool execution isolation.""" +"""Sandbox backends for tool execution isolation.""" from .config import SubprocessSandboxConfig +from .docker_config import DockerSandboxConfig +from .docker_sandbox import DockerSandbox from .errors import SandboxError, SandboxStartError, SandboxTimeoutError from .protocol import SandboxBackend from .result import SandboxResult +from .sandboxing_config import SandboxingConfig from .subprocess_sandbox import SubprocessSandbox __all__ = [ + "DockerSandbox", + "DockerSandboxConfig", "SandboxBackend", "SandboxError", "SandboxResult", "SandboxStartError", "SandboxTimeoutError", + "SandboxingConfig", "SubprocessSandbox", "SubprocessSandboxConfig", ] diff --git a/src/ai_company/tools/sandbox/docker_config.py b/src/ai_company/tools/sandbox/docker_config.py new file mode 100644 index 0000000000..4e4307a7f6 --- /dev/null +++ b/src/ai_company/tools/sandbox/docker_config.py @@ -0,0 +1,89 @@ +"""Docker sandbox configuration model.""" + +from typing import Literal, Self + +from pydantic import BaseModel, ConfigDict, Field, model_validator + +from ai_company.core.types import NotBlankStr # noqa: TC001 + +_VALID_NETWORK_MODES = frozenset({"none", "bridge", "host"}) + + +class DockerSandboxConfig(BaseModel): + """Configuration for the Docker sandbox backend. + + Attributes: + image: Docker image to use for sandbox containers. + network: Default Docker network mode. + network_overrides: Per-category network mode overrides. + allowed_hosts: Host:port allowlist for network filtering. + memory_limit: Container memory limit (Docker format). + cpu_limit: CPU core limit for the container. + timeout_seconds: Default command timeout in seconds. 
+ mount_mode: Workspace mount mode (read-write or read-only). + runtime: Optional container runtime (e.g. ``"runsc"`` for gVisor). + """ + + model_config = ConfigDict(frozen=True) + + image: NotBlankStr = Field( + default="ai-company-sandbox:latest", + description="Docker image to use for sandbox containers", + ) + network: Literal["none", "bridge", "host"] = Field( + default="none", + description="Default Docker network mode", + ) + network_overrides: dict[NotBlankStr, NotBlankStr] = Field( + default_factory=dict, + description="Per-category network mode overrides", + ) + allowed_hosts: tuple[NotBlankStr, ...] = Field( + default=(), + description="Host:port allowlist for network filtering", + ) + memory_limit: NotBlankStr = Field( + default="512m", + description="Container memory limit (Docker format, e.g. '512m')", + ) + cpu_limit: float = Field(default=1.0, gt=0, le=16) + timeout_seconds: float = Field(default=120.0, gt=0, le=600) + mount_mode: Literal["rw", "ro"] = Field( + default="ro", + description="Workspace mount mode (read-only by default)", + ) + runtime: NotBlankStr | None = Field( + default=None, + description="Optional container runtime (e.g. 
'runsc' for gVisor)", + ) + + @model_validator(mode="after") + def _validate_memory_limit(self) -> Self: + """Validate that memory_limit is a parseable Docker memory value.""" + limit = self.memory_limit.strip().lower() + if not limit: + msg = "Memory limit must not be empty" + raise ValueError(msg) + multipliers = {"k", "m", "g"} + numeric_part = limit[:-1] if limit[-1] in multipliers else limit + try: + value = int(numeric_part) + except ValueError as exc: + msg = f"Invalid memory_limit format: {self.memory_limit!r}" + raise ValueError(msg) from exc + if value <= 0: + msg = f"Memory limit must be positive, got: {self.memory_limit!r}" + raise ValueError(msg) + return self + + @model_validator(mode="after") + def _validate_network_overrides(self) -> Self: + """Ensure network override values are valid network modes.""" + for category, mode in self.network_overrides.items(): + if mode not in _VALID_NETWORK_MODES: + msg = ( + f"Invalid network mode {mode!r} for category " + f"{category!r}; must be one of {sorted(_VALID_NETWORK_MODES)}" + ) + raise ValueError(msg) + return self diff --git a/src/ai_company/tools/sandbox/docker_sandbox.py b/src/ai_company/tools/sandbox/docker_sandbox.py new file mode 100644 index 0000000000..667fe8740f --- /dev/null +++ b/src/ai_company/tools/sandbox/docker_sandbox.py @@ -0,0 +1,647 @@ +"""Docker-based sandbox backend. + +Executes commands inside ephemeral Docker containers with workspace +mount, resource limits, network isolation, and timeout management. +Uses ``aiodocker`` for asynchronous Docker daemon communication. 
+""" + +import asyncio +import platform +from pathlib import Path, PurePosixPath +from typing import TYPE_CHECKING, Any, Final + +import aiodocker + +from ai_company.observability import get_logger +from ai_company.observability.events.docker import ( + DOCKER_CLEANUP, + DOCKER_CONTAINER_CREATED, + DOCKER_CONTAINER_REMOVE_FAILED, + DOCKER_CONTAINER_REMOVED, + DOCKER_CONTAINER_STOP_FAILED, + DOCKER_CONTAINER_STOPPED, + DOCKER_DAEMON_UNAVAILABLE, + DOCKER_EXECUTE_FAILED, + DOCKER_EXECUTE_START, + DOCKER_EXECUTE_SUCCESS, + DOCKER_EXECUTE_TIMEOUT, + DOCKER_HEALTH_CHECK, +) +from ai_company.tools.sandbox.docker_config import DockerSandboxConfig +from ai_company.tools.sandbox.errors import SandboxError, SandboxStartError +from ai_company.tools.sandbox.result import SandboxResult + +if TYPE_CHECKING: + from collections.abc import Mapping + + from ai_company.core.types import NotBlankStr + +logger = get_logger(__name__) + +_DEFAULT_CONFIG = DockerSandboxConfig() +_NANO_CPUS_MULTIPLIER: Final[int] = 1_000_000_000 +_CONTAINER_WORKSPACE: Final[str] = "/workspace" +_STOP_TIMEOUT_SECONDS: Final[int] = 5 +_DRIVE_SEPARATOR_PARTS: Final[int] = 2 + + +def _to_posix_bind_path(path: Path) -> str: + r"""Convert a host path to POSIX format for Docker bind mounts. + + On Windows, converts ``C:\Users\foo`` to ``/c/Users/foo`` + for Docker Desktop compatibility. + + Args: + path: Host filesystem path to convert. + + Returns: + POSIX-formatted path string suitable for Docker bind mounts. + """ + if platform.system() == "Windows": + posix = PurePosixPath(path.as_posix()) + parts = str(posix).split(":", 1) + if len(parts) == _DRIVE_SEPARATOR_PARTS: + drive = parts[0].lstrip("/").lower() + rest = parts[1] + return f"/{drive}{rest}" + return str(path) + + +class DockerSandbox: + """Docker sandbox backend. + + Runs commands in ephemeral Docker containers with workspace mounts, + resource limits (memory, CPU), network isolation, and timeout + management. 
+ + Attributes: + config: Docker sandbox configuration. + workspace: Absolute path to the workspace root directory. + """ + + def __init__( + self, + *, + config: DockerSandboxConfig | None = None, + workspace: Path, + ) -> None: + """Initialize the Docker sandbox. + + Args: + config: Docker sandbox configuration (defaults to standard). + workspace: Absolute path to the workspace root. Must exist. + + Raises: + ValueError: If *workspace* is not absolute or does not exist. + """ + if not workspace.is_absolute(): + msg = f"workspace must be an absolute path, got: {workspace}" + logger.warning(DOCKER_EXECUTE_FAILED, error=msg) + raise ValueError(msg) + resolved = workspace.resolve() + if not resolved.is_dir(): + msg = f"workspace directory does not exist: {resolved}" + logger.warning(DOCKER_EXECUTE_FAILED, error=msg) + raise ValueError(msg) + self._config = config or _DEFAULT_CONFIG + self._workspace = resolved + self._docker: aiodocker.Docker | None = None + self._tracked_containers: list[str] = [] + self._lock = asyncio.Lock() + + @property + def config(self) -> DockerSandboxConfig: + """Docker sandbox configuration.""" + return self._config + + @property + def workspace(self) -> Path: + """Workspace root directory.""" + return self._workspace + + async def _ensure_docker(self) -> aiodocker.Docker: + """Lazily connect to the Docker daemon. + + Serialized with ``_lock`` to prevent duplicate client creation + from concurrent calls. + + Returns: + An ``aiodocker.Docker`` client instance. + + Raises: + SandboxStartError: If the Docker daemon is unavailable. 
+ """ + async with self._lock: + if self._docker is not None: + return self._docker + client = aiodocker.Docker() + try: + await client.version() + except Exception as exc: + await client.close() + logger.exception( + DOCKER_DAEMON_UNAVAILABLE, + error=str(exc), + ) + msg = f"Docker daemon unavailable: {exc}" + raise SandboxStartError(msg) from exc + self._docker = client + return client + + def _validate_cwd(self, cwd: Path) -> None: + """Validate that *cwd* is within the workspace boundary. + + Args: + cwd: Working directory to validate. + + Raises: + SandboxError: If *cwd* is outside the workspace. + """ + try: + cwd.resolve().relative_to(self._workspace) + except ValueError as exc: + msg = f"Working directory '{cwd}' is outside workspace '{self._workspace}'" + logger.warning( + DOCKER_EXECUTE_FAILED, + error=msg, + cwd=str(cwd), + workspace=str(self._workspace), + ) + raise SandboxError(msg) from exc + + def _resolve_cwd_in_container(self, cwd: Path | None) -> str: + """Map a host cwd to a container-internal path. + + Args: + cwd: Host working directory, or ``None`` for workspace root. + + Returns: + POSIX path inside the container. + """ + if cwd is None: + return _CONTAINER_WORKSPACE + rel = cwd.resolve().relative_to(self._workspace) + return str(PurePosixPath(_CONTAINER_WORKSPACE) / rel) + + def _build_container_config( + self, + *, + command: str, + args: tuple[str, ...], + container_cwd: str, + env_overrides: Mapping[str, str] | None, + ) -> dict[str, Any]: + """Build the Docker container creation config. + + Args: + command: Executable name or path. + args: Command arguments. + container_cwd: Working directory inside the container. + env_overrides: Environment variables for the container. + + Returns: + A dict suitable for ``aiodocker`` container creation. 
+ """ + bind_path = _to_posix_bind_path(self._workspace) + mount_mode = self._config.mount_mode + bind_str = f"{bind_path}:{_CONTAINER_WORKSPACE}:{mount_mode}" + + env_list = [f"{k}={v}" for k, v in (env_overrides or {}).items()] + + memory_bytes = self._parse_memory_limit( + self._config.memory_limit, + ) + nano_cpus = int(self._config.cpu_limit * _NANO_CPUS_MULTIPLIER) + + host_config: dict[str, Any] = { + "Binds": [bind_str], + "Tmpfs": {"/tmp": "size=64m,noexec,nosuid"}, # noqa: S108 + "Memory": memory_bytes, + "NanoCpus": nano_cpus, + "NetworkMode": self._config.network, + "AutoRemove": False, + "PidsLimit": 64, + "ReadonlyRootfs": True, + "CapDrop": ["ALL"], + } + if self._config.runtime is not None: + host_config["Runtime"] = self._config.runtime + # TODO(#50): allowed_hosts is not yet enforced at runtime; + # needs iptables/nftables rules or Docker network plugin. + + return { + "Image": self._config.image, + "Cmd": [command, *args], + "WorkingDir": container_cwd, + "Env": env_list, + "HostConfig": host_config, + "AttachStdout": True, + "AttachStderr": True, + } + + @staticmethod + def _parse_memory_limit(limit: str) -> int: + """Parse a Docker memory limit string to bytes. + + Supports suffixes ``k``, ``m``, ``g`` (case-insensitive). + + Args: + limit: Memory limit string (e.g. ``"512m"``). + + Returns: + Memory limit in bytes. + + Raises: + ValueError: If the format is invalid. 
+ """ + limit_lower = limit.strip().lower() + if not limit_lower: + msg = "Memory limit must not be empty" + raise ValueError(msg) + multipliers = {"k": 1024, "m": 1024**2, "g": 1024**3} + if limit_lower[-1] in multipliers: + result = int(limit_lower[:-1]) * multipliers[limit_lower[-1]] + else: + result = int(limit_lower) + if result <= 0: + msg = f"Memory limit must be positive, got: {limit!r}" + raise ValueError(msg) + return result + + async def execute( + self, + *, + command: str, + args: tuple[str, ...], + cwd: Path | None = None, + env_overrides: Mapping[str, str] | None = None, + timeout: float | None = None, # noqa: ASYNC109 + ) -> SandboxResult: + """Execute a command inside a Docker container. + + Args: + command: Executable name or path. + args: Command arguments. + cwd: Working directory (defaults to workspace root). + env_overrides: Extra env vars (only these — no host leakage). + timeout: Seconds before the container is killed. Clamped + to ``config.timeout_seconds`` if larger. + + Returns: + A ``SandboxResult`` with captured output and exit status. + + Raises: + SandboxStartError: If the Docker daemon or image is unavailable. + SandboxError: If cwd is outside the workspace boundary. 
+ """ + work_dir = cwd if cwd is not None else self._workspace + self._validate_cwd(work_dir) + + effective_timeout = min( + timeout if timeout is not None else self._config.timeout_seconds, + self._config.timeout_seconds, + ) + container_cwd = self._resolve_cwd_in_container(cwd) + + logger.debug( + DOCKER_EXECUTE_START, + command=command, + args=args, + cwd=container_cwd, + timeout=effective_timeout, + image=self._config.image, + ) + + docker = await self._ensure_docker() + return await self._run_container( + docker=docker, + command=command, + args=args, + container_cwd=container_cwd, + env_overrides=env_overrides, + timeout=effective_timeout, + ) + + async def _run_container( # noqa: PLR0913 + self, + *, + docker: aiodocker.Docker, + command: str, + args: tuple[str, ...], + container_cwd: str, + env_overrides: Mapping[str, str] | None, + timeout: float, # noqa: ASYNC109 + ) -> SandboxResult: + """Create, start, and wait for a container. + + Args: + docker: Docker client. + command: Executable name or path. + args: Command arguments. + container_cwd: Container working directory. + env_overrides: Environment variables. + timeout: Timeout in seconds. + + Returns: + A ``SandboxResult`` with captured output and exit status. 
+ """ + config = self._build_container_config( + command=command, + args=args, + container_cwd=container_cwd, + env_overrides=env_overrides, + ) + + try: + container = await docker.containers.create(config) + except Exception as exc: + msg = f"Failed to create container: {exc}" + logger.exception( + DOCKER_EXECUTE_FAILED, + command=command, + error=msg, + ) + raise SandboxStartError(msg) from exc + + container_id = container.id + self._tracked_containers = [ + *self._tracked_containers, + container_id, + ] + logger.debug( + DOCKER_CONTAINER_CREATED, + container_id=container_id[:12], + image=self._config.image, + ) + + try: + return await self._start_and_wait( + docker=docker, + container_id=container_id, + command=command, + args=args, + timeout=timeout, + ) + finally: + await self._remove_container(docker, container_id) + self._tracked_containers = [ + c for c in self._tracked_containers if c != container_id + ] + + async def _start_and_wait( + self, + *, + docker: aiodocker.Docker, + container_id: str, + command: str, + args: tuple[str, ...], + timeout: float, # noqa: ASYNC109 + ) -> SandboxResult: + """Start a container and wait for completion or timeout. + + Args: + docker: Docker client. + container_id: Container ID. + command: Command (for logging). + args: Args (for logging). + timeout: Timeout in seconds. + + Returns: + A ``SandboxResult``. 
+ """ + container_obj = docker.containers.container(container_id) + try: + await container_obj.start() + except Exception as exc: + msg = f"Failed to start container {container_id[:12]}: {exc}" + logger.exception( + DOCKER_EXECUTE_FAILED, + container_id=container_id[:12], + error=msg, + ) + raise SandboxStartError(msg) from exc + + timed_out, returncode = await self._wait_for_exit( + docker=docker, + container_obj=container_obj, + container_id=container_id, + timeout=timeout, + ) + stdout, stderr = await self._safe_collect_logs( + container_obj, + container_id, + ) + self._log_execution_outcome( + command, + args, + container_id, + returncode, + stderr, + ) + if timed_out: + return SandboxResult( + stdout=stdout, + stderr=stderr or f"Container timed out after {timeout}s", + returncode=returncode, + timed_out=True, + ) + return SandboxResult( + stdout=stdout, + stderr=stderr, + returncode=returncode, + ) + + async def _wait_for_exit( + self, + *, + docker: aiodocker.Docker, + container_obj: aiodocker.containers.DockerContainer, + container_id: str, + timeout: float, # noqa: ASYNC109 + ) -> tuple[bool, int]: + """Wait for the container to exit or timeout. + + Returns: + Tuple of (timed_out, returncode). 
+ """ + try: + response = await asyncio.wait_for( + container_obj.wait(), + timeout=timeout, + ) + return (False, response.get("StatusCode", -1)) + except TimeoutError: + logger.warning( + DOCKER_EXECUTE_TIMEOUT, + container_id=container_id[:12], + timeout=timeout, + ) + await self._stop_container(docker, container_id) + return (True, -1) + + async def _safe_collect_logs( + self, + container_obj: aiodocker.containers.DockerContainer, + container_id: str, + ) -> tuple[str, str]: + """Collect logs, returning empty strings on failure.""" + try: + return await self._collect_logs(container_obj) + except Exception as exc: + logger.warning( + DOCKER_EXECUTE_FAILED, + container_id=container_id[:12], + error=f"Log collection failed: {exc}", + ) + return ("", "") + + @staticmethod + def _log_execution_outcome( + command: str, + args: tuple[str, ...], + container_id: str, + returncode: int, + stderr: str, + ) -> None: + """Log the execution outcome at the appropriate level.""" + max_stderr_log = 200 + if returncode != 0: + logger.warning( + DOCKER_EXECUTE_FAILED, + command=command, + args=args, + returncode=returncode, + stderr_length=len(stderr), + stderr_head=stderr[:max_stderr_log], + ) + else: + logger.debug( + DOCKER_EXECUTE_SUCCESS, + command=command, + args=args, + container_id=container_id[:12], + ) + + @staticmethod + async def _collect_logs( + container_obj: aiodocker.containers.DockerContainer, + ) -> tuple[str, str]: + """Collect stdout and stderr logs from a container. + + Args: + container_obj: Docker container object. + + Returns: + Tuple of (stdout, stderr) as strings. + """ + stdout_logs = await container_obj.log( + stdout=True, + stderr=False, + ) + stderr_logs = await container_obj.log( + stdout=False, + stderr=True, + ) + stdout = "".join(stdout_logs) + stderr = "".join(stderr_logs) + return stdout, stderr + + @staticmethod + async def _stop_container( + docker: aiodocker.Docker, + container_id: str, + ) -> None: + """Stop a running container. 
+ + Args: + docker: Docker client. + container_id: Container ID to stop. + """ + try: + container_obj = docker.containers.container(container_id) + await container_obj.stop( + t=_STOP_TIMEOUT_SECONDS, + ) + logger.debug( + DOCKER_CONTAINER_STOPPED, + container_id=container_id[:12], + ) + except Exception as exc: + logger.warning( + DOCKER_CONTAINER_STOP_FAILED, + container_id=container_id[:12], + error=str(exc), + ) + + @staticmethod + async def _remove_container( + docker: aiodocker.Docker, + container_id: str, + ) -> None: + """Remove a container, forcing removal if necessary. + + Args: + docker: Docker client. + container_id: Container ID to remove. + """ + try: + container_obj = docker.containers.container(container_id) + await container_obj.delete(force=True) + logger.debug( + DOCKER_CONTAINER_REMOVED, + container_id=container_id[:12], + ) + except Exception as exc: + logger.warning( + DOCKER_CONTAINER_REMOVE_FAILED, + container_id=container_id[:12], + error=str(exc), + ) + + async def cleanup(self) -> None: + """Stop and remove tracked containers, then close the Docker session.""" + logger.debug( + DOCKER_CLEANUP, + tracked_count=len(self._tracked_containers), + ) + if self._docker is not None: + for cid in self._tracked_containers: + await self._stop_container(self._docker, cid) + await self._remove_container(self._docker, cid) + try: + await self._docker.close() + except Exception as exc: + logger.warning( + DOCKER_CLEANUP, + error=f"Docker client close failed: {exc}", + ) + finally: + self._docker = None + self._tracked_containers = [] + + async def health_check(self) -> bool: + """Return ``True`` if the Docker daemon is reachable. + + Returns: + ``True`` if healthy, ``False`` otherwise. 
+ """ + try: + docker = await self._ensure_docker() + await docker.version() + except Exception as exc: + logger.warning( + DOCKER_HEALTH_CHECK, + healthy=False, + error=str(exc), + ) + return False + else: + logger.debug( + DOCKER_HEALTH_CHECK, + healthy=True, + ) + return True + + def get_backend_type(self) -> NotBlankStr: + """Return ``'docker'``.""" + return "docker" diff --git a/src/ai_company/tools/sandbox/errors.py b/src/ai_company/tools/sandbox/errors.py index e7162719ba..6d8d5e4fce 100644 --- a/src/ai_company/tools/sandbox/errors.py +++ b/src/ai_company/tools/sandbox/errors.py @@ -14,13 +14,11 @@ class SandboxError(ToolError): class SandboxTimeoutError(SandboxError): """Execution was killed because it exceeded the timeout. - Note: ``SubprocessSandbox`` signals timeouts via - ``SandboxResult.timed_out`` rather than raising this exception, - so callers can access partial output. This class exists for - future sandbox backends (e.g. Docker) that may raise on timeout - instead of returning a result. + Reserved for sandbox backends that need to signal timeout as an + exception rather than a result flag. Currently unused — both + subprocess and Docker return ``SandboxResult.timed_out`` instead. """ class SandboxStartError(SandboxError): - """Failed to start the sandboxed subprocess.""" + """Failed to start the sandbox execution environment.""" diff --git a/src/ai_company/tools/sandbox/protocol.py b/src/ai_company/tools/sandbox/protocol.py index 92efccc5c1..0385ba2cde 100644 --- a/src/ai_company/tools/sandbox/protocol.py +++ b/src/ai_company/tools/sandbox/protocol.py @@ -19,8 +19,7 @@ class SandboxBackend(Protocol): Implementations execute commands in an isolated environment with environment filtering, workspace enforcement, and timeout support. - Subprocess is the initial backend; Docker/K8s are planned for - future milestones. + Subprocess and Docker are built-in backends. 
""" async def execute( @@ -38,7 +37,7 @@ async def execute( command: Executable name or path. args: Command arguments. cwd: Working directory (defaults to sandbox workspace root). - env_overrides: Extra env vars applied on top of filtered env. + env_overrides: Extra environment variables for the sandbox. timeout: Seconds before the process is killed. Falls back to the backend's default timeout if ``None``. diff --git a/src/ai_company/tools/sandbox/sandboxing_config.py b/src/ai_company/tools/sandbox/sandboxing_config.py new file mode 100644 index 0000000000..b83d264ccb --- /dev/null +++ b/src/ai_company/tools/sandbox/sandboxing_config.py @@ -0,0 +1,59 @@ +"""Top-level sandboxing configuration model.""" + +from typing import Literal, Self + +from pydantic import BaseModel, ConfigDict, Field, model_validator + +from ai_company.observability import get_logger +from ai_company.tools.sandbox.config import SubprocessSandboxConfig +from ai_company.tools.sandbox.docker_config import DockerSandboxConfig + +logger = get_logger(__name__) + +_VALID_BACKENDS = frozenset({"subprocess", "docker"}) +_BackendName = Literal["subprocess", "docker"] + + +class SandboxingConfig(BaseModel): + """Top-level sandboxing configuration choosing backend per category. + + Attributes: + default_backend: Default sandbox backend for all tool categories. + overrides: Per-category backend overrides (category name to backend). + subprocess: Subprocess sandbox backend configuration. + docker: Docker sandbox backend configuration. 
+ """ + + model_config = ConfigDict(frozen=True) + + default_backend: _BackendName = "subprocess" + overrides: dict[str, _BackendName] = Field(default_factory=dict) + subprocess: SubprocessSandboxConfig = Field( + default_factory=SubprocessSandboxConfig, + ) + docker: DockerSandboxConfig = Field( + default_factory=DockerSandboxConfig, + ) + + @model_validator(mode="after") + def _validate_override_backends(self) -> Self: + """Ensure override values are valid backend names.""" + for category, backend in self.overrides.items(): + if backend not in _VALID_BACKENDS: + msg = ( + f"Invalid backend {backend!r} for category " + f"{category!r}; must be one of {sorted(_VALID_BACKENDS)}" + ) + raise ValueError(msg) + return self + + def backend_for_category(self, category: str) -> _BackendName: + """Return the backend name for a given tool category. + + Args: + category: Tool category name. + + Returns: + The backend name (``"subprocess"`` or ``"docker"``). + """ + return self.overrides.get(category, self.default_backend) diff --git a/tests/integration/tools/test_docker_sandbox_integration.py b/tests/integration/tools/test_docker_sandbox_integration.py new file mode 100644 index 0000000000..d01a98ffbd --- /dev/null +++ b/tests/integration/tools/test_docker_sandbox_integration.py @@ -0,0 +1,105 @@ +"""Integration tests for Docker sandbox with real Docker daemon. + +These tests require a running Docker daemon and the sandbox image. +They are skipped automatically if Docker is unavailable. 
+""" + +import asyncio +from typing import TYPE_CHECKING + +import pytest + +from ai_company.tools.sandbox.docker_config import DockerSandboxConfig +from ai_company.tools.sandbox.docker_sandbox import DockerSandbox + +if TYPE_CHECKING: + from pathlib import Path + +pytestmark = [pytest.mark.integration, pytest.mark.timeout(60)] + + +_TEST_IMAGE = "python:3.12-slim" + + +def _docker_and_image_available() -> bool: + """Check if Docker daemon is reachable and test image exists.""" + try: + import aiodocker + + async def _check() -> bool: + client = None + try: + client = aiodocker.Docker() + await client.version() + await client.images.inspect(_TEST_IMAGE) + except Exception: + return False + else: + return True + finally: + if client is not None: + await client.close() + + return asyncio.run(_check()) + except Exception: + return False + + +skip_no_docker = pytest.mark.skipif( + not _docker_and_image_available(), + reason=f"Docker daemon not available or {_TEST_IMAGE} not pulled", +) + + +@skip_no_docker +class TestDockerSandboxRealExecution: + """Real Docker execution tests.""" + + async def test_run_python_code(self, tmp_path: Path) -> None: + """Execute Python code in a real Docker container.""" + config = DockerSandboxConfig( + image=_TEST_IMAGE, + timeout_seconds=30, + ) + sandbox = DockerSandbox( + config=config, + workspace=tmp_path, + ) + try: + result = await sandbox.execute( + command="python3", + args=("-c", "print('hello from docker')"), + ) + assert result.success + assert "hello from docker" in result.stdout + finally: + await sandbox.cleanup() + + async def test_run_with_timeout(self, tmp_path: Path) -> None: + """Timeout kills the container.""" + config = DockerSandboxConfig( + image=_TEST_IMAGE, + timeout_seconds=120, + ) + sandbox = DockerSandbox( + config=config, + workspace=tmp_path, + ) + try: + result = await sandbox.execute( + command="sleep", + args=("60",), + timeout=2.0, + ) + assert result.timed_out + assert not result.success + finally: + 
await sandbox.cleanup() + + async def test_health_check(self, tmp_path: Path) -> None: + """Health check returns True with running daemon.""" + sandbox = DockerSandbox(workspace=tmp_path) + try: + assert await sandbox.health_check() is True + finally: + await sandbox.cleanup() diff --git a/tests/integration/tools/test_mcp_integration.py b/tests/integration/tools/test_mcp_integration.py new file mode 100644 index 0000000000..f07fe9c5c4 --- /dev/null +++ b/tests/integration/tools/test_mcp_integration.py @@ -0,0 +1,213 @@ +"""Integration test: MCP bridge full pipeline with mock server.""" + +from unittest.mock import AsyncMock, MagicMock + +import pytest +from mcp.types import TextContent + +from ai_company.tools.base import ToolExecutionResult +from ai_company.tools.mcp.bridge_tool import MCPBridgeTool +from ai_company.tools.mcp.cache import MCPResultCache +from ai_company.tools.mcp.client import MCPClient +from ai_company.tools.mcp.config import MCPServerConfig + +pytestmark = [pytest.mark.integration, pytest.mark.timeout(30)] + + +def _make_connected_client( + config: MCPServerConfig, + tools: list[MagicMock], + call_result: MagicMock, +) -> MCPClient: + """Create a mock-connected MCPClient.""" + client = MCPClient(config) + session = AsyncMock() + + list_result = MagicMock() + list_result.tools = tools + session.list_tools = AsyncMock(return_value=list_result) + session.call_tool = AsyncMock(return_value=call_result) + client._session = session + return client + + +class TestMCPBridgeFullPipeline: + """End-to-end: discover -> bridge -> execute -> map result.""" + + async def test_discover_and_execute_tool(self) -> None: + """Full pipeline: discover tool, create bridge, execute.""" + config = MCPServerConfig( + name="test-server", + transport="stdio", + command="echo", + ) + + # Mock MCP tool + mock_tool = MagicMock() + mock_tool.name = "search" + mock_tool.description = "Search documents" + mock_tool.inputSchema = { + "type": "object", + "properties": {"query": 
{"type": "string"}}, + } + + # Mock call result + call_result = MagicMock() + call_result.content = [ + TextContent(type="text", text="Found 3 results"), + ] + call_result.isError = False + call_result.structuredContent = None + + client = _make_connected_client( + config, + [mock_tool], + call_result, + ) + + # Discover + tools = await client.list_tools() + assert len(tools) == 1 + assert tools[0].name == "search" + + # Create bridge + bridge = MCPBridgeTool( + tool_info=tools[0], + client=client, + ) + assert bridge.name == "mcp_test-server_search" + + # Execute + result = await bridge.execute( + arguments={"query": "test"}, + ) + assert isinstance(result, ToolExecutionResult) + assert result.content == "Found 3 results" + assert not result.is_error + + async def test_pipeline_with_cache(self) -> None: + """Full pipeline with result caching.""" + config = MCPServerConfig( + name="cached-server", + transport="stdio", + command="echo", + ) + + mock_tool = MagicMock() + mock_tool.name = "lookup" + mock_tool.description = "Lookup" + mock_tool.inputSchema = {} + + call_result = MagicMock() + call_result.content = [ + TextContent(type="text", text="cached result"), + ] + call_result.isError = False + call_result.structuredContent = None + + client = _make_connected_client( + config, + [mock_tool], + call_result, + ) + + tools = await client.list_tools() + cache = MCPResultCache(max_size=10, ttl_seconds=60.0) + + bridge = MCPBridgeTool( + tool_info=tools[0], + client=client, + cache=cache, + ) + + # First call hits server + r1 = await bridge.execute(arguments={}) + assert r1.content == "cached result" + + # Change server response + new_result = MagicMock() + new_result.content = [ + TextContent(type="text", text="new result"), + ] + new_result.isError = False + new_result.structuredContent = None + client._session.call_tool = AsyncMock( # type: ignore[method-assign, union-attr] + return_value=new_result, + ) + + # Second call should use cache + r2 = await 
bridge.execute(arguments={}) + assert r2.content == "cached result" + + async def test_pipeline_with_error_result(self) -> None: + """Pipeline handles MCP error results correctly.""" + config = MCPServerConfig( + name="err-server", + transport="stdio", + command="echo", + ) + + mock_tool = MagicMock() + mock_tool.name = "failing" + mock_tool.description = "Might fail" + mock_tool.inputSchema = {} + + call_result = MagicMock() + call_result.content = [ + TextContent(type="text", text="Permission denied"), + ] + call_result.isError = True + call_result.structuredContent = None + + client = _make_connected_client( + config, + [mock_tool], + call_result, + ) + + tools = await client.list_tools() + bridge = MCPBridgeTool( + tool_info=tools[0], + client=client, + ) + + result = await bridge.execute(arguments={}) + assert result.is_error + assert result.content == "Permission denied" + + async def test_pipeline_with_filters(self) -> None: + """Pipeline respects enabled/disabled filters.""" + config = MCPServerConfig( + name="filter-server", + transport="stdio", + command="echo", + enabled_tools=("allowed",), + disabled_tools=(), + ) + + tool_allowed = MagicMock() + tool_allowed.name = "allowed" + tool_allowed.description = "Allowed tool" + tool_allowed.inputSchema = {} + + tool_blocked = MagicMock() + tool_blocked.name = "blocked" + tool_blocked.description = "Blocked tool" + tool_blocked.inputSchema = {} + + call_result = MagicMock() + call_result.content = [ + TextContent(type="text", text="ok"), + ] + call_result.isError = False + call_result.structuredContent = None + + client = _make_connected_client( + config, + [tool_allowed, tool_blocked], + call_result, + ) + + tools = await client.list_tools() + assert len(tools) == 1 + assert tools[0].name == "allowed" diff --git a/tests/unit/api/test_bus_bridge.py b/tests/unit/api/test_bus_bridge.py index 8cfc543edb..c690c0553d 100644 --- a/tests/unit/api/test_bus_bridge.py +++ b/tests/unit/api/test_bus_bridge.py @@ -136,6 
+136,7 @@ async def failing_subscribe(channel_name: str, subscriber_id: str) -> None: class TestPollChannel: async def test_circuit_breaker_after_max_errors(self) -> None: """Polling stops after _MAX_CONSECUTIVE_ERRORS failures.""" + from unittest.mock import patch from litestar.channels import ChannelsPlugin from litestar.channels.backends.memory import MemoryChannelsBackend @@ -167,6 +168,7 @@ async def failing_receive( channels=ALL_CHANNELS, ) bridge = MessageBusBridge(bus, plugin) - # Run _poll_channel directly - await bridge._poll_channel("tasks") + # Patch _POLL_TIMEOUT to 0 so sleeps between errors are instant + with patch("ai_company.api.bus_bridge._POLL_TIMEOUT", 0.0): + await bridge._poll_channel("tasks") assert call_count >= _MAX_CONSECUTIVE_ERRORS diff --git a/tests/unit/config/conftest.py b/tests/unit/config/conftest.py index 92ba020f57..235ac5d389 100644 --- a/tests/unit/config/conftest.py +++ b/tests/unit/config/conftest.py @@ -25,6 +25,8 @@ from ai_company.memory.config import CompanyMemoryConfig from ai_company.memory.org.config import OrgMemoryConfig from ai_company.persistence.config import PersistenceConfig +from ai_company.tools.mcp.config import MCPConfig +from ai_company.tools.sandbox.sandboxing_config import SandboxingConfig if TYPE_CHECKING: from pathlib import Path @@ -84,6 +86,8 @@ class RootConfigFactory(ModelFactory[RootConfig]): persistence = PersistenceConfig() cost_tiers = CostTiersConfig() org_memory = OrgMemoryConfig() + sandboxing = SandboxingConfig() + mcp = MCPConfig() # ── Sample YAML strings ────────────────────────────────────────── diff --git a/tests/unit/observability/test_events.py b/tests/unit/observability/test_events.py index 8e96ad5762..ebea1ff629 100644 --- a/tests/unit/observability/test_events.py +++ b/tests/unit/observability/test_events.py @@ -182,11 +182,14 @@ def test_all_domain_modules_discovered(self) -> None: "company", "config", "conflict", + "code_runner", "correlation", "decomposition", "delegation", + 
"docker", "execution", "git", + "mcp", "meeting", "memory", "parallel", diff --git a/tests/unit/tools/mcp/__init__.py b/tests/unit/tools/mcp/__init__.py new file mode 100644 index 0000000000..d8268aaa9b --- /dev/null +++ b/tests/unit/tools/mcp/__init__.py @@ -0,0 +1 @@ +"""MCP bridge unit tests.""" diff --git a/tests/unit/tools/mcp/conftest.py b/tests/unit/tools/mcp/conftest.py new file mode 100644 index 0000000000..4b95348df7 --- /dev/null +++ b/tests/unit/tools/mcp/conftest.py @@ -0,0 +1,169 @@ +"""Shared fixtures for MCP bridge unit tests.""" + +from typing import Any +from unittest.mock import AsyncMock, MagicMock + +import pytest + +from ai_company.tools.base import ToolExecutionResult +from ai_company.tools.mcp.cache import MCPResultCache +from ai_company.tools.mcp.client import MCPClient +from ai_company.tools.mcp.config import MCPConfig, MCPServerConfig +from ai_company.tools.mcp.models import MCPRawResult, MCPToolInfo + +# ── Sample configs ─────────────────────────────────────────────── + + +@pytest.fixture +def stdio_server_config() -> MCPServerConfig: + """Minimal stdio server config.""" + return MCPServerConfig( + name="test-stdio", + transport="stdio", + command="echo", + args=("hello",), + ) + + +@pytest.fixture +def http_server_config() -> MCPServerConfig: + """Minimal streamable HTTP server config.""" + return MCPServerConfig( + name="test-http", + transport="streamable_http", + url="http://localhost:8080/mcp", + ) + + +@pytest.fixture +def disabled_server_config() -> MCPServerConfig: + """Disabled server config.""" + return MCPServerConfig( + name="test-disabled", + transport="stdio", + command="noop", + enabled=False, + ) + + +@pytest.fixture +def sample_mcp_config( + stdio_server_config: MCPServerConfig, + http_server_config: MCPServerConfig, +) -> MCPConfig: + """Config with two enabled servers.""" + return MCPConfig( + servers=(stdio_server_config, http_server_config), + ) + + +# ── Sample models 
──────────────────────────────────────────────── + + +@pytest.fixture +def sample_tool_info() -> MCPToolInfo: + """Sample discovered tool metadata.""" + return MCPToolInfo( + name="test-tool", + description="A test tool", + input_schema={ + "type": "object", + "properties": {"query": {"type": "string"}}, + }, + server_name="test-server", + ) + + +@pytest.fixture +def sample_raw_result() -> MCPRawResult: + """Sample raw MCP result with no content.""" + return MCPRawResult() + + +@pytest.fixture +def sample_execution_result() -> ToolExecutionResult: + """Sample tool execution result.""" + return ToolExecutionResult(content="hello world") + + +# ── Mock MCP session ───────────────────────────────────────────── + + +def _make_mock_mcp_tool( + name: str = "mock-tool", + description: str = "A mock tool", + input_schema: dict[str, Any] | None = None, +) -> MagicMock: + """Create a mock MCP Tool object.""" + tool = MagicMock() + tool.name = name + tool.description = description + tool.inputSchema = input_schema or { + "type": "object", + "properties": {"input": {"type": "string"}}, + } + return tool + + +def _make_mock_list_tools_result( + tools: list[MagicMock] | None = None, +) -> MagicMock: + """Create a mock ListToolsResult.""" + result = MagicMock() + result.tools = tools or [_make_mock_mcp_tool()] + return result + + +def _make_mock_call_tool_result( + content: list[Any] | None = None, + is_error: bool = False, + structured_content: dict[str, Any] | None = None, +) -> MagicMock: + """Create a mock CallToolResult.""" + from mcp.types import TextContent + + result = MagicMock() + result.content = content or [ + TextContent(type="text", text="result text"), + ] + result.isError = is_error + result.structuredContent = structured_content + return result + + +@pytest.fixture +def mock_session() -> AsyncMock: + """Mock MCP ClientSession.""" + session = AsyncMock() + session.initialize = AsyncMock() + session.list_tools = AsyncMock( + 
return_value=_make_mock_list_tools_result(), + ) + session.call_tool = AsyncMock( + return_value=_make_mock_call_tool_result(), + ) + return session + + +@pytest.fixture +def mock_client( + stdio_server_config: MCPServerConfig, +) -> MCPClient: + """MCPClient with mocked internals for unit testing.""" + client = MCPClient(stdio_server_config) + # Manually set session to simulate connected state + mock_session = AsyncMock() + mock_session.list_tools = AsyncMock( + return_value=_make_mock_list_tools_result(), + ) + mock_session.call_tool = AsyncMock( + return_value=_make_mock_call_tool_result(), + ) + client._session = mock_session + return client + + +@pytest.fixture +def result_cache() -> MCPResultCache: + """Small result cache for testing.""" + return MCPResultCache(max_size=4, ttl_seconds=1.0) diff --git a/tests/unit/tools/mcp/test_bridge_tool.py b/tests/unit/tools/mcp/test_bridge_tool.py new file mode 100644 index 0000000000..ce4970c4c0 --- /dev/null +++ b/tests/unit/tools/mcp/test_bridge_tool.py @@ -0,0 +1,216 @@ +"""Tests for MCPBridgeTool.""" + +from typing import TYPE_CHECKING +from unittest.mock import AsyncMock + +import pytest + +from ai_company.core.enums import ToolCategory +from ai_company.tools.base import ToolExecutionResult +from ai_company.tools.mcp.bridge_tool import MCPBridgeTool +from ai_company.tools.mcp.client import MCPClient +from ai_company.tools.mcp.config import MCPServerConfig +from ai_company.tools.mcp.errors import MCPInvocationError +from ai_company.tools.mcp.models import MCPToolInfo + +if TYPE_CHECKING: + from ai_company.tools.mcp.cache import MCPResultCache + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +@pytest.fixture +def bridge_tool( + sample_tool_info: MCPToolInfo, + mock_client: MCPClient, +) -> MCPBridgeTool: + """Bridge tool without cache.""" + return MCPBridgeTool( + tool_info=sample_tool_info, + client=mock_client, + ) + + +@pytest.fixture +def bridge_tool_with_cache( + sample_tool_info: MCPToolInfo, + 
mock_client: MCPClient, + result_cache: MCPResultCache, +) -> MCPBridgeTool: + """Bridge tool with cache.""" + return MCPBridgeTool( + tool_info=sample_tool_info, + client=mock_client, + cache=result_cache, + ) + + +class TestBridgeToolConstruction: + """Name construction and properties.""" + + def test_name_format( + self, + bridge_tool: MCPBridgeTool, + ) -> None: + assert bridge_tool.name == "mcp_test-server_test-tool" + + def test_category_is_mcp( + self, + bridge_tool: MCPBridgeTool, + ) -> None: + assert bridge_tool.category == ToolCategory.MCP + + def test_description_from_tool_info( + self, + bridge_tool: MCPBridgeTool, + ) -> None: + assert bridge_tool.description == "A test tool" + + def test_parameters_schema_from_tool_info( + self, + bridge_tool: MCPBridgeTool, + ) -> None: + schema = bridge_tool.parameters_schema + assert schema is not None + assert "properties" in schema + + def test_tool_info_property( + self, + bridge_tool: MCPBridgeTool, + sample_tool_info: MCPToolInfo, + ) -> None: + assert bridge_tool.tool_info == sample_tool_info + + def test_empty_input_schema_yields_none_parameters(self) -> None: + tool_info = MCPToolInfo( + name="no-schema", + description="No schema", + input_schema={}, + server_name="srv", + ) + config = MCPServerConfig( + name="srv", + transport="stdio", + command="echo", + ) + client = MCPClient(config) + client._session = AsyncMock() + bridge = MCPBridgeTool( + tool_info=tool_info, + client=client, + ) + assert bridge.parameters_schema is None + + +class TestBridgeToolExecute: + """Execute delegation to client.""" + + async def test_execute_calls_client( + self, + bridge_tool: MCPBridgeTool, + ) -> None: + result = await bridge_tool.execute( + arguments={"query": "test"}, + ) + assert isinstance(result, ToolExecutionResult) + assert result.content == "result text" + + async def test_execute_returns_error_on_mcp_failure( + self, + bridge_tool: MCPBridgeTool, + mock_client: MCPClient, + ) -> None: + 
mock_client._session.call_tool.side_effect = MCPInvocationError( # type: ignore[union-attr] + "invocation error", + context={"server": "test", "tool": "test"}, + ) + result = await bridge_tool.execute(arguments={}) + assert result.is_error + assert "invocation error" in result.content + + +class TestBridgeToolWithCache: + """Cache integration.""" + + async def test_cache_miss_calls_client( + self, + bridge_tool_with_cache: MCPBridgeTool, + mock_client: MCPClient, + ) -> None: + result = await bridge_tool_with_cache.execute( + arguments={"q": "first"}, + ) + assert result.content == "result text" + mock_client._session.call_tool.assert_called_once() # type: ignore[union-attr] + + async def test_cache_hit_skips_client( + self, + bridge_tool_with_cache: MCPBridgeTool, + mock_client: MCPClient, + ) -> None: + # First call populates cache + await bridge_tool_with_cache.execute( + arguments={"q": "cached"}, + ) + mock_client._session.call_tool.reset_mock() # type: ignore[union-attr] + + # Second call should use cache + result = await bridge_tool_with_cache.execute( + arguments={"q": "cached"}, + ) + assert result.content == "result text" + mock_client._session.call_tool.assert_not_called() # type: ignore[union-attr] + + async def test_different_args_not_cached( + self, + bridge_tool_with_cache: MCPBridgeTool, + mock_client: MCPClient, + ) -> None: + await bridge_tool_with_cache.execute(arguments={"q": "a"}) + mock_client._session.call_tool.reset_mock() # type: ignore[union-attr] + + await bridge_tool_with_cache.execute(arguments={"q": "b"}) + mock_client._session.call_tool.assert_called_once() # type: ignore[union-attr] + + async def test_unhashable_arguments_bypass_cache( + self, + sample_tool_info: MCPToolInfo, + mock_client: MCPClient, + result_cache: MCPResultCache, + ) -> None: + """Unhashable args (e.g. 
custom objects) don't crash execution.""" + + class Unhashable: + __hash__ = None # type: ignore[assignment] + + bridge = MCPBridgeTool( + tool_info=sample_tool_info, + client=mock_client, + cache=result_cache, + ) + result = await bridge.execute( + arguments={"obj": Unhashable()}, + ) + assert isinstance(result, ToolExecutionResult) + assert not result.is_error + + async def test_error_results_not_cached( + self, + sample_tool_info: MCPToolInfo, + mock_client: MCPClient, + result_cache: MCPResultCache, + ) -> None: + """Error results should not be stored in the cache.""" + mock_client._session.call_tool.side_effect = MCPInvocationError( # type: ignore[union-attr] + "transient error", + context={"server": "test", "tool": "test"}, + ) + bridge = MCPBridgeTool( + tool_info=sample_tool_info, + client=mock_client, + cache=result_cache, + ) + result = await bridge.execute(arguments={"q": "fail"}) + assert result.is_error + # Cache should be empty — error not cached + assert result_cache.get("test-tool", {"q": "fail"}) is None diff --git a/tests/unit/tools/mcp/test_cache.py b/tests/unit/tools/mcp/test_cache.py new file mode 100644 index 0000000000..e8378cf0a1 --- /dev/null +++ b/tests/unit/tools/mcp/test_cache.py @@ -0,0 +1,191 @@ +"""Tests for MCP result cache.""" + +import time +from unittest.mock import patch + +import pytest + +from ai_company.tools.base import ToolExecutionResult +from ai_company.tools.mcp.cache import MCPResultCache, _make_hashable + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +class TestCacheHitMiss: + """Basic cache hit/miss behavior.""" + + def test_miss_returns_none( + self, + result_cache: MCPResultCache, + ) -> None: + assert result_cache.get("tool", {}) is None + + def test_put_then_get_returns_result( + self, + result_cache: MCPResultCache, + ) -> None: + result = ToolExecutionResult(content="cached") + result_cache.put("tool", {"key": "val"}, result) + cached = result_cache.get("tool", {"key": "val"}) + assert cached is 
not None + assert cached.content == "cached" + + def test_different_args_different_entries( + self, + result_cache: MCPResultCache, + ) -> None: + r1 = ToolExecutionResult(content="r1") + r2 = ToolExecutionResult(content="r2") + result_cache.put("tool", {"a": 1}, r1) + result_cache.put("tool", {"a": 2}, r2) + assert result_cache.get("tool", {"a": 1}).content == "r1" # type: ignore[union-attr] + assert result_cache.get("tool", {"a": 2}).content == "r2" # type: ignore[union-attr] + + def test_different_tools_different_entries( + self, + result_cache: MCPResultCache, + ) -> None: + r1 = ToolExecutionResult(content="t1") + r2 = ToolExecutionResult(content="t2") + result_cache.put("tool1", {}, r1) + result_cache.put("tool2", {}, r2) + assert result_cache.get("tool1", {}).content == "t1" # type: ignore[union-attr] + assert result_cache.get("tool2", {}).content == "t2" # type: ignore[union-attr] + + +class TestCacheTTL: + """TTL expiry behavior.""" + + def test_expired_entry_returns_none(self) -> None: + cache = MCPResultCache(max_size=10, ttl_seconds=0.5) + result = ToolExecutionResult(content="old") + cache.put("tool", {}, result) + + # Mock time to simulate expiry + original_time = time.monotonic() + with patch("ai_company.tools.mcp.cache.time") as mock_time: + mock_time.monotonic.return_value = original_time + 1.0 + assert cache.get("tool", {}) is None + + def test_fresh_entry_returns_result(self) -> None: + cache = MCPResultCache(max_size=10, ttl_seconds=60.0) + result = ToolExecutionResult(content="fresh") + cache.put("tool", {}, result) + assert cache.get("tool", {}).content == "fresh" # type: ignore[union-attr] + + +class TestCacheLRUEviction: + """LRU eviction when at capacity.""" + + def test_oldest_evicted_at_capacity(self) -> None: + cache = MCPResultCache(max_size=2, ttl_seconds=60.0) + r1 = ToolExecutionResult(content="r1") + r2 = ToolExecutionResult(content="r2") + r3 = ToolExecutionResult(content="r3") + + cache.put("t1", {}, r1) + cache.put("t2", {}, r2) 
+ # This should evict t1 (oldest) + cache.put("t3", {}, r3) + + assert cache.get("t1", {}) is None + assert cache.get("t2", {}).content == "r2" # type: ignore[union-attr] + assert cache.get("t3", {}).content == "r3" # type: ignore[union-attr] + + def test_access_refreshes_position(self) -> None: + cache = MCPResultCache(max_size=2, ttl_seconds=60.0) + r1 = ToolExecutionResult(content="r1") + r2 = ToolExecutionResult(content="r2") + r3 = ToolExecutionResult(content="r3") + + cache.put("t1", {}, r1) + cache.put("t2", {}, r2) + # Access t1 to move it to end + cache.get("t1", {}) + # This should evict t2 (now oldest) + cache.put("t3", {}, r3) + + assert cache.get("t1", {}).content == "r1" # type: ignore[union-attr] + assert cache.get("t2", {}) is None + assert cache.get("t3", {}).content == "r3" # type: ignore[union-attr] + + def test_zero_max_size_stores_nothing(self) -> None: + cache = MCPResultCache(max_size=0, ttl_seconds=60.0) + result = ToolExecutionResult(content="ignored") + cache.put("tool", {}, result) + assert cache.get("tool", {}) is None + + +class TestCacheInvalidate: + """Cache invalidation.""" + + def test_invalidate_specific_tool( + self, + result_cache: MCPResultCache, + ) -> None: + r1 = ToolExecutionResult(content="r1") + r2 = ToolExecutionResult(content="r2") + result_cache.put("tool1", {"a": 1}, r1) + result_cache.put("tool1", {"a": 2}, r1) + result_cache.put("tool2", {}, r2) + + result_cache.invalidate("tool1") + + assert result_cache.get("tool1", {"a": 1}) is None + assert result_cache.get("tool1", {"a": 2}) is None + assert result_cache.get("tool2", {}).content == "r2" # type: ignore[union-attr] + + def test_invalidate_all( + self, + result_cache: MCPResultCache, + ) -> None: + r1 = ToolExecutionResult(content="r1") + r2 = ToolExecutionResult(content="r2") + result_cache.put("tool1", {}, r1) + result_cache.put("tool2", {}, r2) + + result_cache.invalidate() + + assert result_cache.get("tool1", {}) is None + assert result_cache.get("tool2", {}) 
is None + + +class TestMakeHashable: + """Recursive hashable conversion.""" + + def test_dict_to_frozenset(self) -> None: + result = _make_hashable({"a": 1, "b": 2}) + assert isinstance(result, frozenset) + + def test_list_to_tuple(self) -> None: + result = _make_hashable([1, 2, 3]) + assert result == (1, 2, 3) + + def test_nested_dict(self) -> None: + result = _make_hashable({"outer": {"inner": "val"}}) + assert isinstance(result, frozenset) + + def test_primitive_passthrough(self) -> None: + assert _make_hashable(42) == 42 + assert _make_hashable("str") == "str" + assert _make_hashable(None) is None + + def test_mixed_nested(self) -> None: + result = _make_hashable( + {"key": [1, {"nested": True}]}, + ) + assert isinstance(result, frozenset) + # Should be hashable + hash(result) + + def test_tuple_to_tuple(self) -> None: + result = _make_hashable((1, 2, 3)) + assert result == (1, 2, 3) + + def test_empty_dict(self) -> None: + result = _make_hashable({}) + assert result == frozenset() + + def test_empty_list(self) -> None: + result = _make_hashable([]) + assert result == () diff --git a/tests/unit/tools/mcp/test_client.py b/tests/unit/tools/mcp/test_client.py new file mode 100644 index 0000000000..d1d8a7cb16 --- /dev/null +++ b/tests/unit/tools/mcp/test_client.py @@ -0,0 +1,369 @@ +"""Tests for MCPClient.""" + +import asyncio +from unittest.mock import AsyncMock, MagicMock, patch + +import pytest + +from ai_company.tools.mcp.client import MCPClient +from ai_company.tools.mcp.config import MCPServerConfig +from ai_company.tools.mcp.errors import ( + MCPConnectionError, + MCPDiscoveryError, + MCPInvocationError, + MCPTimeoutError, +) +from ai_company.tools.mcp.models import MCPToolInfo + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +class TestMCPClientConnection: + """Connection lifecycle tests.""" + + def test_not_connected_initially( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + assert not 
client.is_connected + + def test_server_name_property( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + assert client.server_name == "test-stdio" + + async def test_connect_sets_session( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + mock_session = AsyncMock() + mock_session.initialize = AsyncMock() + + with ( + patch( + "ai_company.tools.mcp.client.stdio_client", + ) as mock_stdio, + patch( + "ai_company.tools.mcp.client.ClientSession", + ) as mock_cls, + ): + mock_cm = AsyncMock() + mock_cm.__aenter__ = AsyncMock( + return_value=(AsyncMock(), AsyncMock()), + ) + mock_cm.__aexit__ = AsyncMock(return_value=False) + mock_stdio.return_value = mock_cm + + session_cm = AsyncMock() + session_cm.__aenter__ = AsyncMock( + return_value=mock_session, + ) + session_cm.__aexit__ = AsyncMock(return_value=False) + mock_cls.return_value = session_cm + + await client.connect() + + assert client.is_connected + + async def test_connect_timeout_raises_mcp_connection_error(self) -> None: + config = MCPServerConfig( + name="slow-server", + transport="stdio", + command="echo", + connect_timeout_seconds=0.01, + ) + client = MCPClient(config) + + async def hang_forever(*_a: object, **_kw: object) -> None: + await asyncio.sleep(100) + + with ( + patch.object( + client, + "_connect_with_stack", + side_effect=hang_forever, + ), + pytest.raises(MCPConnectionError, match="timed out"), + ): + await client.connect() + + async def test_connect_failure_raises_mcp_connection_error( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + with ( + patch( + "ai_company.tools.mcp.client.stdio_client", + side_effect=OSError("connection refused"), + ), + pytest.raises(MCPConnectionError, match="refused"), + ): + await client.connect() + + async def test_disconnect_clears_session( + self, + mock_client: MCPClient, + ) -> None: + assert 
mock_client.is_connected + mock_client._exit_stack = AsyncMock() + mock_client._exit_stack.aclose = AsyncMock() + await mock_client.disconnect() + assert not mock_client.is_connected + + async def test_disconnect_when_not_connected( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + # Should not raise + await client.disconnect() + assert not client.is_connected + + async def test_disconnect_clears_state_on_aclose_error( + self, + mock_client: MCPClient, + ) -> None: + mock_client._exit_stack = AsyncMock() + mock_client._exit_stack.aclose = AsyncMock( + side_effect=RuntimeError("cleanup failed"), + ) + # Should not raise — error is logged and swallowed + await mock_client.disconnect() + assert not mock_client.is_connected + assert mock_client._exit_stack is None + + def test_config_property( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + assert client.config is stdio_server_config + + +class TestMCPClientListTools: + """Tool discovery tests.""" + + async def test_list_tools_returns_tool_info( + self, + mock_client: MCPClient, + ) -> None: + tools = await mock_client.list_tools() + assert len(tools) == 1 + assert isinstance(tools[0], MCPToolInfo) + assert tools[0].name == "mock-tool" + assert tools[0].server_name == "test-stdio" + + async def test_list_tools_not_connected_raises( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + with pytest.raises(MCPConnectionError, match="Not connected"): + await client.list_tools() + + async def test_list_tools_discovery_error( + self, + mock_client: MCPClient, + ) -> None: + mock_client._session.list_tools.side_effect = RuntimeError( # type: ignore[union-attr] + "discovery failed", + ) + with pytest.raises(MCPDiscoveryError, match="discovery failed"): + await mock_client.list_tools() + + async def test_list_tools_applies_enabled_filter(self) -> None: + config = 
MCPServerConfig( + name="filtered", + transport="stdio", + command="echo", + enabled_tools=("allowed-tool",), + ) + client = MCPClient(config) + mock_session = AsyncMock() + + tool1 = MagicMock() + tool1.name = "allowed-tool" + tool1.description = "allowed" + tool1.inputSchema = {} + + tool2 = MagicMock() + tool2.name = "blocked-tool" + tool2.description = "blocked" + tool2.inputSchema = {} + + mock_result = MagicMock() + mock_result.tools = [tool1, tool2] + mock_session.list_tools = AsyncMock(return_value=mock_result) + client._session = mock_session + + tools = await client.list_tools() + assert len(tools) == 1 + assert tools[0].name == "allowed-tool" + + async def test_list_tools_applies_disabled_filter(self) -> None: + config = MCPServerConfig( + name="filtered", + transport="stdio", + command="echo", + disabled_tools=("blocked-tool",), + ) + client = MCPClient(config) + mock_session = AsyncMock() + + tool1 = MagicMock() + tool1.name = "allowed-tool" + tool1.description = "allowed" + tool1.inputSchema = {} + + tool2 = MagicMock() + tool2.name = "blocked-tool" + tool2.description = "blocked" + tool2.inputSchema = {} + + mock_result = MagicMock() + mock_result.tools = [tool1, tool2] + mock_session.list_tools = AsyncMock(return_value=mock_result) + client._session = mock_session + + tools = await client.list_tools() + assert len(tools) == 1 + assert tools[0].name == "allowed-tool" + + +class TestMCPClientCallTool: + """Tool invocation tests.""" + + async def test_call_tool_returns_raw_result( + self, + mock_client: MCPClient, + ) -> None: + result = await mock_client.call_tool("mock-tool", {"a": 1}) + assert len(result.content) == 1 + assert not result.is_error + + async def test_call_tool_not_connected_raises( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + with pytest.raises(MCPConnectionError, match="Not connected"): + await client.call_tool("tool", {}) + + async def test_call_tool_timeout_raises( + self, 
+ mock_client: MCPClient, + ) -> None: + mock_client._session.call_tool.side_effect = TimeoutError() # type: ignore[union-attr] + with pytest.raises(MCPTimeoutError, match="timed out"): + await mock_client.call_tool("slow-tool", {}) + + async def test_call_tool_error_raises( + self, + mock_client: MCPClient, + ) -> None: + mock_client._session.call_tool.side_effect = RuntimeError( # type: ignore[union-attr] + "invocation failed", + ) + with pytest.raises( + MCPInvocationError, + match="invocation failed", + ): + await mock_client.call_tool("bad-tool", {}) + + +class TestMCPClientReconnect: + """Reconnect behavior.""" + + async def test_reconnect_disconnects_then_connects( + self, + mock_client: MCPClient, + ) -> None: + mock_client._exit_stack = AsyncMock() + mock_client._exit_stack.aclose = AsyncMock() + + with ( + patch.object( + mock_client, + "disconnect", + new_callable=AsyncMock, + ) as mock_disconnect, + patch.object( + mock_client, + "connect", + new_callable=AsyncMock, + ) as mock_connect, + ): + await mock_client.reconnect() + mock_disconnect.assert_called_once() + mock_connect.assert_called_once() + + +class TestMCPClientContextManager: + """Async context manager protocol.""" + + async def test_context_manager( + self, + stdio_server_config: MCPServerConfig, + ) -> None: + client = MCPClient(stdio_server_config) + with ( + patch.object( + client, + "connect", + new_callable=AsyncMock, + ) as mock_connect, + patch.object( + client, + "disconnect", + new_callable=AsyncMock, + ) as mock_disconnect, + ): + async with client as c: + assert c is client + mock_connect.assert_called_once() + mock_disconnect.assert_called_once() + + +class TestMCPClientHTTPTransport: + """HTTP transport connection path.""" + + async def test_connect_http_sets_session(self) -> None: + config = MCPServerConfig( + name="test-http", + transport="streamable_http", + url="http://localhost:8080/mcp", + ) + client = MCPClient(config) + mock_session = AsyncMock() + mock_session.initialize 
= AsyncMock() + + with ( + patch( + "ai_company.tools.mcp.client.streamablehttp_client", + ) as mock_http, + patch( + "ai_company.tools.mcp.client.ClientSession", + ) as mock_cls, + ): + mock_cm = AsyncMock() + mock_cm.__aenter__ = AsyncMock( + return_value=(AsyncMock(), AsyncMock(), AsyncMock()), + ) + mock_cm.__aexit__ = AsyncMock(return_value=False) + mock_http.return_value = mock_cm + + session_cm = AsyncMock() + session_cm.__aenter__ = AsyncMock( + return_value=mock_session, + ) + session_cm.__aexit__ = AsyncMock(return_value=False) + mock_cls.return_value = session_cm + + await client.connect() + + assert client.is_connected diff --git a/tests/unit/tools/mcp/test_config.py b/tests/unit/tools/mcp/test_config.py new file mode 100644 index 0000000000..6bad97843b --- /dev/null +++ b/tests/unit/tools/mcp/test_config.py @@ -0,0 +1,243 @@ +"""Tests for MCP configuration models.""" + +import pytest +from pydantic import ValidationError + +from ai_company.tools.mcp.config import MCPConfig, MCPServerConfig + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +class TestMCPServerConfigStdio: + """Stdio transport validation.""" + + def test_valid_stdio(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="node", + args=("server.js",), + ) + assert cfg.name == "s1" + assert cfg.transport == "stdio" + assert cfg.command == "node" + assert cfg.args == ("server.js",) + + def test_stdio_requires_command(self) -> None: + with pytest.raises(ValidationError, match="requires 'command'"): + MCPServerConfig( + name="s1", + transport="stdio", + ) + + def test_stdio_with_env(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="node", + env={"NODE_ENV": "test"}, + ) + assert cfg.env == {"NODE_ENV": "test"} + + +class TestMCPServerConfigHTTP: + """Streamable HTTP transport validation.""" + + def test_valid_http(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="streamable_http", + 
url="http://localhost:8080/mcp", + ) + assert cfg.url == "http://localhost:8080/mcp" + + def test_http_requires_url(self) -> None: + with pytest.raises(ValidationError, match="requires 'url'"): + MCPServerConfig( + name="s1", + transport="streamable_http", + ) + + def test_http_with_headers(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="streamable_http", + url="http://localhost:8080", + headers={"Authorization": "Bearer test"}, + ) + assert cfg.headers == {"Authorization": "Bearer test"} + + +class TestMCPServerConfigToolFilters: + """Enabled/disabled tool filter validation.""" + + def test_enabled_tools_only(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="node", + enabled_tools=("tool_a", "tool_b"), + ) + assert cfg.enabled_tools == ("tool_a", "tool_b") + + def test_disabled_tools_only(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="node", + disabled_tools=("tool_c",), + ) + assert cfg.disabled_tools == ("tool_c",) + + def test_overlap_rejected(self) -> None: + with pytest.raises(ValidationError, match="overlap"): + MCPServerConfig( + name="s1", + transport="stdio", + command="node", + enabled_tools=("tool_a", "tool_b"), + disabled_tools=("tool_b",), + ) + + def test_no_overlap_allowed(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="node", + enabled_tools=("tool_a",), + disabled_tools=("tool_c",), + ) + assert cfg.enabled_tools == ("tool_a",) + assert cfg.disabled_tools == ("tool_c",) + + +class TestMCPServerConfigDefaults: + """Default values and boundaries.""" + + def test_defaults(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + ) + assert cfg.timeout_seconds == 30.0 + assert cfg.connect_timeout_seconds == 10.0 + assert cfg.result_cache_ttl_seconds == 60.0 + assert cfg.result_cache_max_size == 256 + assert cfg.enabled is True + assert cfg.enabled_tools is None + assert cfg.disabled_tools 
== () + + def test_timeout_bounds(self) -> None: + with pytest.raises(ValidationError): + MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + timeout_seconds=0, + ) + with pytest.raises(ValidationError): + MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + timeout_seconds=601, + ) + + def test_frozen(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + ) + with pytest.raises(ValidationError): + cfg.name = "changed" # type: ignore[misc] + + def test_invalid_transport(self) -> None: + with pytest.raises(ValidationError): + MCPServerConfig( + name="s1", + transport="invalid", # type: ignore[arg-type] + command="echo", + ) + + +class TestMCPConfig: + """Top-level MCP config validation.""" + + def test_empty_servers(self) -> None: + cfg = MCPConfig() + assert cfg.servers == () + + def test_single_server(self) -> None: + server = MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + ) + cfg = MCPConfig(servers=(server,)) + assert len(cfg.servers) == 1 + + def test_duplicate_server_names_rejected(self) -> None: + server1 = MCPServerConfig( + name="same", + transport="stdio", + command="echo", + ) + server2 = MCPServerConfig( + name="same", + transport="streamable_http", + url="http://localhost", + ) + with pytest.raises(ValidationError, match="Duplicate"): + MCPConfig(servers=(server1, server2)) + + def test_unique_server_names_allowed(self) -> None: + server1 = MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + ) + server2 = MCPServerConfig( + name="s2", + transport="streamable_http", + url="http://localhost", + ) + cfg = MCPConfig(servers=(server1, server2)) + assert len(cfg.servers) == 2 + + def test_frozen(self) -> None: + cfg = MCPConfig() + with pytest.raises(ValidationError): + cfg.servers = () # type: ignore[misc] + + +class TestMCPServerConfigBounds: + """Additional field boundary tests.""" + + def test_connect_timeout_exceeds_max_rejected(self) -> 
None: + with pytest.raises(ValidationError): + MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + connect_timeout_seconds=121, + ) + + def test_result_cache_ttl_negative_rejected(self) -> None: + with pytest.raises(ValidationError): + MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + result_cache_ttl_seconds=-1, + ) + + def test_result_cache_ttl_zero_accepted(self) -> None: + cfg = MCPServerConfig( + name="s1", + transport="stdio", + command="echo", + result_cache_ttl_seconds=0, + ) + assert cfg.result_cache_ttl_seconds == 0 diff --git a/tests/unit/tools/mcp/test_errors.py b/tests/unit/tools/mcp/test_errors.py new file mode 100644 index 0000000000..b82e185d9f --- /dev/null +++ b/tests/unit/tools/mcp/test_errors.py @@ -0,0 +1,88 @@ +"""Tests for the MCP error hierarchy.""" + +import pytest + +from ai_company.tools.errors import ToolError +from ai_company.tools.mcp.errors import ( + MCPConnectionError, + MCPDiscoveryError, + MCPError, + MCPInvocationError, + MCPTimeoutError, +) + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +class TestMCPErrorHierarchy: + """All MCP errors extend MCPError which extends ToolError.""" + + @pytest.mark.parametrize( + "error_cls", + [ + MCPError, + MCPConnectionError, + MCPTimeoutError, + MCPDiscoveryError, + MCPInvocationError, + ], + ) + def test_isinstance_tool_error( + self, + error_cls: type[MCPError], + ) -> None: + err = error_cls("test message") + assert isinstance(err, ToolError) + assert isinstance(err, MCPError) + + @pytest.mark.parametrize( + "error_cls", + [ + MCPConnectionError, + MCPTimeoutError, + MCPDiscoveryError, + MCPInvocationError, + ], + ) + def test_isinstance_mcp_error( + self, + error_cls: type[MCPError], + ) -> None: + err = error_cls("test") + assert isinstance(err, MCPError) + + +class TestMCPErrorContext: + """Context propagation through the error hierarchy.""" + + def test_context_propagated(self) -> None: + ctx = {"server": "test", "tool": "foo"} + 
err = MCPInvocationError("failed", context=ctx) + assert err.context["server"] == "test" + assert err.context["tool"] == "foo" + + def test_context_defaults_empty(self) -> None: + err = MCPConnectionError("conn failed") + assert len(err.context) == 0 + + def test_context_immutable(self) -> None: + err = MCPTimeoutError( + "timed out", + context={"key": "val"}, + ) + with pytest.raises(TypeError): + err.context["new_key"] = "new_val" # type: ignore[index] + + def test_message_attribute(self) -> None: + err = MCPDiscoveryError("discovery failed") + assert err.message == "discovery failed" + assert str(err) == "discovery failed" + + def test_str_with_context(self) -> None: + err = MCPError( + "base error", + context={"server": "s1"}, + ) + result = str(err) + assert "base error" in result + assert "server" in result diff --git a/tests/unit/tools/mcp/test_factory.py b/tests/unit/tools/mcp/test_factory.py new file mode 100644 index 0000000000..bf7730fc2e --- /dev/null +++ b/tests/unit/tools/mcp/test_factory.py @@ -0,0 +1,325 @@ +"""Tests for MCPToolFactory.""" + +from unittest.mock import AsyncMock, MagicMock, patch + +import pytest + +from ai_company.tools.mcp.bridge_tool import MCPBridgeTool +from ai_company.tools.mcp.client import MCPClient +from ai_company.tools.mcp.config import MCPConfig, MCPServerConfig +from ai_company.tools.mcp.factory import MCPToolFactory +from ai_company.tools.mcp.models import MCPToolInfo + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +def _make_mock_client( + server_name: str, + tools: tuple[MCPToolInfo, ...], + config: MCPServerConfig | None = None, +) -> MCPClient: + """Create an MCPClient with a mocked session; ``tools`` is currently unused.""" + if config is None: + config = MCPServerConfig( + name=server_name, + transport="stdio", + command="echo", + ) + client = MCPClient(config) + client._session = AsyncMock() + return client + + +class TestFactoryCreateTools: + """Tool discovery and creation.""" + + async def 
test_create_tools_from_single_server(self) -> None: + config = MCPConfig( + servers=( + MCPServerConfig( + name="srv1", + transport="stdio", + command="echo", + ), + ), + ) + factory = MCPToolFactory(config) + + mock_tools = ( + MCPToolInfo( + name="tool-a", + description="Tool A", + server_name="srv1", + ), + ) + + with patch.object( + MCPToolFactory, + "_connect_and_discover", + new_callable=AsyncMock, + ) as mock_cad: + mock_client = _make_mock_client("srv1", mock_tools) + mock_cad.return_value = (mock_client, mock_tools) + tools = await factory.create_tools() + + assert len(tools) == 1 + assert isinstance(tools[0], MCPBridgeTool) + assert tools[0].name == "mcp_srv1_tool-a" + + async def test_create_tools_from_multiple_servers(self) -> None: + config = MCPConfig( + servers=( + MCPServerConfig( + name="srv1", + transport="stdio", + command="echo", + ), + MCPServerConfig( + name="srv2", + transport="streamable_http", + url="http://localhost", + ), + ), + ) + factory = MCPToolFactory(config) + + tools1 = ( + MCPToolInfo( + name="tool-a", + description="A", + server_name="srv1", + ), + ) + tools2 = ( + MCPToolInfo( + name="tool-b", + description="B", + server_name="srv2", + ), + MCPToolInfo( + name="tool-c", + description="C", + server_name="srv2", + ), + ) + + call_count = 0 + + async def mock_connect_discover( + cfg: MCPServerConfig, + ) -> tuple[MCPClient, tuple[MCPToolInfo, ...]]: + nonlocal call_count + call_count += 1 + if cfg.name == "srv1": + return (_make_mock_client("srv1", tools1), tools1) + return (_make_mock_client("srv2", tools2, cfg), tools2) + + with patch.object( + MCPToolFactory, + "_connect_and_discover", + side_effect=mock_connect_discover, + ): + tools = await factory.create_tools() + + assert len(tools) == 3 + assert call_count == 2 + + async def test_skip_disabled_servers(self) -> None: + config = MCPConfig( + servers=( + MCPServerConfig( + name="enabled", + transport="stdio", + command="echo", + ), + MCPServerConfig( + name="disabled", + 
transport="stdio", + command="echo", + enabled=False, + ), + ), + ) + factory = MCPToolFactory(config) + + tools1 = ( + MCPToolInfo( + name="tool-a", + description="A", + server_name="enabled", + ), + ) + + with patch.object( + MCPToolFactory, + "_connect_and_discover", + new_callable=AsyncMock, + ) as mock_cad: + mock_client = _make_mock_client("enabled", tools1) + mock_cad.return_value = (mock_client, tools1) + tools = await factory.create_tools() + + assert len(tools) == 1 + # Only called once (disabled server skipped) + mock_cad.assert_called_once() + + async def test_empty_config_returns_empty(self) -> None: + config = MCPConfig() + factory = MCPToolFactory(config) + tools = await factory.create_tools() + assert tools == () + + +class TestFactoryShutdown: + """Client lifecycle management.""" + + async def test_shutdown_disconnects_all_clients(self) -> None: + config = MCPConfig( + servers=( + MCPServerConfig( + name="srv1", + transport="stdio", + command="echo", + ), + ), + ) + factory = MCPToolFactory(config) + + tools1 = ( + MCPToolInfo( + name="tool-a", + description="A", + server_name="srv1", + ), + ) + + mock_client = _make_mock_client("srv1", tools1) + mock_client.disconnect = AsyncMock() # type: ignore[method-assign] + + with patch.object( + MCPToolFactory, + "_connect_and_discover", + new_callable=AsyncMock, + return_value=(mock_client, tools1), + ): + await factory.create_tools() + + await factory.shutdown() + mock_client.disconnect.assert_called_once() + + async def test_shutdown_clears_client_list(self) -> None: + config = MCPConfig() + factory = MCPToolFactory(config) + factory._clients = [MagicMock()] + factory._clients[0].disconnect = AsyncMock() # type: ignore[method-assign] + + await factory.shutdown() + assert factory._clients == [] + + +class TestFactoryReuseGuard: + """Cannot call create_tools twice.""" + + async def test_create_tools_twice_raises(self) -> None: + config = MCPConfig() + factory = MCPToolFactory(config) + await 
factory.create_tools() + with pytest.raises(RuntimeError, match="must not be called more than once"): + await factory.create_tools() + + +class TestFactoryPartialFailureCleanup: + """Partial failure in TaskGroup cleans up connected clients.""" + + async def test_connected_clients_disconnected_on_partial_failure( + self, + ) -> None: + config = MCPConfig( + servers=( + MCPServerConfig( + name="ok-srv", + transport="stdio", + command="echo", + ), + MCPServerConfig( + name="bad-srv", + transport="stdio", + command="echo", + ), + ), + ) + factory = MCPToolFactory(config) + ok_client = _make_mock_client("ok-srv", ()) + ok_client.disconnect = AsyncMock() # type: ignore[method-assign] + + msg = "server down" + + async def mock_connect_discover( + cfg: MCPServerConfig, + ) -> tuple[MCPClient, tuple[MCPToolInfo, ...]]: + if cfg.name == "ok-srv": + return (ok_client, ()) + raise ConnectionError(msg) + + with ( + patch.object( + MCPToolFactory, + "_connect_and_discover", + side_effect=mock_connect_discover, + ), + pytest.raises(ExceptionGroup, match="unhandled"), + ): + await factory.create_tools() + + ok_client.disconnect.assert_called_once() + + +class TestFactoryShutdownSwallowsErrors: + """Shutdown continues when one client fails to disconnect.""" + + async def test_shutdown_continues_after_disconnect_error(self) -> None: + config = MCPConfig() + factory = MCPToolFactory(config) + + client1 = MagicMock() + client1.disconnect = AsyncMock( + side_effect=RuntimeError("disconnect broke"), + ) + client1.server_name = "client1" + client2 = MagicMock() + client2.disconnect = AsyncMock() + client2.server_name = "client2" + + factory._clients = [client1, client2] + await factory.shutdown() + + client1.disconnect.assert_called_once() + client2.disconnect.assert_called_once() + assert factory._clients == [] + + +class TestFactoryMakeCache: + """Cache creation logic.""" + + def test_make_cache_returns_none_when_disabled(self) -> None: + config = MCPServerConfig( + name="no-cache", + 
transport="stdio", + command="echo", + result_cache_max_size=0, + ) + client = MCPClient(config) + cache = MCPToolFactory._make_cache(client) + assert cache is None + + def test_make_cache_returns_cache_when_enabled(self) -> None: + config = MCPServerConfig( + name="cached", + transport="stdio", + command="echo", + result_cache_max_size=128, + result_cache_ttl_seconds=30.0, + ) + client = MCPClient(config) + cache = MCPToolFactory._make_cache(client) + assert cache is not None diff --git a/tests/unit/tools/mcp/test_result_mapper.py b/tests/unit/tools/mcp/test_result_mapper.py new file mode 100644 index 0000000000..81902af9bb --- /dev/null +++ b/tests/unit/tools/mcp/test_result_mapper.py @@ -0,0 +1,215 @@ +"""Tests for MCP result mapping (ADR-002 D18).""" + +import pytest +from mcp.types import ( + AudioContent, + EmbeddedResource, + ImageContent, + TextContent, + TextResourceContents, +) + +from ai_company.tools.mcp.models import MCPRawResult +from ai_company.tools.mcp.result_mapper import map_call_tool_result + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +class TestTextContentMapping: + """TextContent blocks map to content string.""" + + def test_single_text(self) -> None: + raw = MCPRawResult( + content=(TextContent(type="text", text="hello"),), + ) + result = map_call_tool_result(raw) + assert result.content == "hello" + assert not result.is_error + + def test_multiple_texts_joined(self) -> None: + raw = MCPRawResult( + content=( + TextContent(type="text", text="line 1"), + TextContent(type="text", text="line 2"), + ), + ) + result = map_call_tool_result(raw) + assert result.content == "line 1\nline 2" + + def test_empty_text(self) -> None: + raw = MCPRawResult( + content=(TextContent(type="text", text=""),), + ) + result = map_call_tool_result(raw) + assert result.content == "" + + +class TestImageContentMapping: + """ImageContent blocks produce placeholders and attachments.""" + + def test_image_placeholder_and_attachment(self) -> None: + raw 
= MCPRawResult( + content=( + ImageContent( + type="image", + data="base64data", + mimeType="image/png", + ), + ), + ) + result = map_call_tool_result(raw) + assert result.content == "[image: image/png]" + attachments = result.metadata["attachments"] + assert len(attachments) == 1 + assert attachments[0]["type"] == "image" + assert attachments[0]["mimeType"] == "image/png" + assert attachments[0]["data"] == "base64data" + + +class TestAudioContentMapping: + """AudioContent blocks produce placeholders and attachments.""" + + def test_audio_placeholder_and_attachment(self) -> None: + raw = MCPRawResult( + content=( + AudioContent( + type="audio", + data="audiodata", + mimeType="audio/mp3", + ), + ), + ) + result = map_call_tool_result(raw) + assert result.content == "[audio: audio/mp3]" + attachments = result.metadata["attachments"] + assert len(attachments) == 1 + assert attachments[0]["type"] == "audio" + assert attachments[0]["mimeType"] == "audio/mp3" + assert attachments[0]["data"] == "audiodata" + + +class TestEmbeddedResourceMapping: + """EmbeddedResource blocks produce resource placeholders.""" + + def test_resource_placeholder(self) -> None: + resource = TextResourceContents( + uri="file:///test.txt", # type: ignore[arg-type] + text="file content", + ) + raw = MCPRawResult( + content=( + EmbeddedResource( + type="resource", + resource=resource, + ), + ), + ) + result = map_call_tool_result(raw) + assert result.content == "[resource: file:///test.txt]" + assert "attachments" not in result.metadata + + +class TestStructuredContent: + """structuredContent maps to metadata.""" + + def test_structured_content_in_metadata(self) -> None: + raw = MCPRawResult( + content=(TextContent(type="text", text="ok"),), + structured_content={"key": "value"}, + ) + result = map_call_tool_result(raw) + assert result.metadata["structured_content"] == {"key": "value"} + + def test_no_structured_content(self) -> None: + raw = MCPRawResult( + content=(TextContent(type="text", 
text="ok"),), + ) + result = map_call_tool_result(raw) + assert "structured_content" not in result.metadata + + +class TestIsErrorMapping: + """isError maps 1:1 to is_error.""" + + def test_error_true(self) -> None: + raw = MCPRawResult( + content=(TextContent(type="text", text="err"),), + is_error=True, + ) + result = map_call_tool_result(raw) + assert result.is_error + + def test_error_false(self) -> None: + raw = MCPRawResult( + content=(TextContent(type="text", text="ok"),), + is_error=False, + ) + result = map_call_tool_result(raw) + assert not result.is_error + + +class TestEmptyContent: + """Empty content produces empty string.""" + + def test_empty_content_tuple(self) -> None: + raw = MCPRawResult(content=()) + result = map_call_tool_result(raw) + assert result.content == "" + assert not result.is_error + + +class TestUnknownContentBlock: + """Unknown content block types produce placeholders.""" + + def test_unknown_block_produces_placeholder(self) -> None: + from unittest.mock import MagicMock + + unknown = MagicMock() + type(unknown).__name__ = "MysteryBlock" + raw = MCPRawResult(content=(unknown,)) + result = map_call_tool_result(raw) + assert "[unknown: MysteryBlock]" in result.content + assert "attachments" not in result.metadata + + +class TestMixedContent: + """Mixed content types in a single result.""" + + def test_text_and_image_mixed(self) -> None: + raw = MCPRawResult( + content=( + TextContent(type="text", text="header"), + ImageContent( + type="image", + data="imgdata", + mimeType="image/jpeg", + ), + TextContent(type="text", text="footer"), + ), + ) + result = map_call_tool_result(raw) + lines = result.content.split("\n") + assert lines[0] == "header" + assert lines[1] == "[image: image/jpeg]" + assert lines[2] == "footer" + assert len(result.metadata["attachments"]) == 1 + + def test_image_and_audio_combined(self) -> None: + raw = MCPRawResult( + content=( + ImageContent( + type="image", + data="img", + mimeType="image/png", + ), + 
AudioContent( + type="audio", + data="aud", + mimeType="audio/wav", + ), + ), + ) + result = map_call_tool_result(raw) + assert "[image: image/png]" in result.content + assert "[audio: audio/wav]" in result.content + assert len(result.metadata["attachments"]) == 2 diff --git a/tests/unit/tools/sandbox/test_docker_config.py b/tests/unit/tools/sandbox/test_docker_config.py new file mode 100644 index 0000000000..6a73a750bb --- /dev/null +++ b/tests/unit/tools/sandbox/test_docker_config.py @@ -0,0 +1,115 @@ +"""Tests for DockerSandboxConfig validation.""" + +import pytest +from pydantic import ValidationError + +from ai_company.tools.sandbox.docker_config import DockerSandboxConfig + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +class TestDockerSandboxConfigDefaults: + """Default values are sensible.""" + + def test_defaults(self) -> None: + config = DockerSandboxConfig() + assert config.image == "ai-company-sandbox:latest" + assert config.network == "none" + assert config.network_overrides == {} + assert config.allowed_hosts == () + assert config.memory_limit == "512m" + assert config.cpu_limit == 1.0 + assert config.timeout_seconds == 120.0 + assert config.mount_mode == "ro" + assert config.runtime is None + + def test_frozen(self) -> None: + config = DockerSandboxConfig() + with pytest.raises(ValidationError): + config.image = "other:latest" # type: ignore[misc] + + +class TestDockerSandboxConfigCustomValues: + """Custom values are accepted within bounds.""" + + def test_custom_image(self) -> None: + config = DockerSandboxConfig(image="custom:v1") + assert config.image == "custom:v1" + + @pytest.mark.parametrize("network", ["none", "bridge", "host"]) + def test_valid_network_modes(self, network: str) -> None: + config = DockerSandboxConfig(network=network) # type: ignore[arg-type] + assert config.network == network + + def test_invalid_network_mode(self) -> None: + with pytest.raises(ValidationError): + DockerSandboxConfig(network="overlay") # type: 
ignore[arg-type] + + def test_network_overrides(self) -> None: + overrides = {"web": "bridge", "data": "none"} + config = DockerSandboxConfig(network_overrides=overrides) + assert config.network_overrides == overrides + + def test_allowed_hosts(self) -> None: + hosts = ("api.example.com:443", "db.internal:5432") + config = DockerSandboxConfig(allowed_hosts=hosts) + assert config.allowed_hosts == hosts + + @pytest.mark.parametrize("mount_mode", ["rw", "ro"]) + def test_valid_mount_modes(self, mount_mode: str) -> None: + config = DockerSandboxConfig(mount_mode=mount_mode) # type: ignore[arg-type] + assert config.mount_mode == mount_mode + + def test_invalid_mount_mode(self) -> None: + with pytest.raises(ValidationError): + DockerSandboxConfig(mount_mode="wx") # type: ignore[arg-type] + + def test_runtime_gvisor(self) -> None: + config = DockerSandboxConfig(runtime="runsc") + assert config.runtime == "runsc" + + +class TestDockerSandboxConfigBounds: + """Field bounds are enforced.""" + + def test_cpu_limit_zero_rejected(self) -> None: + with pytest.raises(ValidationError, match="cpu_limit"): + DockerSandboxConfig(cpu_limit=0) + + def test_cpu_limit_exceeds_max_rejected(self) -> None: + with pytest.raises(ValidationError, match="cpu_limit"): + DockerSandboxConfig(cpu_limit=17) + + def test_cpu_limit_at_max(self) -> None: + config = DockerSandboxConfig(cpu_limit=16) + assert config.cpu_limit == 16 + + def test_timeout_zero_rejected(self) -> None: + with pytest.raises(ValidationError, match="timeout_seconds"): + DockerSandboxConfig(timeout_seconds=0) + + def test_timeout_exceeds_max_rejected(self) -> None: + with pytest.raises(ValidationError, match="timeout_seconds"): + DockerSandboxConfig(timeout_seconds=601) + + def test_timeout_at_max(self) -> None: + config = DockerSandboxConfig(timeout_seconds=600) + assert config.timeout_seconds == 600 + + def test_blank_image_rejected(self) -> None: + with pytest.raises(ValidationError): + DockerSandboxConfig(image="") + + def 
test_whitespace_image_rejected(self) -> None: + with pytest.raises(ValidationError): + DockerSandboxConfig(image=" ") + + def test_blank_memory_limit_rejected(self) -> None: + with pytest.raises(ValidationError): + DockerSandboxConfig(memory_limit="") + + def test_invalid_network_override_value_rejected(self) -> None: + with pytest.raises(ValidationError, match="Invalid network mode"): + DockerSandboxConfig( + network_overrides={"web": "overlay"}, + ) diff --git a/tests/unit/tools/sandbox/test_docker_sandbox.py b/tests/unit/tools/sandbox/test_docker_sandbox.py new file mode 100644 index 0000000000..a5381b270d --- /dev/null +++ b/tests/unit/tools/sandbox/test_docker_sandbox.py @@ -0,0 +1,660 @@ +"""Tests for DockerSandbox with mocked aiodocker.""" + +import asyncio +from contextlib import contextmanager +from pathlib import Path, PurePosixPath +from typing import TYPE_CHECKING, Any +from unittest.mock import AsyncMock, MagicMock, patch + +if TYPE_CHECKING: + from collections.abc import Iterator + +import pytest + +from ai_company.tools.sandbox.docker_config import DockerSandboxConfig +from ai_company.tools.sandbox.docker_sandbox import ( + DockerSandbox, + _to_posix_bind_path, +) +from ai_company.tools.sandbox.errors import SandboxError, SandboxStartError + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + +_DOCKER_MODULE = "ai_company.tools.sandbox.docker_sandbox.aiodocker" + + +# ── Helpers ────────────────────────────────────────────────────── + + +def _make_mock_docker() -> MagicMock: + """Create a mock aiodocker.Docker client.""" + mock_docker = MagicMock() + mock_docker.version = AsyncMock(return_value={"ApiVersion": "1.43"}) + mock_docker.close = AsyncMock() + + # containers namespace + mock_containers = MagicMock() + mock_docker.containers = mock_containers + + # create() returns a container object with .id property + mock_created_container = MagicMock() + mock_created_container.id = "abc123def456" + mock_containers.create = AsyncMock( + 
return_value=mock_created_container, + ) + + # container object returned by .container(id) + mock_container_obj = MagicMock() + mock_container_obj.start = AsyncMock() + mock_container_obj.wait = AsyncMock( + return_value={"StatusCode": 0}, + ) + mock_container_obj.log = AsyncMock(return_value=["output line\n"]) + mock_container_obj.stop = AsyncMock() + mock_container_obj.delete = AsyncMock() + + mock_containers.container = MagicMock( + return_value=mock_container_obj, + ) + + return mock_docker + + +@contextmanager +def _patch_aiodocker( + mock_docker: MagicMock, +) -> Iterator[Any]: + """Create a patch for aiodocker.Docker that returns mock_docker.""" + mock_module = MagicMock() + mock_module.Docker = MagicMock(return_value=mock_docker) + with patch(_DOCKER_MODULE, mock_module) as p: + yield p + + +# ── Constructor ────────────────────────────────────────────────── + + +class TestDockerSandboxInit: + """Constructor validation.""" + + def test_workspace_must_be_absolute(self, tmp_path: Path) -> None: + with pytest.raises(ValueError, match="absolute path"): + DockerSandbox(workspace=Path("relative")) + + def test_workspace_must_exist(self, tmp_path: Path) -> None: + missing = tmp_path / "nonexistent" + with pytest.raises(ValueError, match="does not exist"): + DockerSandbox(workspace=missing) + + def test_valid_workspace(self, tmp_path: Path) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + assert sandbox.workspace == tmp_path.resolve() + + def test_default_config(self, tmp_path: Path) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + assert sandbox.config.image == "ai-company-sandbox:latest" + assert sandbox.config.timeout_seconds == 120.0 + + def test_custom_config(self, tmp_path: Path) -> None: + config = DockerSandboxConfig(image="custom:v1", cpu_limit=2.0) + sandbox = DockerSandbox(config=config, workspace=tmp_path) + assert sandbox.config.image == "custom:v1" + assert sandbox.config.cpu_limit == 2.0 + + +# ── CWD Validation 
────────────────────────────────────────────── + + +class TestDockerSandboxCwdValidation: + """Workspace boundary enforcement.""" + + def test_cwd_within_workspace_accepted( + self, + tmp_path: Path, + ) -> None: + subdir = tmp_path / "sub" + subdir.mkdir() + sandbox = DockerSandbox(workspace=tmp_path) + # Should not raise + sandbox._validate_cwd(subdir) + + def test_workspace_root_accepted(self, tmp_path: Path) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + sandbox._validate_cwd(tmp_path) + + def test_cwd_outside_workspace_rejected( + self, + tmp_path: Path, + ) -> None: + outside = tmp_path.parent / "outside" + outside.mkdir(exist_ok=True) + sandbox = DockerSandbox(workspace=tmp_path) + with pytest.raises(SandboxError, match="outside workspace"): + sandbox._validate_cwd(outside) + + +# ── Execute ───────────────────────────────────────────────────── + + +class TestDockerSandboxExecute: + """Execute with mocked Docker daemon.""" + + async def test_execute_success(self, tmp_path: Path) -> None: + mock_docker = _make_mock_docker() + sandbox = DockerSandbox(workspace=tmp_path) + + with _patch_aiodocker(mock_docker): + result = await sandbox.execute( + command="echo", + args=("hello",), + ) + + assert result.success + assert result.stdout == "output line\n" + assert result.returncode == 0 + assert not result.timed_out + + async def test_execute_failure(self, tmp_path: Path) -> None: + mock_docker = _make_mock_docker() + container_obj = mock_docker.containers.container() + container_obj.wait = AsyncMock( + return_value={"StatusCode": 1}, + ) + container_obj.log = AsyncMock( + return_value=["error occurred\n"], + ) + + sandbox = DockerSandbox(workspace=tmp_path) + + with _patch_aiodocker(mock_docker): + result = await sandbox.execute( + command="false", + args=(), + ) + + assert not result.success + assert result.returncode == 1 + + async def test_execute_timeout(self, tmp_path: Path) -> None: + mock_docker = _make_mock_docker() + container_obj = 
mock_docker.containers.container() + container_obj.wait = AsyncMock( + side_effect=asyncio.TimeoutError, + ) + container_obj.log = AsyncMock( + return_value=["partial output\n"], + ) + + sandbox = DockerSandbox(workspace=tmp_path) + + with _patch_aiodocker(mock_docker): + result = await sandbox.execute( + command="sleep", + args=("100",), + timeout=1.0, + ) + + assert result.timed_out + assert not result.success + container_obj.stop.assert_awaited_once() + + async def test_execute_with_env_overrides( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + sandbox = DockerSandbox(workspace=tmp_path) + + with _patch_aiodocker(mock_docker): + await sandbox.execute( + command="env", + args=(), + env_overrides={"MY_VAR": "hello"}, + ) + + create_call = mock_docker.containers.create.call_args + config = create_call[0][0] + assert "MY_VAR=hello" in config["Env"] + + async def test_execute_cwd_outside_workspace( + self, + tmp_path: Path, + ) -> None: + outside = tmp_path.parent / "escape" + outside.mkdir(exist_ok=True) + sandbox = DockerSandbox(workspace=tmp_path) + + with pytest.raises(SandboxError, match="outside workspace"): + await sandbox.execute( + command="echo", + args=("test",), + cwd=outside, + ) + + async def test_execute_custom_cwd_within_workspace( + self, + tmp_path: Path, + ) -> None: + subdir = tmp_path / "project" + subdir.mkdir() + mock_docker = _make_mock_docker() + sandbox = DockerSandbox(workspace=tmp_path) + + with _patch_aiodocker(mock_docker): + await sandbox.execute( + command="ls", + args=(), + cwd=subdir, + ) + + create_call = mock_docker.containers.create.call_args + config = create_call[0][0] + assert config["WorkingDir"] == "/workspace/project" + + async def test_docker_unavailable_raises_start_error( + self, + tmp_path: Path, + ) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + + mock_client = MagicMock() + mock_client.version = AsyncMock( + side_effect=ConnectionError("refused"), + ) + mock_client.close = 
AsyncMock() + mock_module = MagicMock() + mock_module.Docker = MagicMock(return_value=mock_client) + + with ( + patch(_DOCKER_MODULE, mock_module), + pytest.raises( + SandboxStartError, + match="Docker daemon unavailable", + ), + ): + await sandbox.execute( + command="echo", + args=("test",), + ) + + async def test_image_not_found_raises_start_error( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + mock_docker.containers.create = AsyncMock( + side_effect=Exception("image not found"), + ) + sandbox = DockerSandbox(workspace=tmp_path) + + with ( + _patch_aiodocker(mock_docker), + pytest.raises( + SandboxStartError, + match="Failed to create container", + ), + ): + await sandbox.execute( + command="echo", + args=("test",), + ) + + async def test_oom_kill_returncode_137( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + container_obj = mock_docker.containers.container() + container_obj.wait = AsyncMock( + return_value={"StatusCode": 137}, + ) + container_obj.log = AsyncMock(return_value=[""]) + + sandbox = DockerSandbox(workspace=tmp_path) + + with _patch_aiodocker(mock_docker): + result = await sandbox.execute( + command="stress", + args=("--vm", "1"), + ) + + assert result.returncode == 137 + assert not result.success + + +# ── Container Config ──────────────────────────────────────────── + + +class TestDockerSandboxContainerConfig: + """Container configuration building.""" + + def test_mount_mode_rw(self, tmp_path: Path) -> None: + config = DockerSandboxConfig(mount_mode="rw") + sandbox = DockerSandbox(config=config, workspace=tmp_path) + result = sandbox._build_container_config( + command="echo", + args=("hi",), + container_cwd="/workspace", + env_overrides=None, + ) + bind = result["HostConfig"]["Binds"][0] + assert bind.endswith(":rw") + + def test_mount_mode_ro(self, tmp_path: Path) -> None: + config = DockerSandboxConfig(mount_mode="ro") + sandbox = DockerSandbox(config=config, workspace=tmp_path) + result 
= sandbox._build_container_config( + command="echo", + args=("hi",), + container_cwd="/workspace", + env_overrides=None, + ) + bind = result["HostConfig"]["Binds"][0] + assert bind.endswith(":ro") + + def test_runtime_included_when_set(self, tmp_path: Path) -> None: + config = DockerSandboxConfig(runtime="runsc") + sandbox = DockerSandbox(config=config, workspace=tmp_path) + result = sandbox._build_container_config( + command="echo", + args=(), + container_cwd="/workspace", + env_overrides=None, + ) + assert result["HostConfig"]["Runtime"] == "runsc" + + def test_runtime_excluded_when_none( + self, + tmp_path: Path, + ) -> None: + config = DockerSandboxConfig(runtime=None) + sandbox = DockerSandbox(config=config, workspace=tmp_path) + result = sandbox._build_container_config( + command="echo", + args=(), + container_cwd="/workspace", + env_overrides=None, + ) + assert "Runtime" not in result["HostConfig"] + + def test_network_mode_set(self, tmp_path: Path) -> None: + config = DockerSandboxConfig(network="bridge") + sandbox = DockerSandbox(config=config, workspace=tmp_path) + result = sandbox._build_container_config( + command="echo", + args=(), + container_cwd="/workspace", + env_overrides=None, + ) + assert result["HostConfig"]["NetworkMode"] == "bridge" + + +# ── Cleanup ───────────────────────────────────────────────────── + + +class TestDockerSandboxCleanup: + """Cleanup and resource release.""" + + async def test_cleanup_closes_docker_session( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + sandbox = DockerSandbox(workspace=tmp_path) + sandbox._docker = mock_docker + + await sandbox.cleanup() + + mock_docker.close.assert_awaited_once() + assert sandbox._docker is None + + async def test_cleanup_without_connection( + self, + tmp_path: Path, + ) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + # Should not raise + await sandbox.cleanup() + + async def test_cleanup_stops_tracked_containers( + self, + tmp_path: Path, + ) -> 
None: + mock_docker = _make_mock_docker() + sandbox = DockerSandbox(workspace=tmp_path) + sandbox._docker = mock_docker + sandbox._tracked_containers = ["container1", "container2"] + + await sandbox.cleanup() + + container_obj = mock_docker.containers.container.return_value + assert container_obj.stop.await_count == 2 + assert container_obj.delete.await_count == 2 + assert sandbox._tracked_containers == [] + + +# ── Health check ──────────────────────────────────────────────── + + +class TestDockerSandboxHealthCheck: + """Health check behavior.""" + + async def test_health_check_success(self, tmp_path: Path) -> None: + mock_docker = _make_mock_docker() + sandbox = DockerSandbox(workspace=tmp_path) + sandbox._docker = mock_docker + + assert await sandbox.health_check() is True + + async def test_health_check_failure( + self, + tmp_path: Path, + ) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + + mock_client = MagicMock() + mock_client.version = AsyncMock( + side_effect=ConnectionError("refused"), + ) + mock_client.close = AsyncMock() + mock_module = MagicMock() + mock_module.Docker = MagicMock(return_value=mock_client) + + with patch(_DOCKER_MODULE, mock_module): + assert await sandbox.health_check() is False + + +# ── Backend type ──────────────────────────────────────────────── + + +class TestDockerSandboxBackendType: + """Backend type identifier.""" + + def test_returns_docker(self, tmp_path: Path) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + assert sandbox.get_backend_type() == "docker" + + +# ── Windows path conversion ───────────────────────────────────── + + +class TestWindowsPathConversion: + """Path conversion for Docker bind mounts.""" + + def test_unix_path_unchanged(self) -> None: + with patch( + "ai_company.tools.sandbox.docker_sandbox.platform.system", + return_value="Linux", + ): + # Use PurePosixPath to avoid Windows path normalisation + posix_path = PurePosixPath("/home/user/workspace") + result = _to_posix_bind_path(posix_path) 
# type: ignore[arg-type] + assert result == "/home/user/workspace" + + def test_windows_path_converted(self) -> None: + with patch( + "ai_company.tools.sandbox.docker_sandbox.platform.system", + return_value="Windows", + ): + win_path = Path("C:/Users/test/workspace") + result = _to_posix_bind_path(win_path) + assert result.startswith("/c/") + assert "Users" in result + assert "test" in result + + def test_windows_path_lowercase_drive(self) -> None: + with patch( + "ai_company.tools.sandbox.docker_sandbox.platform.system", + return_value="Windows", + ): + win_path = Path("D:/Projects/app") + result = _to_posix_bind_path(win_path) + assert result.startswith("/d/") + + +# ── Memory limit parsing ──────────────────────────────────────── + + +class TestMemoryLimitParsing: + """DockerSandbox._parse_memory_limit.""" + + @pytest.mark.parametrize( + ("limit", "expected"), + [ + ("512m", 512 * 1024**2), + ("1g", 1024**3), + ("256k", 256 * 1024), + ("1024", 1024), + ("2G", 2 * 1024**3), + ], + ) + def test_parse_memory_limit( + self, + limit: str, + expected: int, + ) -> None: + assert DockerSandbox._parse_memory_limit(limit) == expected + + @pytest.mark.parametrize( + "invalid_limit", + ["", " ", "abc", "512x", "0m", "-1g"], + ) + def test_parse_memory_limit_invalid( + self, + invalid_limit: str, + ) -> None: + with pytest.raises(ValueError, match=r"[Mm]emory|invalid literal"): + DockerSandbox._parse_memory_limit(invalid_limit) + + +# ── Container hardening ──────────────────────────────────────── + + +class TestDockerSandboxHardening: + """Security hardening in container config.""" + + def test_tmpfs_mount_for_tmp(self, tmp_path: Path) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + config = sandbox._build_container_config( + command="echo", + args=(), + container_cwd="/workspace", + env_overrides=None, + ) + assert "/tmp" in config["HostConfig"]["Tmpfs"] # noqa: S108 + + def test_pids_limit_set(self, tmp_path: Path) -> None: + sandbox = 
DockerSandbox(workspace=tmp_path) + config = sandbox._build_container_config( + command="echo", + args=(), + container_cwd="/workspace", + env_overrides=None, + ) + assert config["HostConfig"]["PidsLimit"] == 64 + + def test_readonly_rootfs_set(self, tmp_path: Path) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + config = sandbox._build_container_config( + command="echo", + args=(), + container_cwd="/workspace", + env_overrides=None, + ) + assert config["HostConfig"]["ReadonlyRootfs"] is True + + def test_cap_drop_all(self, tmp_path: Path) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + config = sandbox._build_container_config( + command="echo", + args=(), + container_cwd="/workspace", + env_overrides=None, + ) + assert config["HostConfig"]["CapDrop"] == ["ALL"] + + +# ── Stop/remove exception handling ───────────────────────────── + + +class TestDockerSandboxContainerErrorHandling: + """Container stop/remove error paths.""" + + async def test_stop_container_swallows_exception( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + container_obj = mock_docker.containers.container() + container_obj.stop = AsyncMock( + side_effect=RuntimeError("already stopped"), + ) + sandbox = DockerSandbox(workspace=tmp_path) + # Should not raise + await sandbox._stop_container(mock_docker, "abc123def456") + + async def test_remove_container_swallows_exception( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + container_obj = mock_docker.containers.container() + container_obj.delete = AsyncMock( + side_effect=RuntimeError("already removed"), + ) + sandbox = DockerSandbox(workspace=tmp_path) + # Should not raise + await sandbox._remove_container(mock_docker, "abc123def456") + + async def test_tracked_containers_pruned_after_execute( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + sandbox = DockerSandbox(workspace=tmp_path) + + with _patch_aiodocker(mock_docker): + await 
sandbox.execute(command="echo", args=("hi",)) + + # Container should be removed from tracking after execute + assert sandbox._tracked_containers == [] + + async def test_start_failure_raises_sandbox_start_error( + self, + tmp_path: Path, + ) -> None: + mock_docker = _make_mock_docker() + container_obj = mock_docker.containers.container() + container_obj.start = AsyncMock( + side_effect=RuntimeError("OOM at start"), + ) + sandbox = DockerSandbox(workspace=tmp_path) + + with ( + _patch_aiodocker(mock_docker), + pytest.raises( + SandboxStartError, + match="Failed to start container", + ), + ): + await sandbox.execute(command="echo", args=("test",)) diff --git a/tests/unit/tools/sandbox/test_protocol.py b/tests/unit/tools/sandbox/test_protocol.py index c224cdc0fd..534cf64dd1 100644 --- a/tests/unit/tools/sandbox/test_protocol.py +++ b/tests/unit/tools/sandbox/test_protocol.py @@ -1,11 +1,12 @@ """Tests for SandboxBackend protocol.""" from collections.abc import Mapping # noqa: TC003 — used at runtime -from pathlib import Path # noqa: TC003 — used at runtime +from pathlib import Path # noqa: TC003 — used at runtime by DockerSandbox import pytest from ai_company.core.types import NotBlankStr +from ai_company.tools.sandbox.docker_sandbox import DockerSandbox from ai_company.tools.sandbox.protocol import SandboxBackend from ai_company.tools.sandbox.result import SandboxResult from ai_company.tools.sandbox.subprocess_sandbox import SubprocessSandbox # noqa: TC001 @@ -53,5 +54,12 @@ def test_subprocess_sandbox_satisfies_protocol( ) -> None: assert isinstance(subprocess_sandbox, SandboxBackend) + def test_docker_sandbox_satisfies_protocol( + self, + tmp_path: Path, + ) -> None: + sandbox = DockerSandbox(workspace=tmp_path) + assert isinstance(sandbox, SandboxBackend) + def test_arbitrary_object_does_not_satisfy(self) -> None: assert not isinstance(object(), SandboxBackend) diff --git a/tests/unit/tools/sandbox/test_sandboxing_config.py 
b/tests/unit/tools/sandbox/test_sandboxing_config.py new file mode 100644 index 0000000000..b75f218593 --- /dev/null +++ b/tests/unit/tools/sandbox/test_sandboxing_config.py @@ -0,0 +1,98 @@ +"""Tests for SandboxingConfig validation.""" + +import pytest +from pydantic import ValidationError + +from ai_company.tools.sandbox.docker_config import DockerSandboxConfig +from ai_company.tools.sandbox.sandboxing_config import SandboxingConfig + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +class TestSandboxingConfigDefaults: + """Default values and frozen behavior.""" + + def test_defaults(self) -> None: + config = SandboxingConfig() + assert config.default_backend == "subprocess" + assert config.overrides == {} + assert config.subprocess.timeout_seconds == 30.0 + assert config.docker.image == "ai-company-sandbox:latest" + + def test_frozen(self) -> None: + config = SandboxingConfig() + with pytest.raises(ValidationError): + config.default_backend = "docker" # type: ignore[misc] + + +class TestSandboxingConfigCustomValues: + """Custom values are accepted.""" + + def test_docker_default_backend(self) -> None: + config = SandboxingConfig(default_backend="docker") + assert config.default_backend == "docker" + + def test_invalid_backend(self) -> None: + with pytest.raises(ValidationError): + SandboxingConfig(default_backend="kubernetes") # type: ignore[arg-type] + + def test_overrides(self) -> None: + overrides: dict[str, str] = { + "code_execution": "docker", + "terminal": "subprocess", + } + config = SandboxingConfig(overrides=overrides) # type: ignore[arg-type] + assert config.overrides == overrides + + def test_invalid_override_backend_rejected(self) -> None: + with pytest.raises(ValidationError, match="literal_error"): + SandboxingConfig( + overrides={"code_execution": "kubernetes"}, # type: ignore[dict-item] + ) + + def test_custom_docker_config(self) -> None: + docker = DockerSandboxConfig(image="custom:v2", cpu_limit=4.0) + config = 
SandboxingConfig(docker=docker) + assert config.docker.image == "custom:v2" + assert config.docker.cpu_limit == 4.0 + + +class TestBackendForCategory: + """backend_for_category routing logic.""" + + def test_returns_default_when_no_override(self) -> None: + config = SandboxingConfig(default_backend="subprocess") + assert config.backend_for_category("file_system") == "subprocess" + + def test_returns_override_when_present(self) -> None: + config = SandboxingConfig( + default_backend="subprocess", + overrides={"code_execution": "docker"}, + ) + assert config.backend_for_category("code_execution") == "docker" + + def test_returns_default_for_unconfigured_category(self) -> None: + config = SandboxingConfig( + default_backend="docker", + overrides={"code_execution": "subprocess"}, + ) + assert config.backend_for_category("terminal") == "docker" + + @pytest.mark.parametrize( + ("default", "override_backend", "expected"), + [ + ("subprocess", "docker", "docker"), + ("docker", "subprocess", "subprocess"), + ], + ) + def test_parametrized_routing( + self, + default: str, + override_backend: str, + expected: str, + ) -> None: + config = SandboxingConfig( + default_backend=default, # type: ignore[arg-type] + overrides={"code_execution": override_backend}, # type: ignore[dict-item] + ) + assert config.backend_for_category("code_execution") == expected diff --git a/tests/unit/tools/test_code_runner.py b/tests/unit/tools/test_code_runner.py new file mode 100644 index 0000000000..636799bc83 --- /dev/null +++ b/tests/unit/tools/test_code_runner.py @@ -0,0 +1,238 @@ +"""Tests for CodeRunnerTool with mocked sandbox.""" + +from unittest.mock import AsyncMock, MagicMock + +import pytest + +from ai_company.core.enums import ToolCategory +from ai_company.tools.code_runner import CodeRunnerTool +from ai_company.tools.sandbox.result import SandboxResult + +pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)] + + +# ── Helpers ────────────────────────────────────────────────────── + + 
+def _make_mock_sandbox( + *, + stdout: str = "output", + stderr: str = "", + returncode: int = 0, + timed_out: bool = False, +) -> MagicMock: + """Create a mock SandboxBackend with configurable result.""" + mock = MagicMock() + mock.execute = AsyncMock( + return_value=SandboxResult( + stdout=stdout, + stderr=stderr, + returncode=returncode, + timed_out=timed_out, + ), + ) + return mock + + +# ── Init ───────────────────────────────────────────────────────── + + +class TestCodeRunnerInit: + """Tool initialization.""" + + def test_name(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + assert tool.name == "code_runner" + + def test_category(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + assert tool.category == ToolCategory.CODE_EXECUTION + + def test_has_parameters_schema(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + schema = tool.parameters_schema + assert schema is not None + assert "code" in schema["properties"] + assert "language" in schema["properties"] + assert "timeout" in schema["properties"] + assert schema["required"] == ["code", "language"] + + +# ── Language mapping ──────────────────────────────────────────── + + +class TestCodeRunnerLanguageMapping: + """Each language maps to the correct command.""" + + @pytest.mark.parametrize( + ("language", "expected_cmd", "expected_flag"), + [ + ("python", "python3", "-c"), + ("javascript", "node", "-e"), + ("bash", "bash", "-c"), + ], + ) + async def test_language_command_mapping( + self, + language: str, + expected_cmd: str, + expected_flag: str, + ) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + + await tool.execute( + arguments={"code": "print('hi')", "language": language}, + ) + + sandbox.execute.assert_awaited_once() + call_kwargs = sandbox.execute.call_args.kwargs + assert call_kwargs["command"] == expected_cmd + assert call_kwargs["args"] == 
(expected_flag, "print('hi')") + + +# ── Success execution ─────────────────────────────────────────── + + +class TestCodeRunnerSuccess: + """Successful code execution.""" + + async def test_success_returns_stdout(self) -> None: + sandbox = _make_mock_sandbox(stdout="Hello, World!") + tool = CodeRunnerTool(sandbox=sandbox) + + result = await tool.execute( + arguments={"code": "print('Hello, World!')", "language": "python"}, + ) + + assert not result.is_error + assert result.content == "Hello, World!" + assert result.metadata["returncode"] == 0 + assert result.metadata["language"] == "python" + + async def test_success_empty_stdout(self) -> None: + sandbox = _make_mock_sandbox(stdout="") + tool = CodeRunnerTool(sandbox=sandbox) + + result = await tool.execute( + arguments={"code": "pass", "language": "python"}, + ) + + assert not result.is_error + assert result.content == "(no output)" + + +# ── Error execution ───────────────────────────────────────────── + + +class TestCodeRunnerErrors: + """Error handling.""" + + async def test_nonzero_returncode(self) -> None: + sandbox = _make_mock_sandbox( + stderr="SyntaxError: invalid syntax", + returncode=1, + ) + tool = CodeRunnerTool(sandbox=sandbox) + + result = await tool.execute( + arguments={"code": "invalid(", "language": "python"}, + ) + + assert result.is_error + assert "SyntaxError" in result.content + assert result.metadata["returncode"] == 1 + + async def test_timeout_result(self) -> None: + sandbox = _make_mock_sandbox( + stderr="", + returncode=-1, + timed_out=True, + ) + tool = CodeRunnerTool(sandbox=sandbox) + + result = await tool.execute( + arguments={ + "code": "while True: pass", + "language": "python", + "timeout": 5.0, + }, + ) + + assert result.is_error + assert "timed out" in result.content.lower() + assert result.metadata["timed_out"] is True + + async def test_unsupported_language(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + + result = await 
tool.execute( + arguments={"code": "puts 'hi'", "language": "ruby"}, + ) + + assert result.is_error + assert "Unsupported language" in result.content + assert "ruby" in result.content + sandbox.execute.assert_not_awaited() + + +# ── Timeout forwarding ────────────────────────────────────────── + + +class TestCodeRunnerTimeout: + """Timeout parameter forwarding to sandbox.""" + + async def test_timeout_forwarded(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + + await tool.execute( + arguments={ + "code": "time.sleep(1)", + "language": "python", + "timeout": 42.0, + }, + ) + + call_kwargs = sandbox.execute.call_args.kwargs + assert call_kwargs["timeout"] == 42.0 + + async def test_no_timeout_passed_as_none(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + + await tool.execute( + arguments={"code": "print(1)", "language": "python"}, + ) + + call_kwargs = sandbox.execute.call_args.kwargs + assert call_kwargs["timeout"] is None + + +# ── Missing code parameter ────────────────────────────────────── + + +class TestCodeRunnerMissingParams: + """Behavior with missing required parameters.""" + + async def test_missing_code_raises(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + + with pytest.raises(KeyError, match="code"): + await tool.execute( + arguments={"language": "python"}, + ) + + async def test_missing_language_raises(self) -> None: + sandbox = _make_mock_sandbox() + tool = CodeRunnerTool(sandbox=sandbox) + + with pytest.raises(KeyError, match="language"): + await tool.execute( + arguments={"code": "print(1)"}, + ) diff --git a/uv.lock b/uv.lock index 4a217e22d8..8727d32d8f 100644 --- a/uv.lock +++ b/uv.lock @@ -6,11 +6,13 @@ requires-python = ">=3.14" name = "ai-company" source = { editable = "." 
} dependencies = [ + { name = "aiodocker" }, { name = "aiosqlite" }, { name = "jinja2" }, { name = "jsonschema" }, { name = "litellm" }, { name = "litestar", extra = ["brotli", "prometheus", "pydantic", "standard", "structlog"] }, + { name = "mcp" }, { name = "pydantic" }, { name = "pyyaml" }, { name = "structlog" }, @@ -46,11 +48,13 @@ test = [ [package.metadata] requires-dist = [ + { name = "aiodocker", specifier = "==0.26.0" }, { name = "aiosqlite", specifier = "==0.21.0" }, { name = "jinja2", specifier = "==3.1.6" }, { name = "jsonschema", specifier = "==4.26.0" }, { name = "litellm", specifier = "==1.82.0" }, { name = "litestar", extras = ["brotli", "prometheus", "pydantic", "standard", "structlog"], specifier = "==2.21.1" }, + { name = "mcp", specifier = "==1.26.0" }, { name = "pydantic", specifier = "==2.12.5" }, { name = "pyyaml", specifier = "==6.0.3" }, { name = "structlog", specifier = "==25.5.0" }, @@ -84,6 +88,18 @@ test = [ { name = "respx", specifier = "==0.22.0" }, ] +[[package]] +name = "aiodocker" +version = "0.26.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "aiohttp" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/96/64/3b724f650cedc6d5c1cad7fdae3dd3b53c976d075547ff5268e2f1e56db5/aiodocker-0.26.0.tar.gz", hash = "sha256:7e4ee40c6f98e6d1cf95d712f15aef3853087c0475bc7ea5ec499c2d485ba7ec", size = 162018, upload-time = "2026-02-17T18:53:23.234Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3c/0d/e8f4164cbd950195900dd77aad23136f9ddc7feb0fb478d71839d1fda573/aiodocker-0.26.0-py3-none-any.whl", hash = "sha256:f450acf0b402a03c6a6f1517d06bdff739c404e62578b332182ca272fde6e986", size = 47825, upload-time = "2026-02-17T18:53:22.088Z" }, +] + [[package]] name = "aiohappyeyeballs" version = "2.6.1" @@ -243,6 +259,39 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/9a/3c/c17fb3ca2d9c3acff52e30b309f538586f9f5b9c9cf454f3845fc9af4881/certifi-2026.2.25-py3-none-any.whl", hash = 
"sha256:027692e4402ad994f1c42e52a4997a9763c646b73e4096e4d5d6db8af1d6f0fa", size = 153684, upload-time = "2026-02-25T02:54:15.766Z" }, ] +[[package]] +name = "cffi" +version = "2.0.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "pycparser", marker = "implementation_name != 'PyPy'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/eb/56/b1ba7935a17738ae8453301356628e8147c79dbb825bcbc73dc7401f9846/cffi-2.0.0.tar.gz", hash = "sha256:44d1b5909021139fe36001ae048dbdde8214afa20200eda0f64c068cac5d5529", size = 523588, upload-time = "2025-09-08T23:24:04.541Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/92/c4/3ce07396253a83250ee98564f8d7e9789fab8e58858f35d07a9a2c78de9f/cffi-2.0.0-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:fc33c5141b55ed366cfaad382df24fe7dcbc686de5be719b207bb248e3053dc5", size = 185320, upload-time = "2025-09-08T23:23:18.087Z" }, + { url = "https://files.pythonhosted.org/packages/59/dd/27e9fa567a23931c838c6b02d0764611c62290062a6d4e8ff7863daf9730/cffi-2.0.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:c654de545946e0db659b3400168c9ad31b5d29593291482c43e3564effbcee13", size = 181487, upload-time = "2025-09-08T23:23:19.622Z" }, + { url = "https://files.pythonhosted.org/packages/d6/43/0e822876f87ea8a4ef95442c3d766a06a51fc5298823f884ef87aaad168c/cffi-2.0.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:24b6f81f1983e6df8db3adc38562c83f7d4a0c36162885ec7f7b77c7dcbec97b", size = 220049, upload-time = "2025-09-08T23:23:20.853Z" }, + { url = "https://files.pythonhosted.org/packages/b4/89/76799151d9c2d2d1ead63c2429da9ea9d7aac304603de0c6e8764e6e8e70/cffi-2.0.0-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:12873ca6cb9b0f0d3a0da705d6086fe911591737a59f28b7936bdfed27c0d47c", size = 207793, upload-time = "2025-09-08T23:23:22.08Z" }, + { url = 
"https://files.pythonhosted.org/packages/bb/dd/3465b14bb9e24ee24cb88c9e3730f6de63111fffe513492bf8c808a3547e/cffi-2.0.0-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:d9b97165e8aed9272a6bb17c01e3cc5871a594a446ebedc996e2397a1c1ea8ef", size = 206300, upload-time = "2025-09-08T23:23:23.314Z" }, + { url = "https://files.pythonhosted.org/packages/47/d9/d83e293854571c877a92da46fdec39158f8d7e68da75bf73581225d28e90/cffi-2.0.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:afb8db5439b81cf9c9d0c80404b60c3cc9c3add93e114dcae767f1477cb53775", size = 219244, upload-time = "2025-09-08T23:23:24.541Z" }, + { url = "https://files.pythonhosted.org/packages/2b/0f/1f177e3683aead2bb00f7679a16451d302c436b5cbf2505f0ea8146ef59e/cffi-2.0.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:737fe7d37e1a1bffe70bd5754ea763a62a066dc5913ca57e957824b72a85e205", size = 222828, upload-time = "2025-09-08T23:23:26.143Z" }, + { url = "https://files.pythonhosted.org/packages/c6/0f/cafacebd4b040e3119dcb32fed8bdef8dfe94da653155f9d0b9dc660166e/cffi-2.0.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:38100abb9d1b1435bc4cc340bb4489635dc2f0da7456590877030c9b3d40b0c1", size = 220926, upload-time = "2025-09-08T23:23:27.873Z" }, + { url = "https://files.pythonhosted.org/packages/3e/aa/df335faa45b395396fcbc03de2dfcab242cd61a9900e914fe682a59170b1/cffi-2.0.0-cp314-cp314-win32.whl", hash = "sha256:087067fa8953339c723661eda6b54bc98c5625757ea62e95eb4898ad5e776e9f", size = 175328, upload-time = "2025-09-08T23:23:44.61Z" }, + { url = "https://files.pythonhosted.org/packages/bb/92/882c2d30831744296ce713f0feb4c1cd30f346ef747b530b5318715cc367/cffi-2.0.0-cp314-cp314-win_amd64.whl", hash = "sha256:203a48d1fb583fc7d78a4c6655692963b860a417c0528492a6bc21f1aaefab25", size = 185650, upload-time = "2025-09-08T23:23:45.848Z" }, + { url = 
"https://files.pythonhosted.org/packages/9f/2c/98ece204b9d35a7366b5b2c6539c350313ca13932143e79dc133ba757104/cffi-2.0.0-cp314-cp314-win_arm64.whl", hash = "sha256:dbd5c7a25a7cb98f5ca55d258b103a2054f859a46ae11aaf23134f9cc0d356ad", size = 180687, upload-time = "2025-09-08T23:23:47.105Z" }, + { url = "https://files.pythonhosted.org/packages/3e/61/c768e4d548bfa607abcda77423448df8c471f25dbe64fb2ef6d555eae006/cffi-2.0.0-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:9a67fc9e8eb39039280526379fb3a70023d77caec1852002b4da7e8b270c4dd9", size = 188773, upload-time = "2025-09-08T23:23:29.347Z" }, + { url = "https://files.pythonhosted.org/packages/2c/ea/5f76bce7cf6fcd0ab1a1058b5af899bfbef198bea4d5686da88471ea0336/cffi-2.0.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:7a66c7204d8869299919db4d5069a82f1561581af12b11b3c9f48c584eb8743d", size = 185013, upload-time = "2025-09-08T23:23:30.63Z" }, + { url = "https://files.pythonhosted.org/packages/be/b4/c56878d0d1755cf9caa54ba71e5d049479c52f9e4afc230f06822162ab2f/cffi-2.0.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7cc09976e8b56f8cebd752f7113ad07752461f48a58cbba644139015ac24954c", size = 221593, upload-time = "2025-09-08T23:23:31.91Z" }, + { url = "https://files.pythonhosted.org/packages/e0/0d/eb704606dfe8033e7128df5e90fee946bbcb64a04fcdaa97321309004000/cffi-2.0.0-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:92b68146a71df78564e4ef48af17551a5ddd142e5190cdf2c5624d0c3ff5b2e8", size = 209354, upload-time = "2025-09-08T23:23:33.214Z" }, + { url = "https://files.pythonhosted.org/packages/d8/19/3c435d727b368ca475fb8742ab97c9cb13a0de600ce86f62eab7fa3eea60/cffi-2.0.0-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:b1e74d11748e7e98e2f426ab176d4ed720a64412b6a15054378afdb71e0f37dc", size = 208480, upload-time = "2025-09-08T23:23:34.495Z" }, + { url = 
"https://files.pythonhosted.org/packages/d0/44/681604464ed9541673e486521497406fadcc15b5217c3e326b061696899a/cffi-2.0.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:28a3a209b96630bca57cce802da70c266eb08c6e97e5afd61a75611ee6c64592", size = 221584, upload-time = "2025-09-08T23:23:36.096Z" }, + { url = "https://files.pythonhosted.org/packages/25/8e/342a504ff018a2825d395d44d63a767dd8ebc927ebda557fecdaca3ac33a/cffi-2.0.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:7553fb2090d71822f02c629afe6042c299edf91ba1bf94951165613553984512", size = 224443, upload-time = "2025-09-08T23:23:37.328Z" }, + { url = "https://files.pythonhosted.org/packages/e1/5e/b666bacbbc60fbf415ba9988324a132c9a7a0448a9a8f125074671c0f2c3/cffi-2.0.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:6c6c373cfc5c83a975506110d17457138c8c63016b563cc9ed6e056a82f13ce4", size = 223437, upload-time = "2025-09-08T23:23:38.945Z" }, + { url = "https://files.pythonhosted.org/packages/a0/1d/ec1a60bd1a10daa292d3cd6bb0b359a81607154fb8165f3ec95fe003b85c/cffi-2.0.0-cp314-cp314t-win32.whl", hash = "sha256:1fc9ea04857caf665289b7a75923f2c6ed559b8298a1b8c49e59f7dd95c8481e", size = 180487, upload-time = "2025-09-08T23:23:40.423Z" }, + { url = "https://files.pythonhosted.org/packages/bf/41/4c1168c74fac325c0c8156f04b6749c8b6a8f405bbf91413ba088359f60d/cffi-2.0.0-cp314-cp314t-win_amd64.whl", hash = "sha256:d68b6cef7827e8641e8ef16f4494edda8b36104d79773a334beaa1e3521430f6", size = 191726, upload-time = "2025-09-08T23:23:41.742Z" }, + { url = "https://files.pythonhosted.org/packages/ae/3a/dbeec9d1ee0844c679f6bb5d6ad4e9f198b1224f4e7a32825f47f6192b0c/cffi-2.0.0-cp314-cp314t-win_arm64.whl", hash = "sha256:0a1527a803f0a659de1af2e1fd700213caba79377e27e4693648c2923da066f9", size = 184195, upload-time = "2025-09-08T23:23:43.004Z" }, +] + [[package]] name = "cfgv" version = "3.5.0" @@ -360,6 +409,59 @@ wheels = [ { url = 
"https://files.pythonhosted.org/packages/0d/4a/331fe2caf6799d591109bb9c08083080f6de90a823695d412a935622abb2/coverage-7.13.4-py3-none-any.whl", hash = "sha256:1af1641e57cf7ba1bd67d677c9abdbcd6cc2ab7da3bca7fa1e2b7e50e65f2ad0", size = 211242, upload-time = "2026-02-09T12:59:02.032Z" }, ] +[[package]] +name = "cryptography" +version = "46.0.5" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "cffi", marker = "platform_python_implementation != 'PyPy'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/60/04/ee2a9e8542e4fa2773b81771ff8349ff19cdd56b7258a0cc442639052edb/cryptography-46.0.5.tar.gz", hash = "sha256:abace499247268e3757271b2f1e244b36b06f8515cf27c4d49468fc9eb16e93d", size = 750064, upload-time = "2026-02-10T19:18:38.255Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/f7/81/b0bb27f2ba931a65409c6b8a8b358a7f03c0e46eceacddff55f7c84b1f3b/cryptography-46.0.5-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:351695ada9ea9618b3500b490ad54c739860883df6c1f555e088eaf25b1bbaad", size = 7176289, upload-time = "2026-02-10T19:17:08.274Z" }, + { url = "https://files.pythonhosted.org/packages/ff/9e/6b4397a3e3d15123de3b1806ef342522393d50736c13b20ec4c9ea6693a6/cryptography-46.0.5-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c18ff11e86df2e28854939acde2d003f7984f721eba450b56a200ad90eeb0e6b", size = 4275637, upload-time = "2026-02-10T19:17:10.53Z" }, + { url = "https://files.pythonhosted.org/packages/63/e7/471ab61099a3920b0c77852ea3f0ea611c9702f651600397ac567848b897/cryptography-46.0.5-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d7e3d356b8cd4ea5aff04f129d5f66ebdc7b6f8eae802b93739ed520c47c79b", size = 4424742, upload-time = "2026-02-10T19:17:12.388Z" }, + { url = "https://files.pythonhosted.org/packages/37/53/a18500f270342d66bf7e4d9f091114e31e5ee9e7375a5aba2e85a91e0044/cryptography-46.0.5-cp311-abi3-manylinux_2_28_aarch64.whl", hash = 
"sha256:50bfb6925eff619c9c023b967d5b77a54e04256c4281b0e21336a130cd7fc263", size = 4277528, upload-time = "2026-02-10T19:17:13.853Z" }, + { url = "https://files.pythonhosted.org/packages/22/29/c2e812ebc38c57b40e7c583895e73c8c5adb4d1e4a0cc4c5a4fdab2b1acc/cryptography-46.0.5-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:803812e111e75d1aa73690d2facc295eaefd4439be1023fefc4995eaea2af90d", size = 4947993, upload-time = "2026-02-10T19:17:15.618Z" }, + { url = "https://files.pythonhosted.org/packages/6b/e7/237155ae19a9023de7e30ec64e5d99a9431a567407ac21170a046d22a5a3/cryptography-46.0.5-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ee190460e2fbe447175cda91b88b84ae8322a104fc27766ad09428754a618ed", size = 4456855, upload-time = "2026-02-10T19:17:17.221Z" }, + { url = "https://files.pythonhosted.org/packages/2d/87/fc628a7ad85b81206738abbd213b07702bcbdada1dd43f72236ef3cffbb5/cryptography-46.0.5-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:f145bba11b878005c496e93e257c1e88f154d278d2638e6450d17e0f31e558d2", size = 3984635, upload-time = "2026-02-10T19:17:18.792Z" }, + { url = "https://files.pythonhosted.org/packages/84/29/65b55622bde135aedf4565dc509d99b560ee4095e56989e815f8fd2aa910/cryptography-46.0.5-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:e9251e3be159d1020c4030bd2e5f84d6a43fe54b6c19c12f51cde9542a2817b2", size = 4277038, upload-time = "2026-02-10T19:17:20.256Z" }, + { url = "https://files.pythonhosted.org/packages/bc/36/45e76c68d7311432741faf1fbf7fac8a196a0a735ca21f504c75d37e2558/cryptography-46.0.5-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:47fb8a66058b80e509c47118ef8a75d14c455e81ac369050f20ba0d23e77fee0", size = 4912181, upload-time = "2026-02-10T19:17:21.825Z" }, + { url = "https://files.pythonhosted.org/packages/6d/1a/c1ba8fead184d6e3d5afcf03d569acac5ad063f3ac9fb7258af158f7e378/cryptography-46.0.5-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:4c3341037c136030cb46e4b1e17b7418ea4cbd9dd207e4a6f3b2b24e0d4ac731", size = 
4456482, upload-time = "2026-02-10T19:17:25.133Z" }, + { url = "https://files.pythonhosted.org/packages/f9/e5/3fb22e37f66827ced3b902cf895e6a6bc1d095b5b26be26bd13c441fdf19/cryptography-46.0.5-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:890bcb4abd5a2d3f852196437129eb3667d62630333aacc13dfd470fad3aaa82", size = 4405497, upload-time = "2026-02-10T19:17:26.66Z" }, + { url = "https://files.pythonhosted.org/packages/1a/df/9d58bb32b1121a8a2f27383fabae4d63080c7ca60b9b5c88be742be04ee7/cryptography-46.0.5-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:80a8d7bfdf38f87ca30a5391c0c9ce4ed2926918e017c29ddf643d0ed2778ea1", size = 4667819, upload-time = "2026-02-10T19:17:28.569Z" }, + { url = "https://files.pythonhosted.org/packages/ea/ed/325d2a490c5e94038cdb0117da9397ece1f11201f425c4e9c57fe5b9f08b/cryptography-46.0.5-cp311-abi3-win32.whl", hash = "sha256:60ee7e19e95104d4c03871d7d7dfb3d22ef8a9b9c6778c94e1c8fcc8365afd48", size = 3028230, upload-time = "2026-02-10T19:17:30.518Z" }, + { url = "https://files.pythonhosted.org/packages/e9/5a/ac0f49e48063ab4255d9e3b79f5def51697fce1a95ea1370f03dc9db76f6/cryptography-46.0.5-cp311-abi3-win_amd64.whl", hash = "sha256:38946c54b16c885c72c4f59846be9743d699eee2b69b6988e0a00a01f46a61a4", size = 3480909, upload-time = "2026-02-10T19:17:32.083Z" }, + { url = "https://files.pythonhosted.org/packages/00/13/3d278bfa7a15a96b9dc22db5a12ad1e48a9eb3d40e1827ef66a5df75d0d0/cryptography-46.0.5-cp314-cp314t-macosx_10_9_universal2.whl", hash = "sha256:94a76daa32eb78d61339aff7952ea819b1734b46f73646a07decb40e5b3448e2", size = 7119287, upload-time = "2026-02-10T19:17:33.801Z" }, + { url = "https://files.pythonhosted.org/packages/67/c8/581a6702e14f0898a0848105cbefd20c058099e2c2d22ef4e476dfec75d7/cryptography-46.0.5-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5be7bf2fb40769e05739dd0046e7b26f9d4670badc7b032d6ce4db64dddc0678", size = 4265728, upload-time = "2026-02-10T19:17:35.569Z" }, + { url = 
"https://files.pythonhosted.org/packages/dd/4a/ba1a65ce8fc65435e5a849558379896c957870dd64fecea97b1ad5f46a37/cryptography-46.0.5-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fe346b143ff9685e40192a4960938545c699054ba11d4f9029f94751e3f71d87", size = 4408287, upload-time = "2026-02-10T19:17:36.938Z" }, + { url = "https://files.pythonhosted.org/packages/f8/67/8ffdbf7b65ed1ac224d1c2df3943553766914a8ca718747ee3871da6107e/cryptography-46.0.5-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:c69fd885df7d089548a42d5ec05be26050ebcd2283d89b3d30676eb32ff87dee", size = 4270291, upload-time = "2026-02-10T19:17:38.748Z" }, + { url = "https://files.pythonhosted.org/packages/f8/e5/f52377ee93bc2f2bba55a41a886fd208c15276ffbd2569f2ddc89d50e2c5/cryptography-46.0.5-cp314-cp314t-manylinux_2_28_ppc64le.whl", hash = "sha256:8293f3dea7fc929ef7240796ba231413afa7b68ce38fd21da2995549f5961981", size = 4927539, upload-time = "2026-02-10T19:17:40.241Z" }, + { url = "https://files.pythonhosted.org/packages/3b/02/cfe39181b02419bbbbcf3abdd16c1c5c8541f03ca8bda240debc467d5a12/cryptography-46.0.5-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:1abfdb89b41c3be0365328a410baa9df3ff8a9110fb75e7b52e66803ddabc9a9", size = 4442199, upload-time = "2026-02-10T19:17:41.789Z" }, + { url = "https://files.pythonhosted.org/packages/c0/96/2fcaeb4873e536cf71421a388a6c11b5bc846e986b2b069c79363dc1648e/cryptography-46.0.5-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:d66e421495fdb797610a08f43b05269e0a5ea7f5e652a89bfd5a7d3c1dee3648", size = 3960131, upload-time = "2026-02-10T19:17:43.379Z" }, + { url = "https://files.pythonhosted.org/packages/d8/d2/b27631f401ddd644e94c5cf33c9a4069f72011821cf3dc7309546b0642a0/cryptography-46.0.5-cp314-cp314t-manylinux_2_34_aarch64.whl", hash = "sha256:4e817a8920bfbcff8940ecfd60f23d01836408242b30f1a708d93198393a80b4", size = 4270072, upload-time = "2026-02-10T19:17:45.481Z" }, + { url = 
"https://files.pythonhosted.org/packages/f4/a7/60d32b0370dae0b4ebe55ffa10e8599a2a59935b5ece1b9f06edb73abdeb/cryptography-46.0.5-cp314-cp314t-manylinux_2_34_ppc64le.whl", hash = "sha256:68f68d13f2e1cb95163fa3b4db4bf9a159a418f5f6e7242564fc75fcae667fd0", size = 4892170, upload-time = "2026-02-10T19:17:46.997Z" }, + { url = "https://files.pythonhosted.org/packages/d2/b9/cf73ddf8ef1164330eb0b199a589103c363afa0cf794218c24d524a58eab/cryptography-46.0.5-cp314-cp314t-manylinux_2_34_x86_64.whl", hash = "sha256:a3d1fae9863299076f05cb8a778c467578262fae09f9dc0ee9b12eb4268ce663", size = 4441741, upload-time = "2026-02-10T19:17:48.661Z" }, + { url = "https://files.pythonhosted.org/packages/5f/eb/eee00b28c84c726fe8fa0158c65afe312d9c3b78d9d01daf700f1f6e37ff/cryptography-46.0.5-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:c4143987a42a2397f2fc3b4d7e3a7d313fbe684f67ff443999e803dd75a76826", size = 4396728, upload-time = "2026-02-10T19:17:50.058Z" }, + { url = "https://files.pythonhosted.org/packages/65/f4/6bc1a9ed5aef7145045114b75b77c2a8261b4d38717bd8dea111a63c3442/cryptography-46.0.5-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:7d731d4b107030987fd61a7f8ab512b25b53cef8f233a97379ede116f30eb67d", size = 4652001, upload-time = "2026-02-10T19:17:51.54Z" }, + { url = "https://files.pythonhosted.org/packages/86/ef/5d00ef966ddd71ac2e6951d278884a84a40ffbd88948ef0e294b214ae9e4/cryptography-46.0.5-cp314-cp314t-win32.whl", hash = "sha256:c3bcce8521d785d510b2aad26ae2c966092b7daa8f45dd8f44734a104dc0bc1a", size = 3003637, upload-time = "2026-02-10T19:17:52.997Z" }, + { url = "https://files.pythonhosted.org/packages/b7/57/f3f4160123da6d098db78350fdfd9705057aad21de7388eacb2401dceab9/cryptography-46.0.5-cp314-cp314t-win_amd64.whl", hash = "sha256:4d8ae8659ab18c65ced284993c2265910f6c9e650189d4e3f68445ef82a810e4", size = 3469487, upload-time = "2026-02-10T19:17:54.549Z" }, + { url = 
"https://files.pythonhosted.org/packages/e2/fa/a66aa722105ad6a458bebd64086ca2b72cdd361fed31763d20390f6f1389/cryptography-46.0.5-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:4108d4c09fbbf2789d0c926eb4152ae1760d5a2d97612b92d508d96c861e4d31", size = 7170514, upload-time = "2026-02-10T19:17:56.267Z" }, + { url = "https://files.pythonhosted.org/packages/0f/04/c85bdeab78c8bc77b701bf0d9bdcf514c044e18a46dcff330df5448631b0/cryptography-46.0.5-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7d1f30a86d2757199cb2d56e48cce14deddf1f9c95f1ef1b64ee91ea43fe2e18", size = 4275349, upload-time = "2026-02-10T19:17:58.419Z" }, + { url = "https://files.pythonhosted.org/packages/5c/32/9b87132a2f91ee7f5223b091dc963055503e9b442c98fc0b8a5ca765fab0/cryptography-46.0.5-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:039917b0dc418bb9f6edce8a906572d69e74bd330b0b3fea4f79dab7f8ddd235", size = 4420667, upload-time = "2026-02-10T19:18:00.619Z" }, + { url = "https://files.pythonhosted.org/packages/a1/a6/a7cb7010bec4b7c5692ca6f024150371b295ee1c108bdc1c400e4c44562b/cryptography-46.0.5-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:ba2a27ff02f48193fc4daeadf8ad2590516fa3d0adeeb34336b96f7fa64c1e3a", size = 4276980, upload-time = "2026-02-10T19:18:02.379Z" }, + { url = "https://files.pythonhosted.org/packages/8e/7c/c4f45e0eeff9b91e3f12dbd0e165fcf2a38847288fcfd889deea99fb7b6d/cryptography-46.0.5-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:61aa400dce22cb001a98014f647dc21cda08f7915ceb95df0c9eaf84b4b6af76", size = 4939143, upload-time = "2026-02-10T19:18:03.964Z" }, + { url = "https://files.pythonhosted.org/packages/37/19/e1b8f964a834eddb44fa1b9a9976f4e414cbb7aa62809b6760c8803d22d1/cryptography-46.0.5-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ce58ba46e1bc2aac4f7d9290223cead56743fa6ab94a5d53292ffaac6a91614", size = 4453674, upload-time = "2026-02-10T19:18:05.588Z" }, + { url = 
"https://files.pythonhosted.org/packages/db/ed/db15d3956f65264ca204625597c410d420e26530c4e2943e05a0d2f24d51/cryptography-46.0.5-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:420d0e909050490d04359e7fdb5ed7e667ca5c3c402b809ae2563d7e66a92229", size = 3978801, upload-time = "2026-02-10T19:18:07.167Z" }, + { url = "https://files.pythonhosted.org/packages/41/e2/df40a31d82df0a70a0daf69791f91dbb70e47644c58581d654879b382d11/cryptography-46.0.5-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:582f5fcd2afa31622f317f80426a027f30dc792e9c80ffee87b993200ea115f1", size = 4276755, upload-time = "2026-02-10T19:18:09.813Z" }, + { url = "https://files.pythonhosted.org/packages/33/45/726809d1176959f4a896b86907b98ff4391a8aa29c0aaaf9450a8a10630e/cryptography-46.0.5-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:bfd56bb4b37ed4f330b82402f6f435845a5f5648edf1ad497da51a8452d5d62d", size = 4901539, upload-time = "2026-02-10T19:18:11.263Z" }, + { url = "https://files.pythonhosted.org/packages/99/0f/a3076874e9c88ecb2ecc31382f6e7c21b428ede6f55aafa1aa272613e3cd/cryptography-46.0.5-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:a3d507bb6a513ca96ba84443226af944b0f7f47dcc9a399d110cd6146481d24c", size = 4452794, upload-time = "2026-02-10T19:18:12.914Z" }, + { url = "https://files.pythonhosted.org/packages/02/ef/ffeb542d3683d24194a38f66ca17c0a4b8bf10631feef44a7ef64e631b1a/cryptography-46.0.5-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9f16fbdf4da055efb21c22d81b89f155f02ba420558db21288b3d0035bafd5f4", size = 4404160, upload-time = "2026-02-10T19:18:14.375Z" }, + { url = "https://files.pythonhosted.org/packages/96/93/682d2b43c1d5f1406ed048f377c0fc9fc8f7b0447a478d5c65ab3d3a66eb/cryptography-46.0.5-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:ced80795227d70549a411a4ab66e8ce307899fad2220ce5ab2f296e687eacde9", size = 4667123, upload-time = "2026-02-10T19:18:15.886Z" }, + { url = 
"https://files.pythonhosted.org/packages/45/2d/9c5f2926cb5300a8eefc3f4f0b3f3df39db7f7ce40c8365444c49363cbda/cryptography-46.0.5-cp38-abi3-win32.whl", hash = "sha256:02f547fce831f5096c9a567fd41bc12ca8f11df260959ecc7c3202555cc47a72", size = 3010220, upload-time = "2026-02-10T19:18:17.361Z" }, + { url = "https://files.pythonhosted.org/packages/48/ef/0c2f4a8e31018a986949d34a01115dd057bf536905dca38897bacd21fac3/cryptography-46.0.5-cp38-abi3-win_amd64.whl", hash = "sha256:556e106ee01aa13484ce9b0239bca667be5004efb0aabbed28d353df86445595", size = 3467050, upload-time = "2026-02-10T19:18:18.899Z" }, +] + [[package]] name = "decli" version = "0.6.3" @@ -605,6 +707,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/2a/39/e50c7c3a983047577ee07d2a9e53faf5a69493943ec3f6a384bdc792deb2/httpx-0.28.1-py3-none-any.whl", hash = "sha256:d909fcccc110f8c7faf814ca82a9a4d816bc5a6dbfea25d6591d6985b8ba59ad", size = 73517, upload-time = "2024-12-06T15:37:21.509Z" }, ] +[[package]] +name = "httpx-sse" +version = "0.4.3" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/0f/4c/751061ffa58615a32c31b2d82e8482be8dd4a89154f003147acee90f2be9/httpx_sse-0.4.3.tar.gz", hash = "sha256:9b1ed0127459a66014aec3c56bebd93da3c1bc8bb6618c8082039a44889a755d", size = 15943, upload-time = "2025-10-10T21:48:22.271Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/d2/fd/6668e5aec43ab844de6fc74927e155a3b37bf40d7c3790e49fc0406b6578/httpx_sse-0.4.3-py3-none-any.whl", hash = "sha256:0ac1c9fe3c0afad2e0ebb25a934a59f4c7823b60792691f779fad2c5568830fc", size = 8960, upload-time = "2025-10-10T21:48:21.158Z" }, +] + [[package]] name = "huggingface-hub" version = "1.5.0" @@ -902,6 +1013,31 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/70/bc/6f1c2f612465f5fa89b95bead1f44dcb607670fd42891d8fdcd5d039f4f4/markupsafe-3.0.3-cp314-cp314t-win_arm64.whl", hash = 
"sha256:32001d6a8fc98c8cb5c947787c5d08b0a50663d139f1305bac5885d98d9b40fa", size = 14146, upload-time = "2025-09-27T18:37:28.327Z" }, ] +[[package]] +name = "mcp" +version = "1.26.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "anyio" }, + { name = "httpx" }, + { name = "httpx-sse" }, + { name = "jsonschema" }, + { name = "pydantic" }, + { name = "pydantic-settings" }, + { name = "pyjwt", extra = ["crypto"] }, + { name = "python-multipart" }, + { name = "pywin32", marker = "sys_platform == 'win32'" }, + { name = "sse-starlette" }, + { name = "starlette" }, + { name = "typing-extensions" }, + { name = "typing-inspection" }, + { name = "uvicorn", marker = "sys_platform != 'emscripten'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/fc/6d/62e76bbb8144d6ed86e202b5edd8a4cb631e7c8130f3f4893c3f90262b10/mcp-1.26.0.tar.gz", hash = "sha256:db6e2ef491eecc1a0d93711a76f28dec2e05999f93afd48795da1c1137142c66", size = 608005, upload-time = "2026-01-24T19:40:32.468Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/fd/d9/eaa1f80170d2b7c5ba23f3b59f766f3a0bb41155fbc32a69adfa1adaaef9/mcp-1.26.0-py3-none-any.whl", hash = "sha256:904a21c33c25aa98ddbeb47273033c435e595bbacfdb177f4bd87f6dceebe1ca", size = 233615, upload-time = "2026-01-24T19:40:30.652Z" }, +] + [[package]] name = "mdurl" version = "0.1.2" @@ -1185,6 +1321,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/5b/5a/bc7b4a4ef808fa59a816c17b20c4bef6884daebbdf627ff2a161da67da19/propcache-0.4.1-py3-none-any.whl", hash = "sha256:af2a6052aeb6cf17d3e46ee169099044fd8224cbaf75c76a2ef596e8163e2237", size = 13305, upload-time = "2025-10-08T19:49:00.792Z" }, ] +[[package]] +name = "pycparser" +version = "3.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/1b/7d/92392ff7815c21062bea51aa7b87d45576f649f16458d78b7cf94b9ab2e6/pycparser-3.0.tar.gz", hash = 
"sha256:600f49d217304a5902ac3c37e1281c9fe94e4d0489de643a9504c5cdfdfc6b29", size = 103492, upload-time = "2026-01-21T14:26:51.89Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/0c/c3/44f3fbbfa403ea2a7c779186dc20772604442dde72947e7d01069cbe98e3/pycparser-3.0-py3-none-any.whl", hash = "sha256:b727414169a36b7d524c1c3e31839a521725078d7b2ff038656844266160a992", size = 48172, upload-time = "2026-01-21T14:26:50.693Z" }, +] + [[package]] name = "pydantic" version = "2.12.5" @@ -1252,6 +1397,20 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/fe/17/fabd56da47096d240dd45ba627bead0333b0cf0ee8ada9bec579287dadf3/pydantic_extra_types-2.11.0-py3-none-any.whl", hash = "sha256:84b864d250a0fc62535b7ec591e36f2c5b4d1325fa0017eb8cda9aeb63b374a6", size = 74296, upload-time = "2025-12-31T16:18:26.38Z" }, ] +[[package]] +name = "pydantic-settings" +version = "2.13.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "pydantic" }, + { name = "python-dotenv" }, + { name = "typing-inspection" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/52/6d/fffca34caecc4a3f97bda81b2098da5e8ab7efc9a66e819074a11955d87e/pydantic_settings-2.13.1.tar.gz", hash = "sha256:b4c11847b15237fb0171e1462bf540e294affb9b86db4d9aa5c01730bdbe4025", size = 223826, upload-time = "2026-02-19T13:45:08.055Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/00/4b/ccc026168948fec4f7555b9164c724cf4125eac006e176541483d2c959be/pydantic_settings-2.13.1-py3-none-any.whl", hash = "sha256:d56fd801823dbeae7f0975e1f8c8e25c258eb75d278ea7abb5d9cebb01b56237", size = 58929, upload-time = "2026-02-19T13:45:06.034Z" }, +] + [[package]] name = "pygments" version = "2.19.2" @@ -1261,6 +1420,20 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = "sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217, 
upload-time = "2025-06-21T13:39:07.939Z" }, ] +[[package]] +name = "pyjwt" +version = "2.11.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/5c/5a/b46fa56bf322901eee5b0454a34343cdbdae202cd421775a8ee4e42fd519/pyjwt-2.11.0.tar.gz", hash = "sha256:35f95c1f0fbe5d5ba6e43f00271c275f7a1a4db1dab27bf708073b75318ea623", size = 98019, upload-time = "2026-01-30T19:59:55.694Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/6f/01/c26ce75ba460d5cd503da9e13b21a33804d38c2165dec7b716d06b13010c/pyjwt-2.11.0-py3-none-any.whl", hash = "sha256:94a6bde30eb5c8e04fee991062b534071fd1439ef58d2adc9ccb823e7bcd0469", size = 28224, upload-time = "2026-01-30T19:59:54.539Z" }, +] + +[package.optional-dependencies] +crypto = [ + { name = "cryptography" }, +] + [[package]] name = "pytest" version = "9.0.2" @@ -1362,6 +1535,25 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/14/1b/a298b06749107c305e1fe0f814c6c74aea7b2f1e10989cb30f544a1b3253/python_dotenv-1.2.1-py3-none-any.whl", hash = "sha256:b81ee9561e9ca4004139c6cbba3a238c32b03e4894671e181b671e8cb8425d61", size = 21230, upload-time = "2025-10-26T15:12:09.109Z" }, ] +[[package]] +name = "python-multipart" +version = "0.0.22" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/94/01/979e98d542a70714b0cb2b6728ed0b7c46792b695e3eaec3e20711271ca3/python_multipart-0.0.22.tar.gz", hash = "sha256:7340bef99a7e0032613f56dc36027b959fd3b30a787ed62d310e951f7c3a3a58", size = 37612, upload-time = "2026-01-25T10:15:56.219Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/1b/d0/397f9626e711ff749a95d96b7af99b9c566a9bb5129b8e4c10fc4d100304/python_multipart-0.0.22-py3-none-any.whl", hash = "sha256:2b2cd894c83d21bf49d702499531c7bafd057d730c201782048f7945d82de155", size = 24579, upload-time = "2026-01-25T10:15:54.811Z" }, +] + +[[package]] +name = "pywin32" +version = "311" +source = { registry = 
"https://pypi.org/simple" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/c9/31/097f2e132c4f16d99a22bfb777e0fd88bd8e1c634304e102f313af69ace5/pywin32-311-cp314-cp314-win32.whl", hash = "sha256:b7a2c10b93f8986666d0c803ee19b5990885872a7de910fc460f9b0c2fbf92ee", size = 8840714, upload-time = "2025-07-14T20:13:32.449Z" }, + { url = "https://files.pythonhosted.org/packages/90/4b/07c77d8ba0e01349358082713400435347df8426208171ce297da32c313d/pywin32-311-cp314-cp314-win_amd64.whl", hash = "sha256:3aca44c046bd2ed8c90de9cb8427f581c479e594e99b5c0bb19b29c10fd6cb87", size = 9656800, upload-time = "2025-07-14T20:13:34.312Z" }, + { url = "https://files.pythonhosted.org/packages/c0/d2/21af5c535501a7233e734b8af901574572da66fcc254cb35d0609c9080dd/pywin32-311-cp314-cp314-win_arm64.whl", hash = "sha256:a508e2d9025764a8270f93111a970e1d0fbfc33f4153b388bb649b7eec4f9b42", size = 8932540, upload-time = "2025-07-14T20:13:36.379Z" }, +] + [[package]] name = "pyyaml" version = "6.0.3" @@ -1596,6 +1788,31 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" }, ] +[[package]] +name = "sse-starlette" +version = "3.3.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "anyio" }, + { name = "starlette" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/5a/9f/c3695c2d2d4ef70072c3a06992850498b01c6bc9be531950813716b426fa/sse_starlette-3.3.2.tar.gz", hash = "sha256:678fca55a1945c734d8472a6cad186a55ab02840b4f6786f5ee8770970579dcd", size = 32326, upload-time = "2026-02-28T11:24:34.36Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/61/28/8cb142d3fe80c4a2d8af54ca0b003f47ce0ba920974e7990fa6e016402d1/sse_starlette-3.3.2-py3-none-any.whl", hash = 
"sha256:5c3ea3dad425c601236726af2f27689b74494643f57017cafcb6f8c9acfbb862", size = 14270, upload-time = "2026-02-28T11:24:32.984Z" }, +] + +[[package]] +name = "starlette" +version = "0.52.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "anyio" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/c4/68/79977123bb7be889ad680d79a40f339082c1978b5cfcf62c2d8d196873ac/starlette-0.52.1.tar.gz", hash = "sha256:834edd1b0a23167694292e94f597773bc3f89f362be6effee198165a35d62933", size = 2653702, upload-time = "2026-01-18T13:34:11.062Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/81/0d/13d1d239a25cbfb19e740db83143e95c772a1fe10202dda4b76792b114dd/starlette-0.52.1-py3-none-any.whl", hash = "sha256:0029d43eb3d273bc4f83a08720b4912ea4b071087a3b48db01b7c839f7954d74", size = 74272, upload-time = "2026-01-18T13:34:09.188Z" }, +] + [[package]] name = "structlog" version = "25.5.0"