diff --git a/CLAUDE.md b/CLAUDE.md
index bc346a3c8c..77aa783ee5 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -66,7 +66,7 @@ src/ai_company/
 - **PEP 758 except syntax**: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
 - **Type hints**: all public functions, mypy strict mode
 - **Docstrings**: Google style, required on public classes/functions (enforced by ruff D rules)
-- **Immutability**: create new objects, never mutate existing ones. For `dict`/`list` fields in frozen Pydantic models, use `MappingProxyType` wrapping at construction (not `deepcopy` on access). Deep-copy only at system boundaries (e.g. passing data to `tool.execute()`, serializing for persistence).
+- **Immutability**: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement. For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
 - **Config vs runtime state**: frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves (e.g. agent execution state, task progress). Never mix static config fields with mutable runtime fields in one model.
 - **Models**: Pydantic v2 (`BaseModel`, `model_validator`, `ConfigDict`). Planned conventions for new code: use `@computed_field` for derived values instead of storing + validating redundant fields; use `NotBlankStr` (from `core.types`) for non-optional identifier/name fields instead of manual whitespace validators. Existing models are being migrated incrementally.
 - **Async concurrency**: prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
diff --git a/DESIGN_SPEC.md b/DESIGN_SPEC.md
index 7b0dc7f0da..2e050fcf1e 100644
--- a/DESIGN_SPEC.md
+++ b/DESIGN_SPEC.md
@@ -915,7 +915,7 @@ budget:
 
 When the LLM requests multiple tool calls in a single turn, `ToolInvoker.invoke_all` currently executes them **sequentially**. Migration to `asyncio.TaskGroup` for parallel structured concurrency is planned (see §15.5). Recoverable errors are captured as `ToolResult(is_error=True)` without aborting remaining invocations; non-recoverable errors (`MemoryError`, `RecursionError`) propagate immediately and abort the sequence.
 
-Tool parameter schemas (`parameters_schema`) are currently exposed via `deepcopy` on each property access (construction also deep-copies). `MappingProxyType` wrapping is used in the `ToolRegistry` for its internal collections. Migrating `BaseTool.parameters_schema` to `MappingProxyType` at construction (removing per-access `deepcopy`) is a planned convention (see §15.5).
+`BaseTool.parameters_schema` deep-copies the caller-supplied schema at construction and wraps it in `MappingProxyType` for read-only enforcement; the property returns a deep copy on access to prevent mutation of internal state. `ToolInvoker` deep-copies arguments at the tool execution boundary before passing them to `tool.execute()`. `MappingProxyType` wrapping is also used in `ToolRegistry` for its internal collections.
 
 ### 11.2 Tool Access Levels
 
@@ -1302,11 +1302,11 @@ ai-company/
 │       │   ├── role.py             # Role model
 │       │   └── role_catalog.py     # Role catalog
 │       ├── engine/                  # Core engines (M3+)
-│       │   ├── errors.py           # Engine error hierarchy (M3)
-│       │   ├── prompt.py           # System prompt builder (M3)
-│       │   ├── prompt_template.py  # System prompt Jinja2 templates (M3)
-│       │   ├── task_execution.py   # TaskExecution + StatusTransition (M3)
-│       │   ├── context.py          # AgentContext + AgentContextSnapshot (M3)
+│       │   ├── errors.py           # Engine error hierarchy
+│       │   ├── prompt.py           # System prompt builder
+│       │   ├── prompt_template.py  # System prompt Jinja2 templates
+│       │   ├── task_execution.py   # TaskExecution + StatusTransition
+│       │   ├── context.py          # AgentContext + AgentContextSnapshot
 │       │   ├── agent_engine.py     # Agent execution loop (M3)
 │       │   ├── task_engine.py      # Task routing & scheduling (M3-M4)
 │       │   ├── workflow_engine.py  # Workflow orchestration (M4)
@@ -1436,7 +1436,7 @@ These conventions were established during the M0–M2 review cycle. **Adopted**
 
 | Convention | Status | Decision | Rationale |
 |------------|--------|----------|-----------|
-| **Immutability strategy** | Adopted | `MappingProxyType` at construction for dict fields in registries and collections; `frozen=True` on all config/identity models | MappingProxyType is O(1) and prevents accidental mutation. Pydantic `frozen=True` is confirmed shallow (pydantic#7784). |
+| **Immutability strategy** | Adopted | `copy.deepcopy()` at construction + `MappingProxyType` wrapping for non-Pydantic internal collections (registries, `BaseTool`). For Pydantic frozen models: `frozen=True` prevents field reassignment; `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization) prevents nested mutation. No MappingProxyType inside Pydantic models (serialization friction). | Deep-copy at construction fully isolates nested structures; `MappingProxyType` enforces read-only access. Boundary-copy for Pydantic models is simple, centralized, and Pydantic-native. A future CPython built-in immutable mapping type (e.g. `frozendict`) would provide zero-friction field-level immutability when available. |
 | **Config vs runtime split** | Adopted (M3) | Frozen models for config/identity; `model_copy(update=...)` for runtime state transitions | `TaskExecution` and `AgentContext` (in `engine/`) are frozen Pydantic models that use `model_copy(update=...)` for copy-on-write state transitions without re-running validators (per Pydantic `model_copy` semantics). Config layer (`AgentIdentity`, `Task`) remains unchanged. |
 | **Derived fields** | Planned | `@computed_field` instead of stored + validated | Eliminates redundant storage and impossible-to-fail validators (e.g. `total_tokens = input + output`). Currently `total_tokens` uses stored `Field` + `@model_validator`. |
 | **String validation** | Planned | `NotBlankStr` type from `core.types` for all identifiers | Eliminates per-model `@model_validator` boilerplate for whitespace checks. `NotBlankStr` is defined but models still use `Field(min_length=1)` + manual validators. |
diff --git a/src/ai_company/engine/prompt.py b/src/ai_company/engine/prompt.py
index 664c4c682e..94886615ef 100644
--- a/src/ai_company/engine/prompt.py
+++ b/src/ai_company/engine/prompt.py
@@ -77,7 +77,7 @@ class SystemPrompt(BaseModel):
         description="Names of sections included in the prompt",
     )
     metadata: dict[str, str] = Field(
-        description="Agent identity metadata (treat as read-only)",
+        description="Agent identity metadata (string-only values; shallow-frozen)",
     )
 
 
diff --git a/src/ai_company/observability/events.py b/src/ai_company/observability/events.py
index 6489b6d9ba..b71ed87fba 100644
--- a/src/ai_company/observability/events.py
+++ b/src/ai_company/observability/events.py
@@ -120,6 +120,7 @@
 TOOL_INVOKE_PARAMETER_ERROR: Final[str] = "tool.invoke.parameter_error"
 TOOL_INVOKE_SCHEMA_ERROR: Final[str] = "tool.invoke.schema_error"
 TOOL_INVOKE_EXECUTION_ERROR: Final[str] = "tool.invoke.execution_error"
+TOOL_INVOKE_DEEPCOPY_ERROR: Final[str] = "tool.invoke.deepcopy_error"
 TOOL_INVOKE_NON_RECOVERABLE: Final[str] = "tool.invoke.non_recoverable"
 TOOL_INVOKE_VALIDATION_UNEXPECTED: Final[str] = "tool.invoke.validation_unexpected"
 TOOL_BASE_INVALID_NAME: Final[str] = "tool.base.invalid_name"
diff --git a/src/ai_company/providers/models.py b/src/ai_company/providers/models.py
index 1d98c7a616..a267769974 100644
--- a/src/ai_company/providers/models.py
+++ b/src/ai_company/providers/models.py
@@ -81,10 +81,14 @@ class ToolDefinition(BaseModel):
     provider (OpenAI, Anthropic, LiteLLM) consumes it natively.
 
     Note:
-        The ``parameters_schema`` dict is shallowly immutable under the
-        frozen model — reassignment is prevented but contents can still
-        be mutated.  Callers should treat it as read-only or copy before
-        modifying.
+        The ``parameters_schema`` dict is shallowly frozen by Pydantic's
+        ``frozen=True`` — field reassignment is prevented but nested
+        contents can still be mutated in place.  ``BaseTool.to_definition()``
+        provides a deep-copied schema, and ``ToolInvoker`` deep-copies
+        arguments at the execution boundary, so no additional caller-side
+        copying is needed for standard tool/provider workflows.  Direct
+        consumers outside these paths should deep-copy if they intend to
+        modify the schema.  See DESIGN_SPEC.md section 15.5.
 
     Attributes:
         name: Tool name (must be non-blank).
@@ -106,10 +110,11 @@ class ToolCall(BaseModel):
     """A tool invocation requested by the model.
 
     Note:
-        The ``arguments`` dict is shallowly immutable under the frozen
-        model — reassignment is prevented but contents can still be
-        mutated.  Callers should treat it as read-only or copy before
-        modifying.
+        The ``arguments`` dict is shallowly frozen by Pydantic's
+        ``frozen=True`` — field reassignment is prevented but nested
+        contents can still be mutated in place.  The ``ToolInvoker``
+        deep-copies arguments before passing them to tool
+        implementations.  See DESIGN_SPEC.md section 15.5.
 
     Attributes:
         id: Provider-assigned tool call identifier.
diff --git a/src/ai_company/tools/base.py b/src/ai_company/tools/base.py
index 79d93ab75b..db8551736c 100644
--- a/src/ai_company/tools/base.py
+++ b/src/ai_company/tools/base.py
@@ -6,6 +6,7 @@
 
 import copy
 from abc import ABC, abstractmethod
+from types import MappingProxyType
 from typing import Any
 
 from pydantic import BaseModel, ConfigDict, Field
@@ -26,9 +27,12 @@ class ToolExecutionResult(BaseModel):
     to the LLM and is available only for programmatic consumers.
 
     Note:
-        The ``metadata`` dict is shallowly immutable under the frozen
-        model — reassignment is prevented but contents can still be
-        mutated.  Callers should treat it as read-only.
+        The ``metadata`` dict is shallowly frozen by Pydantic's
+        ``frozen=True``.  Tool implementations construct and return
+        this model, but the invoker converts it into a provider-facing
+        ``ToolResult`` — ``metadata`` is not forwarded to LLM providers
+        or other external boundaries, so no additional boundary copy
+        is needed at this layer.
 
     Attributes:
         content: Tool output as a string.
@@ -57,7 +61,8 @@ class BaseTool(ABC):
         name: Non-blank tool name.
         description: Human-readable description of the tool.
         parameters_schema: JSON Schema dict describing expected arguments,
-            or ``None`` if the tool accepts any arguments.
+            or ``None`` if no parameter schema is defined (the invoker
+            skips validation).
     """
 
     def __init__(
@@ -83,8 +88,10 @@ def __init__(
             raise ValueError(msg)
         self._name = name
         self._description = description
-        self._parameters_schema: dict[str, Any] | None = (
-            copy.deepcopy(parameters_schema) if parameters_schema is not None else None
+        self._parameters_schema: MappingProxyType[str, Any] | None = (
+            MappingProxyType(copy.deepcopy(parameters_schema))
+            if parameters_schema is not None
+            else None
         )
 
     @property
@@ -101,9 +108,12 @@ def description(self) -> str:
     def parameters_schema(self) -> dict[str, Any] | None:
         """JSON Schema for tool parameters, or None if unspecified.
 
-        Returns a deep copy to prevent mutation of the internal schema.
+        Returns a deep copy to prevent mutation of internal state.
         """
-        return copy.deepcopy(self._parameters_schema)
+        if self._parameters_schema is None:
+            return None
+        # dict() needed: MappingProxyType cannot be deep-copied directly
+        return copy.deepcopy(dict(self._parameters_schema))
 
     def to_definition(self) -> ToolDefinition:
         """Convert this tool to a ``ToolDefinition`` for LLM providers.
diff --git a/src/ai_company/tools/invoker.py b/src/ai_company/tools/invoker.py
index cfa1324047..3519a2a6e2 100644
--- a/src/ai_company/tools/invoker.py
+++ b/src/ai_company/tools/invoker.py
@@ -1,15 +1,13 @@
 """Tool invoker — validates and executes tool calls.
 
 Bridges LLM ``ToolCall`` objects with concrete ``BaseTool.execute``
-methods.  Never propagates exceptions — always returns a ``ToolResult``.
-
-Note:
-    ``BaseException`` subclasses (``KeyboardInterrupt``, ``SystemExit``,
-    ``asyncio.CancelledError``) are NOT caught and will propagate
-    normally.  Non-recoverable errors (``MemoryError``,
-    ``RecursionError``) are re-raised after logging.
+methods.  Recoverable errors are returned as ``ToolResult(is_error=True)``;
+non-recoverable errors (``MemoryError``, ``RecursionError``) and
+``BaseException`` subclasses (``KeyboardInterrupt``, ``SystemExit``,
+``asyncio.CancelledError``) propagate after logging.
 """
 
+import copy
 from typing import TYPE_CHECKING
 
 import jsonschema
@@ -19,6 +17,7 @@
 
 from ai_company.observability import get_logger
 from ai_company.observability.events import (
+    TOOL_INVOKE_DEEPCOPY_ERROR,
     TOOL_INVOKE_EXECUTION_ERROR,
     TOOL_INVOKE_NON_RECOVERABLE,
     TOOL_INVOKE_NOT_FOUND,
@@ -234,9 +233,40 @@ async def _execute_tool(
         tool: BaseTool,
         tool_call: ToolCall,
     ) -> ToolExecutionResult | ToolResult:
-        """Execute the tool, catching errors as ``ToolResult``."""
+        """Deep-copy arguments for isolation, then execute the tool.
+
+        Copy failures and execution errors are caught and returned as
+        ``ToolResult(is_error=True)``.  Non-recoverable errors
+        (``MemoryError``, ``RecursionError``) propagate after logging.
+        """
+        try:
+            safe_args = copy.deepcopy(tool_call.arguments)
+        except (MemoryError, RecursionError) as exc:
+            logger.exception(
+                TOOL_INVOKE_NON_RECOVERABLE,
+                tool_call_id=tool_call.id,
+                tool_name=tool_call.name,
+                error=f"{type(exc).__name__}: {exc}",
+            )
+            raise
+        except Exception as exc:
+            error_msg = str(exc) or f"{type(exc).__name__} (no message)"
+            logger.exception(
+                TOOL_INVOKE_DEEPCOPY_ERROR,
+                tool_call_id=tool_call.id,
+                tool_name=tool_call.name,
+                error=f"Failed to deep-copy arguments: {error_msg}",
+            )
+            return ToolResult(
+                tool_call_id=tool_call.id,
+                content=(
+                    f"Tool {tool_call.name!r} arguments could not be "
+                    f"safely copied: {error_msg}"
+                ),
+                is_error=True,
+            )
         try:
-            return await tool.execute(arguments=dict(tool_call.arguments))
+            return await tool.execute(arguments=safe_args)
         except (MemoryError, RecursionError) as exc:
             logger.exception(
                 TOOL_INVOKE_NON_RECOVERABLE,
diff --git a/tests/unit/tools/conftest.py b/tests/unit/tools/conftest.py
index 33a5bfd448..a8b32014f1 100644
--- a/tests/unit/tools/conftest.py
+++ b/tests/unit/tools/conftest.py
@@ -185,6 +185,32 @@ async def execute(
         raise ValueError(msg)
 
 
+class _MutatingTool(BaseTool):
+    """Tool that mutates its arguments to test boundary isolation."""
+
+    def __init__(self) -> None:
+        super().__init__(
+            name="mutating",
+            description="Mutates args",
+            parameters_schema={
+                "type": "object",
+                "properties": {
+                    "nested": {"type": "object"},
+                },
+            },
+        )
+
+    async def execute(
+        self,
+        *,
+        arguments: dict[str, Any],
+    ) -> ToolExecutionResult:
+        arguments["injected"] = True
+        if "nested" in arguments:
+            arguments["nested"]["mutated"] = True
+        return ToolExecutionResult(content="mutated")
+
+
 class _RemoteRefTool(BaseTool):
     """Tool with a remote ``$ref`` in its schema (for SSRF testing)."""
 
@@ -278,5 +304,6 @@ def extended_invoker() -> ToolInvoker:
         _InvalidSchemaTool(),
         _EmptyErrorTool(),
         _RemoteRefTool(),
+        _MutatingTool(),
     ]
     return ToolInvoker(ToolRegistry(tools))
diff --git a/tests/unit/tools/test_base.py b/tests/unit/tools/test_base.py
index 0aec649c3c..97a08b77eb 100644
--- a/tests/unit/tools/test_base.py
+++ b/tests/unit/tools/test_base.py
@@ -98,13 +98,29 @@ def test_default_schema_none(self) -> None:
         tool = _ConcreteTool(name="t")
         assert tool.parameters_schema is None
 
-    def test_schema_deep_copied_on_construction(self) -> None:
+    def test_schema_isolated_on_construction(self) -> None:
         props: dict[str, Any] = {"x": {"type": "string"}}
         schema: dict[str, Any] = {"type": "object", "properties": props}
         tool = _ConcreteTool(name="t", parameters_schema=schema)
-        props["y"] = {"type": "integer"}
+        schema["injected"] = True
         assert tool.parameters_schema is not None
-        assert "y" not in tool.parameters_schema["properties"]
+        assert "injected" not in tool.parameters_schema
+
+    def test_schema_nested_isolated_on_construction(self) -> None:
+        schema: dict[str, Any] = {
+            "type": "object",
+            "properties": {"x": {"type": "string"}},
+        }
+        tool = _ConcreteTool(name="t", parameters_schema=schema)
+        schema["properties"]["x"]["type"] = "integer"
+        assert tool.parameters_schema is not None
+        assert tool.parameters_schema["properties"]["x"]["type"] == "string"
+
+    def test_schema_internal_is_read_only(self) -> None:
+        schema = {"type": "object", "properties": {"x": {"type": "string"}}}
+        tool = _ConcreteTool(name="t", parameters_schema=schema)
+        with pytest.raises(TypeError):
+            tool._parameters_schema["injected"] = True  # type: ignore[index]
 
     def test_schema_property_returns_copy(self) -> None:
         schema: dict[str, Any] = {
@@ -118,6 +134,19 @@ def test_schema_property_returns_copy(self) -> None:
         assert tool.parameters_schema is not None
         assert "injected" not in tool.parameters_schema
 
+    def test_schema_property_nested_mutation_isolated(self) -> None:
+        schema: dict[str, Any] = {
+            "type": "object",
+            "properties": {"x": {"type": "string"}},
+        }
+        tool = _ConcreteTool(name="t", parameters_schema=schema)
+        returned = tool.parameters_schema
+        assert returned is not None
+        returned["properties"]["x"]["type"] = "integer"
+        fresh = tool.parameters_schema
+        assert fresh is not None
+        assert fresh["properties"]["x"]["type"] == "string"
+
     def test_to_definition(self) -> None:
         schema = {"type": "object", "properties": {"x": {"type": "string"}}}
         tool = _ConcreteTool(
diff --git a/tests/unit/tools/test_invoker.py b/tests/unit/tools/test_invoker.py
index cc6df2ffd2..b1a0bb1446 100644
--- a/tests/unit/tools/test_invoker.py
+++ b/tests/unit/tools/test_invoker.py
@@ -318,6 +318,103 @@ async def test_remote_ref_does_not_raise(
         assert isinstance(result, ToolResult)
 
 
+@pytest.mark.unit
+class TestInvokeBoundaryIsolation:
+    """Tests that tool execution receives isolated argument copies."""
+
+    async def test_tool_receives_deep_copy_of_arguments(
+        self,
+        extended_invoker: ToolInvoker,
+    ) -> None:
+        """Nested argument structures are isolated from the frozen model."""
+        call = ToolCall(
+            id="c1",
+            name="mutating",
+            arguments={"nested": {"key": "original"}},
+        )
+        await extended_invoker.invoke(call)
+        assert call.arguments["nested"]["key"] == "original"
+        assert "mutated" not in call.arguments.get("nested", {})
+        assert "injected" not in call.arguments
+
+    async def test_nested_mutation_does_not_leak(
+        self,
+        extended_invoker: ToolInvoker,
+    ) -> None:
+        """Tool mutating nested dicts does not affect the original ToolCall."""
+        call = ToolCall(
+            id="c2",
+            name="mutating",
+            arguments={"nested": {"value": 42}},
+        )
+        await extended_invoker.invoke(call)
+        assert "mutated" not in call.arguments.get("nested", {})
+
+
+@pytest.mark.unit
+class TestInvokeDeepcopyFailure:
+    """Tests for argument deep-copy failure handling."""
+
+    async def test_deepcopy_failure_returns_error_result(
+        self,
+        extended_invoker: ToolInvoker,
+        monkeypatch: pytest.MonkeyPatch,
+    ) -> None:
+        """When deepcopy of arguments fails, a ToolResult error is returned."""
+        import copy as _copy_mod
+
+        real_deepcopy = _copy_mod.deepcopy
+        call_count = 0
+
+        def _fail_on_execute(obj: object, memo: object = None) -> object:
+            nonlocal call_count
+            call_count += 1
+            # First deepcopy call is in _validate_params via
+            # parameters_schema; let it pass. Fail on the second
+            # call in _execute_tool.
+            if call_count > 1:
+                msg = "cannot copy"
+                raise TypeError(msg)
+            return real_deepcopy(obj, memo)  # type: ignore[arg-type]
+
+        call = ToolCall(id="c_dc", name="mutating", arguments={"key": "val"})
+        monkeypatch.setattr(
+            "ai_company.tools.invoker.copy.deepcopy",
+            _fail_on_execute,
+        )
+        result = await extended_invoker.invoke(call)
+        assert result.is_error is True
+        assert "safely copied" in result.content
+        assert result.tool_call_id == "c_dc"
+
+    async def test_recursion_error_during_deepcopy_propagates(
+        self,
+        extended_invoker: ToolInvoker,
+        monkeypatch: pytest.MonkeyPatch,
+    ) -> None:
+        """RecursionError during deepcopy is re-raised, not swallowed."""
+        import copy as _copy_mod
+
+        real_deepcopy = _copy_mod.deepcopy
+        call_count = 0
+
+        def _fail_on_execute(obj: object, memo: object = None) -> object:
+            nonlocal call_count
+            call_count += 1
+            if call_count > 1:
+                msg = "maximum recursion depth exceeded"
+                raise RecursionError(msg)
+            return real_deepcopy(obj, memo)  # type: ignore[arg-type]
+
+        call = ToolCall(id="c_rec", name="mutating", arguments={"key": "val"})
+        monkeypatch.setattr(
+            "ai_company.tools.invoker.copy.deepcopy",
+            _fail_on_execute,
+        )
+        with pytest.raises(RecursionError, match="maximum recursion depth"):
+            await extended_invoker.invoke(call)
+
+
 @pytest.mark.unit
 class TestInvokeEmptyErrorMessage:
     """Tests for empty exception message fallback."""