Merged
5 changes: 4 additions & 1 deletion docs/design/integrations.md
@@ -24,7 +24,10 @@ providers, notification sinks, tools) and provides:

Central registry for external service connections. Each connection has a
unique name, a typed connection type, encrypted credentials (via `SecretRef`),
-and optional rate limiting and health check configuration.
+optional rate limiting and health check configuration, and a `sensitive`
+flag. When `sensitive` is set, the governed external-access tool routes every
+call against the connection (read or write) to human approval, not only write
+methods.
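
The routing rule described in this hunk can be sketched as a small predicate. This is a minimal illustration, not the project's implementation: `Connection`, `requires_approval`, and the set of HTTP verbs treated as reads are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: which HTTP methods count as "read" is not stated in the source.
READ_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


@dataclass(frozen=True)
class Connection:
    """Hypothetical stand-in for a catalog connection entry."""

    name: str
    sensitive: bool = False


def requires_approval(connection: Connection, method: str) -> bool:
    """Return True when a call must be routed to human approval."""
    if connection.sensitive:
        # Sensitive connections gate every call, read or write.
        return True
    # Non-sensitive connections gate only write methods.
    return method.upper() not in READ_METHODS
```

With this shape, flipping `sensitive` on a connection changes the outcome for reads only; writes are gated either way.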

### Connection Types

8 changes: 5 additions & 3 deletions docs/design/tools.md
@@ -9,7 +9,7 @@ description: Tool categories, concurrent execution model, layered sandboxing, MC

This page is the source of truth for the **designed** behaviour of this subsystem. Tool execution runs via the agent runtime, which is in active development (see the [Roadmap](../roadmap/index.md)); the code described here is built and unit-tested as components but not yet run by a live agent.

-Agents act on the world through tools. SynthOrg defines a pluggable tool system with 13+ categories (file system, git, web, database, terminal, sandbox, MCP bridge, analytics, communication, design, headless browser), layered sandboxing (subprocess for low-risk, Docker for high-risk, Kubernetes for future multi-tenant), MCP server integration, and a progressive-disclosure model that limits the surface an agent sees to what its role, seniority, and autonomy tier permit.
+Agents act on the world through tools. SynthOrg defines a pluggable tool system with 14+ categories (file system, git, web, database, terminal, sandbox, MCP bridge, analytics, communication, design, headless browser, governed external data access), layered sandboxing (subprocess for low-risk, Docker for high-risk, Kubernetes for future multi-tenant), MCP server integration, and a progressive-disclosure model that limits the surface an agent sees to what its role, seniority, and autonomy tier permit.

## Tool Categories

@@ -27,6 +27,7 @@ Agents act on the world through tools. SynthOrg defines a pluggable tool system
| **Deployment** | CI/CD, container management | DevOps, SRE |
| **Memory** | Search memory, recall by ID | All agents (tool-based strategy) |
| **Browser** | Headless Playwright + Chromium: navigate, screenshot, SSIM diff, axe accessibility scan, full spec | QA, frontend devs, agents validating web deliverables |
+| **External Data** | Governed external API/data access through a configured connection: credentials brokered from the connection catalog, egress constrained to the connection host (SSRF policy + DNS pinning), per-connection rate limiting, sensitive/write calls gated to approval | Agents consuming third-party APIs while building deliverables |
| **MCP Servers** | Any MCP-compatible tool | Configurable per agent |
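
The External Data row says egress is constrained to the connection host. A minimal sketch of just the host-match part follows; it does not attempt the SSRF IP policy or DNS pinning the row also mentions, and `is_allowed_egress` is a hypothetical name, not a SynthOrg function.

```python
from urllib.parse import urlsplit


def is_allowed_egress(url: str, connection_host: str) -> bool:
    """Allow a request only when its host equals the connection host exactly."""
    # urlsplit().hostname strips userinfo and port and lowercases the host.
    host = urlsplit(url).hostname or ""
    # Exact match only: suffix or prefix tricks must not pass.
    return host == connection_host.lower()
```

An exact comparison on the parsed hostname, rather than a substring check on the raw URL, is what keeps look-alike hosts and userinfo tricks out.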

## Tool Execution Model
@@ -450,7 +451,7 @@ Action types classify agent actions for use by autonomy presets (see [Security &
SecOps validation, tiered timeout policies, and progressive trust
([Decision Log](../architecture/decisions.md) D1).

-**Registry:** `StrEnum` for ~31 built-in action types (type safety, autocomplete, typos caught
+**Registry:** `StrEnum` for ~32 built-in action types (type safety, autocomplete, typos caught
by static type checking and config-load-time validation) + `ActionTypeRegistry` for custom
types via explicit registration. Unknown strings are rejected at config load time, because a typo
in the `human_approval` list would otherwise silently mean "skip approval", a critical safety concern.
@@ -459,7 +460,7 @@ in `human_approval` list silently meaning "skip approval" is a critical safety c
actions in that category (e.g., `auto_approve: ["code"]` expands to all `code:*` actions).
Fine-grained overrides are supported (e.g., `human_approval: ["code:create"]`).
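
Category expansion as described here can be sketched in a few lines. The leaf types below are a subset taken from the taxonomy on this page; the `expand` function and its error handling are assumptions for illustration only.

```python
# Illustrative subset of leaf types; the full taxonomy is larger.
LEAF_TYPES = (
    "code:read", "code:write", "code:create", "code:delete", "code:refactor",
    "db:query", "db:mutate", "db:admin",
)


def expand(entries: list[str]) -> set[str]:
    """Expand category shortcuts to leaf types; keep leaf entries as-is."""
    expanded: set[str] = set()
    for entry in entries:
        if ":" in entry:
            # Fine-grained entry: must be a known leaf type.
            if entry not in LEAF_TYPES:
                raise ValueError(f"unknown action type: {entry!r}")
            expanded.add(entry)
            continue
        # Category shortcut: collect every leaf under "<category>:".
        matches = {leaf for leaf in LEAF_TYPES if leaf.startswith(f"{entry}:")}
        if not matches:
            raise ValueError(f"unknown category: {entry!r}")
        expanded |= matches
    return expanded
```

How a fine-grained `human_approval` entry interacts with a broader `auto_approve` category is not specified in this hunk, so the sketch stops at expansion and does not model precedence.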

-**Taxonomy (~31 leaf types):**
+**Taxonomy (~32 leaf types):**

```text
code:read, code:write, code:create, code:delete, code:refactor
@@ -474,6 +475,7 @@ db:query, db:mutate, db:admin
arch:decide
memory:read
browser:navigate, browser:screenshot, browser:diff, browser:accessibility_scan, browser:spec
+external_data:request
```

**Classification:** Static tool metadata. Each `BaseTool` declares its `action_type`. Default
1 change: 1 addition & 0 deletions scripts/schema_drift_baseline.txt
@@ -195,3 +195,4 @@ index_attr:idx_ftc_single_active:where:is_active = 1:is_active = TRUE:Partial-un
nullable:cost_records:rowid:N:Y:SQLite uses INTEGER PRIMARY KEY AUTOINCREMENT (NOT NULL); Postgres uses BIGINT GENERATED ALWAYS AS IDENTITY which can be NULL until insert assigns the sequence value (composite PK with timestamp enforces non-null at write time)
pk:audit_entries:id:id,timestamp:Postgres composite PK includes the timestamp partitioning column for TimescaleDB hypertable conversion; SQLite has no hypertable concept and uses the simpler PK on id alone (see schema header)
pk:cost_records:rowid:rowid,timestamp:Postgres composite PK includes the timestamp partitioning column for TimescaleDB hypertable conversion; SQLite has no hypertable concept (see schema header)
+column:approvals:consumed_at:TEXT:TIMESTAMPTZ:SQLite has no TIMESTAMPTZ; column stores TEXT carrying ISO-8601 with explicit +00:00/Z suffix, normalised to UTC at write time
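
The `consumed_at` baseline entry describes ISO-8601 text normalised to UTC at write time. A minimal sketch of that normalisation is below; `to_utc_iso` is a hypothetical helper, and rejecting naive datetimes is an assumption rather than documented behaviour.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(value: datetime) -> str:
    """Serialise an aware datetime as ISO-8601 text with a +00:00 suffix."""
    if value.tzinfo is None:
        # Assumption: naive values are rejected rather than guessed at.
        raise ValueError("naive datetime: an explicit timezone is required")
    return value.astimezone(timezone.utc).isoformat()


# A +02:00 timestamp is shifted to UTC before being written as TEXT.
local = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
stored = to_utc_iso(local)
```

Because every stored value carries the same offset, the TEXT column sorts chronologically under plain string comparison.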
226 changes: 226 additions & 0 deletions src/synthorg/api/_approval_expiration.py
@@ -0,0 +1,226 @@
"""Lazy-expiration behaviour for :class:`ApprovalStore`.

The PENDING -> EXPIRED lazy transition (scalar ``_check_expiration_locked``,
the pure batch ``_compute_expiration`` / ``_compute_page`` companions, the
cache-only list path, and the best-effort expire callback) is a cohesive
slice of the store. It lives in its own mixin so the main store module
stays focused on the CRUD + CAS + cache-coherency concurrency model.

The mixin reaches back into the host store for shared state (``_clock``,
``_repo``, ``_items``, ``_on_expire``); the ``TYPE_CHECKING`` block below
declares that surface so ``mypy`` type-checks the mixin in isolation.
"""

from typing import TYPE_CHECKING

from synthorg.core.approval import ApprovalItem # noqa: TC001
from synthorg.core.enums import ApprovalRiskLevel, ApprovalStatus
from synthorg.observability import get_logger, safe_error_description
from synthorg.observability.events.api import (
    API_APPROVAL_EXPIRE_CALLBACK_FAILED,
    API_APPROVAL_EXPIRED,
)
from synthorg.observability.events.approval_gate import (
    APPROVAL_STATUS_TRANSITIONED,
)
from synthorg.observability.metrics_hub import record_approval_decision

if TYPE_CHECKING:
    from collections.abc import Callable

    from synthorg.core.clock import Clock
    from synthorg.core.types import NotBlankStr
    from synthorg.persistence.approval_protocol import ApprovalRepository

logger = get_logger(__name__)


class ApprovalExpirationMixin:
    """Lazy-expiration methods mixed into :class:`ApprovalStore`."""

    if TYPE_CHECKING:
        _clock: Clock
        _repo: ApprovalRepository | None
        _items: dict[str, ApprovalItem]
        _on_expire: Callable[[ApprovalItem], None] | None

    def _compute_page(
        self,
        page: tuple[ApprovalItem, ...],
        *,
        status: ApprovalStatus | None,
        risk_level: ApprovalRiskLevel | None,
    ) -> tuple[
        list[ApprovalItem],
        list[ApprovalItem],
        dict[str, ApprovalItem],
    ]:
        """Pure: classify a repo page into (filtered, to_persist, page_cache).

        Companion to :meth:`ApprovalStore._list_from_repo`. Walks ``page``
        once, computing lazy expiration via :meth:`_compute_expiration` and
        applying caller-supplied filters. No I/O, no lock acquisition.

        ``page_cache`` carries every row from the page (with the
        possibly-EXPIRED replacement substituted in) so the caller
        can refresh the entire page slice in ``_items``, not just the
        EXPIRED transitions. ``to_persist`` carries only the rows
        that flipped locally, which is the candidate set the caller
        feeds to ``expire_if_pending`` for the compare-and-set.
        """
        page_result: list[ApprovalItem] = []
        to_persist: list[ApprovalItem] = []
        page_cache: dict[str, ApprovalItem] = {}
        for item in page:
            checked = self._compute_expiration(item)
            page_cache[item.id] = checked
            if checked is not item:
                to_persist.append(checked)
            if status is not None and checked.status != status:
                continue
            if risk_level is not None and checked.risk_level != risk_level:
                continue
            page_result.append(checked)
        return page_result, to_persist, page_cache

    async def _list_from_cache_locked(
        self,
        *,
        status: ApprovalStatus | None,
        risk_level: ApprovalRiskLevel | None,
        action_type: NotBlankStr | None,
    ) -> tuple[ApprovalItem, ...]:
        """Cache-only list path (no repository wired).

        Falls through ``_check_expiration_locked`` per item because
        without a repository there is no batch endpoint to amortise;
        a per-item save is also a no-op (the in-memory cache is
        already updated by ``_check_expiration_locked``).
        """
        checked_items: list[ApprovalItem] = []
        for stored in list(self._items.values()):
            checked = await self._check_expiration_locked(stored)
            if status is not None and checked.status != status:
                continue
            if risk_level is not None and checked.risk_level != risk_level:
                continue
            if action_type is not None and checked.action_type != action_type:
                continue
            checked_items.append(checked)
        return tuple(checked_items)

    async def _check_expiration_locked(
        self,
        item: ApprovalItem,
    ) -> ApprovalItem:
        """Lazy expiration, assuming ``self._lock`` is held.

        If the item is PENDING and has expired, transition it to
        EXPIRED in both the cache and the repository. Callers MUST
        hold ``self._lock``; the method performs cache + repo mutations
        without re-acquiring it.

        Args:
            item: The item to check.

        Returns:
            The original or expired item.
        """
        if (
            item.status == ApprovalStatus.PENDING
            and item.expires_at is not None
            and self._clock.now() >= item.expires_at
        ):
            expired = item.model_copy(
                update={"status": ApprovalStatus.EXPIRED},
            )
            if self._repo is not None:
                await self._repo.save(expired)
            self._items[item.id] = expired
            # State-transition log fires AFTER persistence + cache
            # update succeed so the audit stream only records hops
            # that actually landed. Pairs with the
            # APPROVAL_STATUS_TRANSITIONED emissions on PENDING ->
            # APPROVED / REJECTED in ``api/controllers/approvals.py``;
            # ``API_APPROVAL_EXPIRED`` below is the terminal-state
            # summary event that subscribers can use as a single
            # signal that an approval has expired.
            logger.info(
                APPROVAL_STATUS_TRANSITIONED,
                approval_id=item.id,
                from_status=ApprovalStatus.PENDING.value,
                to_status=ApprovalStatus.EXPIRED.value,
            )
            logger.info(
                API_APPROVAL_EXPIRED,
                approval_id=item.id,
            )
            record_approval_decision(outcome="expired")
            if self._on_expire is not None:
                try:
                    self._on_expire(expired)
                except MemoryError, RecursionError:
                    raise
                except Exception as exc:
                    # ERROR (matching ``_fire_expire_callback``): the
                    # approval is already EXPIRED in cache + repo, so
                    # the callback failure can't unwind the expiration,
                    # but a dropped downstream side effect (webhook,
                    # audit dispatch, workflow resume) is operationally
                    # meaningful and operators must be able to alert
                    # on it. Both paths emit at ERROR so alerting is
                    # not sensitive to which expiration path fired.
                    logger.error(
                        API_APPROVAL_EXPIRE_CALLBACK_FAILED,
                        approval_id=item.id,
                        error_type=type(exc).__name__,
                        error=safe_error_description(exc),
                    )
            return expired
        return item

    def _compute_expiration(self, item: ApprovalItem) -> ApprovalItem:
        """Pure: return the (possibly-EXPIRED) item without I/O.

        Companion to ``_check_expiration_locked`` for the batch path
        in :meth:`ApprovalStore.list_items`. Returns the input unchanged
        when no transition applies, or a fresh EXPIRED copy otherwise.
        Persistence + audit logging + callback fire AFTER the batch
        save in the caller, not here -- this method must be safe to
        call inside a tight loop with no side effects.
        """
        if (
            item.status == ApprovalStatus.PENDING
            and item.expires_at is not None
            and self._clock.now() >= item.expires_at
        ):
            return item.model_copy(update={"status": ApprovalStatus.EXPIRED})
        return item

    def _fire_expire_callback(self, expired: ApprovalItem) -> None:
        """Best-effort fire of ``_on_expire`` for a batched expiration.

        Mirrors the callback handling in
        :meth:`_check_expiration_locked`: a callback failure must not
        unwind the expiration (the row is already EXPIRED in cache +
        repo); emit ``API_APPROVAL_EXPIRE_CALLBACK_FAILED`` so
        operators can filter callback failures from real expirations.
        """
        if self._on_expire is None:
            return
        try:
            self._on_expire(expired)
        except MemoryError, RecursionError:
            raise
        except Exception as exc:
            # ERROR rather than WARNING: the approval is already
            # EXPIRED in cache + repo, so the callback failure can't
            # unwind the expiration, but a failed downstream side
            # effect (webhook, audit dispatch, workflow resume) is
            # operationally meaningful and operators must be able to
            # alert on it.
            logger.error(
                API_APPROVAL_EXPIRE_CALLBACK_FAILED,
                approval_id=expired.id,
                error_type=type(exc).__name__,
                error=safe_error_description(exc),
            )