Merged
18 commits
9167106
fix: full governance enforcement online (#1957)
Aureliolo May 18, 2026
1005518
fix: address pre-PR review findings for governance wiring (#1957)
Aureliolo May 18, 2026
89f8131
fix: drop taxonomy shorthand from governance review-fix comments (#1957)
Aureliolo May 18, 2026
fe5951d
fix: babysit round 1, 9 findings (4 coderabbit, 3 gemini, 2 ci)
Aureliolo May 18, 2026
9c7cfdc
fix: babysit round 2, 1 finding (1 coderabbit, 0 ci)
Aureliolo May 18, 2026
6749802
chore: regenerate runtime_stats.yaml after rebase onto main
Aureliolo May 18, 2026
6cab3c1
fix: babysit round 3 part 1, bounded reviewer fixes (4 coderabbit)
Aureliolo May 18, 2026
7b62a10
feat: babysit round 3 part 2, enforce autonomy strategy verdict (1 co…
Aureliolo May 18, 2026
55b951d
feat: babysit round 3 part 3, deterministic approval routing (1 coder…
Aureliolo May 18, 2026
de94a26
fix: babysit round 3 part 4, mypy no-any-return on _store helper
Aureliolo May 18, 2026
5ba202c
fix: babysit round 3 part 5, approval source persistence repairs
Aureliolo May 18, 2026
1de2352
fix: babysit round 3 part 6, resilient config_resolver fallback in _w…
Aureliolo May 18, 2026
ea6d6f9
fix: babysit round 4, 8 findings (2 coderabbit, 6 ci)
Aureliolo May 18, 2026
310cbb5
fix: babysit round 5, 1 finding (1 coderabbit)
Aureliolo May 18, 2026
aa6ae47
fix: babysit round 6, 6 findings (6 coderabbit)
Aureliolo May 18, 2026
b6e1da2
fix: babysit round 6 part 2, pre-push fixes (mypy + eslint)
Aureliolo May 18, 2026
e2e0356
fix: babysit round 7, 3 findings (3 coderabbit)
Aureliolo May 18, 2026
4bcc673
fix: babysit round 7 part 2, drop round-N back-ref from registry comment
Aureliolo May 18, 2026
8 changes: 4 additions & 4 deletions data/runtime_stats.yaml
@@ -1,13 +1,13 @@
 schema_version: 1
-last_generated_utc: '2026-05-18T11:58:09Z'
-generator_revision: 958c3bae6
+last_generated_utc: '2026-05-18T12:47:31Z'
+generator_revision: 4d98ed24a
 stats:
   tests:
-    raw: 31200
+    raw: 31240
     rounded: 31000
     display: 31,000+
   mem0_stars:
-    raw: 56017
+    raw: 56019
     rounded: 56000
     display: 56k+
   providers_curated:
24 changes: 18 additions & 6 deletions docs/design/security.md
@@ -5,9 +5,9 @@ description: Approval workflow, autonomy levels, security operations agent, outp

 # Security & Approval System

-!!! warning "Designed behaviour; runtime in active development"
+!!! info "Runtime enforcement"

-    This page is the source of truth for the **designed** behaviour of this subsystem. The approval producer and runtime enforcement run with the agent runtime, which is in active development (see the [Roadmap](../roadmap/index.md)); the code described here is built and unit-tested as components but not yet enforced on a live run.
+    This page is the source of truth for the behaviour of this subsystem. Governance runs on the live agent runtime behind the provider-present switch: the approval producer parks blocked actions, the boot `ApprovalGate` resumes them on a decision, the progressive-trust strategy narrows tool access at the invoker, an agent can call SynthOrg's own MCP tools under its trust level with the admin guardrails fail-closed, and the autonomy controller routes changes through the configured `AutonomyChangeStrategy`.

 SynthOrg enforces a fail-closed security model: every agent action is evaluated by a rule engine (with an optional LLM fallback) before execution, every output is scanned for leaked secrets, and every credential flows through an isolated **hands** plane that never enters the model context. Four configurable autonomy levels (`full`, `semi`, `supervised`, `locked`) control which actions require human approval, and a pluggable trust system lets agents earn higher tool access over time.

@@ -95,10 +95,22 @@ signal providers that cannot live in frozen config).
 `change_strategy_factory.build_autonomy_change_strategy(config, deps)`
 dispatches via the `StrEnum`-keyed `StrategyRegistry`; a wrapping
 strategy missing its required signal provider raises
-`AutonomyStrategyConfigError` at construction. No production seam wires
-a non-default strategy yet (the autonomy controller path constructs no
-strategy); operators opt in by configuring it -- the surface is the
-deliverable, end-to-end production wiring is the natural follow-up.
+`AutonomyStrategyConfigError` at construction. The strategy is built
+at boot from `config.autonomy.change_strategy` and attached to
+application state; the autonomy controller consults it on every
+change request (the D6 seniority rule is enforced first, then the
+request is enqueued as an approval, the queue being the apply
+driver). With the `HUMAN_ONLY` default every promotion pends for
+human review. The strategy verdict is enforced, not audit-only: a
+strategy that returns `True` from `request_promotion` produces an
+auto-decided approval item (`status=APPROVED`,
+`decided_by="strategy:<name>"`, `decided_at` set) and the registry
+applies the level change immediately, so the queue remains the apply
+driver and the audit trail stays intact while a non-`HUMAN_ONLY`
+strategy actually takes effect. The performance / risk-budget signal
+providers the `PERFORMANCE_GATED` and `BUDGET_AWARE` strategies
+require are not wired by the boot seam: selecting one of those kinds
+without supplying its provider fails fast at construction.

 ## Security Operations Agent

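Reviewer sketch for the `docs/design/security.md` hunk above: the "verdict is enforced, not audit-only" rule is easier to follow as code. This is a minimal illustration under stated assumptions, not SynthOrg's implementation. `ApprovalItem`, `ApprovalStatus`, `HumanOnly`, `AlwaysApprove`, `enqueue_promotion`, and the `request_promotion` argument list are hypothetical stand-ins; only the method name `request_promotion` and the `status=APPROVED`, `decided_by="strategy:<name>"`, and `decided_at` fields come from the diff.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class ApprovalStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"


@dataclass(frozen=True)
class ApprovalItem:
    """Hypothetical stand-in for the real approval record."""

    agent_id: str
    target_level: str
    status: ApprovalStatus
    decided_by: str | None = None
    decided_at: datetime | None = None


class HumanOnly:
    """Never auto-approves, so every promotion pends for a human."""

    name = "human_only"

    def request_promotion(self, agent_id: str, target_level: str) -> bool:
        return False


class AlwaysApprove:
    """Toy non-default strategy, used only to exercise the auto-decided path."""

    name = "always_approve"

    def request_promotion(self, agent_id: str, target_level: str) -> bool:
        return True


def enqueue_promotion(strategy, agent_id, target_level, levels):
    """Create the approval item; apply the change at once on a True verdict."""
    if strategy.request_promotion(agent_id, target_level):
        item = ApprovalItem(
            agent_id=agent_id,
            target_level=target_level,
            status=ApprovalStatus.APPROVED,
            decided_by=f"strategy:{strategy.name}",
            decided_at=datetime.now(timezone.utc),
        )
        # Assumed shape of "the registry applies the level change".
        levels[agent_id] = target_level
        return item
    # No auto-decision: the item stays pending until a human decides.
    return ApprovalItem(agent_id, target_level, ApprovalStatus.PENDING)
```

In both branches an approval item is created, which is how the queue can stay the apply driver and the audit trail can stay complete even when no human is involved.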
4 changes: 4 additions & 0 deletions scripts/_ghost_wiring_manifest.txt
@@ -25,6 +25,10 @@

 ENFORCED AgentEngine #1956 -- runtime root; construct at boot behind the provider switch
 ENFORCED build_coordinator #1958 -- called by workers.runtime_builder.build_runtime_services behind the provider switch
+ENFORCED ApprovalGate #1957 -- one gate wired at boot in lifecycle_builder, injected into AgentEngine; engine parks, /approvals resumes
+ENFORCED TrustService #1957 -- built at boot (non-DISABLED strategy), injected into AgentEngine; narrows tool permissions at the invoker seam
+ENFORCED build_mcp_self_consumer #1957 -- called in runtime_builder; agent invokes SynthOrg's own MCP tools trust-scoped with actor fail-closed
+ENFORCED build_autonomy_change_strategy #1957 -- built at boot in app_builders, attached to AppState; autonomy controller routes through it
 PENDING BaselineStore #1959 -- construct at boot (window from budget.baseline_window_size)
 PENDING CoordinationMetricsCollector #1959 -- construct at boot, thread into execution
 ENFORCED IntakeEngine #1961 -- wired at boot via client/runtime_builder.build_client_simulation_runtime
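Reviewer sketch: the manifest entries above follow a visible `STATUS symbol #issue -- note` shape, so a small parser can list what is still `PENDING`. The line grammar is inferred from the entries shown in this hunk only, and `ManifestEntry`, `parse_manifest`, and `pending_symbols` are hypothetical names; the repository's real checker may work differently.

```python
import re
from typing import NamedTuple


class ManifestEntry(NamedTuple):
    status: str
    symbol: str
    issue: int
    note: str


# Assumed grammar: STATUS, symbol, "#<issue>", "--", then a free-text note.
_LINE = re.compile(r"^(ENFORCED|PENDING)\s+(\S+)\s+#(\d+)\s+--\s+(.*)$")


def parse_manifest(text: str) -> list[ManifestEntry]:
    """Parse manifest text, skipping lines that do not match the grammar."""
    entries = []
    for raw in text.splitlines():
        match = _LINE.match(raw.strip())
        if match is None:
            continue  # blank lines, comments, anything unrecognised
        status, symbol, issue, note = match.groups()
        entries.append(ManifestEntry(status, symbol, int(issue), note))
    return entries


def pending_symbols(entries: list[ManifestEntry]) -> list[str]:
    """Return the symbols that are not yet wired at boot."""
    return [entry.symbol for entry in entries if entry.status == "PENDING"]
```

Run against the lines in this hunk, such a helper would report `BaselineStore` and `CoordinationMetricsCollector` as the remaining pending wiring.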
5 changes: 5 additions & 0 deletions src/synthorg/api/app.py
@@ -21,6 +21,7 @@
 from synthorg import __version__
 from synthorg.api.app_builders import (
     _bootstrap_app_logging,
+    _build_configured_autonomy_change_strategy,
     _build_configured_trust_service,
     _build_performance_tracker,
     _build_telemetry_collector,
@@ -528,6 +529,9 @@ def create_app( # noqa: C901, PLR0912, PLR0913, PLR0915
     )
     if trust_service is None:
         trust_service = _build_configured_trust_service(effective_config.trust)
+    autonomy_change_strategy = _build_configured_autonomy_change_strategy(
+        effective_config.config.autonomy,
+    )

     # One boot clock shared between the uptime baseline and AppState so
     # ``app_state.clock`` and ``startup_time`` cannot diverge, and a
@@ -557,6 +561,7 @@ def create_app( # noqa: C901, PLR0912, PLR0913, PLR0915
         notification_dispatcher=notification_dispatcher,
         audit_log=audit_log,
         trust_service=trust_service,
+        autonomy_change_strategy=autonomy_change_strategy,
         coordination_metrics_store=coordination_metrics_store,
         event_stream_hub=event_stream_hub or EventStreamHub(),
         interrupt_store=interrupt_store or InterruptStore(),
30 changes: 30 additions & 0 deletions src/synthorg/api/app_builders.py
@@ -42,6 +42,8 @@
 from synthorg.meta.chief_of_staff.chat import ChiefOfStaffChat
 from synthorg.meta.chief_of_staff.config import ChiefOfStaffConfig
 from synthorg.providers.registry import ProviderRegistry
+from synthorg.security.autonomy.models import AutonomyConfig
+from synthorg.security.autonomy.protocol import AutonomyChangeStrategy
 from synthorg.security.trust.config import TrustConfig
 from synthorg.security.trust.service import TrustService

@@ -204,6 +206,34 @@ def _build_configured_trust_service(
     return TrustService(strategy=strategy, config=trust_config)


+def _build_configured_autonomy_change_strategy(
+    autonomy_config: AutonomyConfig,
+) -> AutonomyChangeStrategy:
+    """Construct the configured autonomy-change strategy.
+
+    Always returns a strategy (default ``kind=HUMAN_ONLY``): every
+    promotion request then routes through human approval. The
+    ``HUMAN_ONLY`` default needs no signal providers; the
+    performance / risk-budget signals required by the
+    ``PERFORMANCE_GATED`` / ``BUDGET_AWARE`` opt-in strategies are
+    deliberately not wired here (per the Security design spec the
+    selectable surface is the deliverable and the factory fails fast
+    at construction if a non-default kind is configured without its
+    required signal provider).
+    """
+    from synthorg.security.autonomy.change_strategy_config import (  # noqa: PLC0415
+        AutonomyStrategyDeps,
+    )
+    from synthorg.security.autonomy.change_strategy_factory import (  # noqa: PLC0415
+        build_autonomy_change_strategy,
+    )
+
+    return build_autonomy_change_strategy(
+        autonomy_config.change_strategy,
+        AutonomyStrategyDeps(),
+    )
+
+
 def _allowed_memory_dir_roots() -> tuple[str, ...]:
     r"""Return the string roots a memory dir must begin with.

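Reviewer sketch: the docstring of `_build_configured_autonomy_change_strategy` relies on the factory failing fast when a non-default kind lacks its signal provider. One way such an enum-keyed registry could behave is shown below. Everything here is an assumption made for illustration (`StrategyKind`, `Deps`, `StrategyConfigError`, the builder functions, the `0.9` threshold); the real `StrategyRegistry`, `AutonomyStrategyDeps`, and `AutonomyStrategyConfigError` are not visible in this diff.

```python
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass
from enum import Enum


class StrategyKind(str, Enum):
    """Stands in for the StrEnum key; (str, Enum) also runs on older Pythons."""

    HUMAN_ONLY = "human_only"
    PERFORMANCE_GATED = "performance_gated"


class StrategyConfigError(ValueError):
    """Raised at construction when a strategy lacks a required provider."""


@dataclass(frozen=True)
class Deps:
    """Runtime-only dependencies that cannot live in frozen config."""

    performance_signal: Callable[[str], float] | None = None


class HumanOnlyStrategy:
    def request_promotion(self, agent_id: str) -> bool:
        return False  # always defer to a human


class PerformanceGatedStrategy:
    def __init__(self, signal: Callable[[str], float], threshold: float = 0.9):
        self._signal = signal
        self._threshold = threshold

    def request_promotion(self, agent_id: str) -> bool:
        return self._signal(agent_id) >= self._threshold


def _build_human_only(deps: Deps) -> HumanOnlyStrategy:
    return HumanOnlyStrategy()


def _build_performance_gated(deps: Deps) -> PerformanceGatedStrategy:
    if deps.performance_signal is None:
        # Fail at boot, not on the first promotion request.
        raise StrategyConfigError(
            "PERFORMANCE_GATED requires a performance signal provider"
        )
    return PerformanceGatedStrategy(deps.performance_signal)


_REGISTRY = {
    StrategyKind.HUMAN_ONLY: _build_human_only,
    StrategyKind.PERFORMANCE_GATED: _build_performance_gated,
}


def build_strategy(kind: StrategyKind, deps: Deps):
    """Dispatch on the configured kind; builders validate their own deps."""
    return _REGISTRY[kind](deps)
```

Passing an empty deps object, as the boot seam in this hunk does with `AutonomyStrategyDeps()`, is safe for the default kind and turns a misconfigured opt-in kind into an immediate construction error.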
158 changes: 124 additions & 34 deletions src/synthorg/api/controllers/_approval_review_gate.py
@@ -15,6 +15,7 @@

 from synthorg.core.actor_context import resolve_decided_by
 from synthorg.core.domain_errors import (
+    AgentRuntimeNotConfiguredError,
     ConflictError,
     ForbiddenError,
     NotFoundError,
@@ -28,7 +29,6 @@
 )
 from synthorg.observability import get_logger, safe_error_description
 from synthorg.observability.events.approval_gate import (
-    APPROVAL_GATE_RESUME_CONTEXT_LOADED,
     APPROVAL_GATE_RESUME_FAILED,
     APPROVAL_GATE_RESUME_TRIGGERED,
     APPROVAL_GATE_REVIEW_TRANSITION_FAILED,
@@ -40,52 +40,140 @@

 if TYPE_CHECKING:
     from synthorg.api.state import AppState
-    from synthorg.engine.approval_gate import ApprovalGate
+    from synthorg.core.approval import ApprovalItem
     from synthorg.engine.review_gate import ReviewGateService

 logger = get_logger(__name__)


+async def _reread_approval_item(
+    app_state: AppState,
+    approval_id: str,
+) -> ApprovalItem | None:
+    """Re-read the just-decided approval, degrading to ``None`` on error.
+
+    The decision is already persisted by the caller; a failed reread
+    must not 500 the request. Returning ``None`` routes the caller to
+    the parked-context probe fallback instead of a hard dependency.
+    """
+    try:
+        return await app_state.approval_store.get(approval_id)
+    except MemoryError, RecursionError:
+        raise
+    except Exception as exc:
+        logger.warning(
+            APPROVAL_GATE_RESUME_FAILED,
+            approval_id=approval_id,
+            error_type=type(exc).__name__,
+            error=safe_error_description(exc),
+            note="approval reread failed; falling back to parked-context probe",
+        )
+        return None
+
+
 async def try_mid_execution_resume(
-    approval_gate: ApprovalGate,
+    app_state: AppState,
     approval_id: str,
     *,
     approved: bool,
+    decided_by: str,
+    decision_reason: str | None,
 ) -> bool:
-    """Attempt to resume a mid-execution parked context.
+    """Dispatch a parked-context resume if one exists for this approval.
+
+    Cheap non-destructive existence peek
+    (:meth:`ApprovalGate.has_parked_context`) decides the flow without
+    consuming the parked record or emitting the resume-started audit
+    event. When a parked context exists the actual restore + agent
+    re-run is delegated to the worker execution service, which spawns
+    it as a tracked background task so the approve/reject HTTP response
+    is not blocked by a full agent re-run (the decision is already
+    persisted by the caller before this runs).

-    Returns ``True`` if the flow was handled (context found or
-    error -- caller should not fall through to the review gate).
-    Returns ``False`` if no parked context exists.
+    Routing is deterministic off the approval's persisted
+    :attr:`ApprovalItem.source` discriminator (fixed at creation), not
+    a live parked-context probe: ``PARKED_CONTEXT`` means this flow
+    owns the decision, anything else falls through to the review gate.
+    The legacy ``has_parked_context`` probe is kept only as a logged
+    fallback for the degenerate case where the just-decided approval
+    cannot be re-read (it should always be present here, since the
+    caller persisted the decision immediately before).
+
+    Returns ``True`` when the mid-execution flow is responsible for
+    this approval so the caller does not also run the review-gate
+    transition. Returns ``False`` when the approval is review-gate
+    bound (e.g. a hiring/promotion approval) so the caller falls
+    through to the review gate.
     """
+    from synthorg.core.enums import ApprovalSource  # noqa: PLC0415
+
+    item = await _reread_approval_item(app_state, approval_id)
+    if item is not None:
+        # Deterministic primary path: the source was fixed when the
+        # approval was created, so routing cannot flip on a transient
+        # parked-context backend outage.
+        if item.source is not ApprovalSource.PARKED_CONTEXT:
+            return False
+    else:
+        # Fallback only: the decision was just persisted by the caller,
+        # so a missing item is unexpected. Probe the gate to avoid
+        # stranding a possibly-parked approval in the review gate.
+        gate = app_state.approval_gate
+        if gate is None:
+            return False
+        try:
+            has_parked = await gate.has_parked_context(approval_id)
+        except MemoryError, RecursionError:
+            raise
+        except Exception as exc:
+            logger.warning(
+                APPROVAL_GATE_RESUME_FAILED,
+                approval_id=approval_id,
+                error_type=type(exc).__name__,
+                error=safe_error_description(exc),
+                note="approval item missing; parked-context probe failed",
+            )
+            # Indeterminate: a parked context may still exist, so do
+            # NOT fall through to the review gate (double-handle).
+            return True
+        if not has_parked:
+            return False
     try:
-        resumed = await approval_gate.resume_context(approval_id)
+        await app_state.worker_execution_service.dispatch_resume(
+            approval_id=approval_id,
+            approved=approved,
+            decided_by=decided_by,
+            decision_reason=decision_reason,
+        )
     except MemoryError, RecursionError:
         raise
-    except Exception:
-        logger.warning(
+    except AgentRuntimeNotConfiguredError:
+        # A runtime-misconfiguration failure means the parked run can
+        # NEVER resume (no engine/provider to resume into). Swallowing
+        # it and returning True would mark the approval handled while
+        # the work is silently stranded. Propagate so the controller
+        # surfaces the real error instead of a false success.
+        logger.error(
             APPROVAL_GATE_RESUME_FAILED,
             approval_id=approval_id,
-            error="Failed to resume parked context",
+            note="resume dispatch failed -- runtime not configured",
         )
-        # Resume lookup failed -- do NOT fall through to review
-        # gate, because the parked context may still exist.
-        return True
-
-    if resumed is not None:
-        _context, parked_id = resumed
-        logger.info(
-            APPROVAL_GATE_RESUME_CONTEXT_LOADED,
+        raise
+    except Exception as exc:
+        # A transient dispatch failure (e.g. background-spawn hiccup)
+        # must not 5xx the approve/reject response and must still
+        # suppress the review-gate fall-through (the parked record is
+        # intact -- resume_context has not run on this path -- so the
+        # operator can re-trigger). Distinct from the hard
+        # runtime-misconfiguration case re-raised above.
+        logger.error(
+            APPROVAL_GATE_RESUME_FAILED,
             approval_id=approval_id,
-            parked_id=parked_id,
-            approved=approved,
-            note=(
-                "Parked context loaded -- agent re-execution "
-                "requires external orchestration"
-            ),
+            error_type=type(exc).__name__,
+            error=safe_error_description(exc),
+            note="resume dispatch failed",
         )
         return True
-    return False
+    return True


 async def preflight_review_gate(
@@ -264,13 +352,15 @@ async def signal_resume_intent( # noqa: PLR0913
     )

     # Flow 1: mid-execution parking.
-    approval_gate = app_state.approval_gate
-    if approval_gate is not None:
-        handled = await try_mid_execution_resume(
-            approval_gate, approval_id, approved=approved
-        )
-        if handled:
-            return
+    handled = await try_mid_execution_resume(
+        app_state,
+        approval_id,
+        approved=approved,
+        decided_by=decided_by,
+        decision_reason=decision_reason,
+    )
+    if handled:
+        return

     # Flow 2: review gate -- transition task status.
     review_gate = app_state.review_gate_service
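Reviewer sketch: the routing rule in `try_mid_execution_resume` reduces to a small decision function, which makes the three outcomes easy to check in isolation. This is an illustration under assumptions, not the project's code: `owns_decision` and the `REVIEW_GATE` member are hypothetical (only `PARKED_CONTEXT` appears in the diff), `probe` stands in for `has_parked_context`, and the dispatch step is left out.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable
from enum import Enum


class ApprovalSource(Enum):
    PARKED_CONTEXT = "parked_context"
    REVIEW_GATE = "review_gate"  # hypothetical name for "anything else"


async def owns_decision(
    source: ApprovalSource | None,
    probe: Callable[[], Awaitable[bool]] | None,
) -> bool:
    """Return True when the mid-execution flow should handle the approval.

    ``source`` is None when the approval item could not be re-read.
    """
    if source is not None:
        # Primary path: decided purely from the persisted discriminator.
        return source is ApprovalSource.PARKED_CONTEXT
    if probe is None:
        # No gate to ask, so fall through to the review gate.
        return False
    try:
        return await probe()
    except (MemoryError, RecursionError):
        raise
    except Exception:
        # Indeterminate: claim the approval so the review gate does not
        # also act on a context that may still be parked.
        return True
```

The asymmetry is the design point: a failed probe is treated as "handled" because double-handling a parked approval is worse than asking the operator to re-trigger the resume.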