Aureliolo · May 15, 2026
@@ -19,7 +19,7 @@ SynthOrg is a Python framework for building **synthetic organizations**, autonom
 
 Define your company in YAML. Agents collaborate through a message bus, follow workflows (Kanban, Agile sprints, or custom), track costs against budgets, and produce real artifacts. The framework is provider-agnostic (<!--RS:providers_via_litellm-->2700+<!--/RS--> LLMs via [LiteLLM](https://github.com/BerriAI/litellm)), configuration-driven ([Pydantic v2](https://docs.pydantic.dev/) models), and designed for the full autonomy spectrum, from human approval on every action to fully autonomous operation.
 
-> **Early access.** Core subsystems are built and tested (<!--RS:tests-->29,000+<!--/RS--> tests, 80%+ coverage). APIs may change between releases. See the [roadmap](https://synthorg.io/docs/roadmap/) for what's next.
+> **Early access.** Core subsystems are built and tested (<!--RS:tests-->30,000+<!--/RS--> tests, 80%+ coverage). APIs may change between releases. See the [roadmap](https://synthorg.io/docs/roadmap/) for what's next.
 
 ## Why SynthOrg?
 

@@ -113,8 +113,8 @@ competitors:
       memory: {support: full, note: "Pluggable architecture with 3 backends (Mem0, composite, in-memory); 5 memory types (working, episodic, semantic, procedural, social); hybrid retrieval (dense + BM25 sparse, linear-weighted fusion)"}
       tool_use: {support: full, note: "MCP protocol, sandbox isolation, invocation tracking; 14 ToolCategory values (file_system, code_execution, version_control, web, database, terminal, design, communication, analytics, deployment, memory, ontology, mcp, other)"}
       human_in_loop: {support: full, note: "Approval gates, review workflows, 4 autonomy tiers (FULL, SEMI, SUPERVISED, LOCKED), two-stage safety classifier, LLM fallback evaluator"}
-      budget_tracking: {support: partial, note: "Per-token, per-agent, hierarchical cascades, CFO optimization, and automatic model downgrade shipped; risk-unit action budgets still in progress"}
-      security_model: {support: partial, note: "Rule engine + LLM evaluator, progressive trust, audit trail, output scanning, and hallucination detection shipped; self-healing SSRF still in progress"}
+      budget_tracking: {support: full, note: "Per-token, per-agent, hierarchical cascades, CFO optimization, automatic model downgrade, and risk-unit action budgets"}
+      security_model: {support: full, note: "Rule engine + LLM evaluator, progressive trust, audit trail, output scanning, hallucination detection, and self-healing SSRF validation"}
       observability: {support: full, note: "Structured logging, correlation tracking, log shipping, redaction, Prometheus metrics, OTLP"}
       web_dashboard: {support: full, note: "React 19 dashboard with org chart, tasks, budgets, workflow editor, setup wizard, WebSocket + SSE resilience"}
       cli: {support: partial, note: "Go binary for container management and verification; not used for agent orchestration"}

@@ -1,18 +1,18 @@
 schema_version: 1
-last_generated_utc: '2026-05-13T19:13:08Z'
-generator_revision: 3e69976a8
+last_generated_utc: '2026-05-15T15:49:25Z'
+generator_revision: d29621464
 stats:
   tests:
-    raw: 30090
+    raw: 30536
     rounded: 30000
     display: 30,000+
   mem0_stars:
-    raw: 55598
+    raw: 55792
     rounded: 55000
     display: 55k+
   providers_curated:
-    raw: 19
-    display: '19'
+    raw: 20
+    display: '20'
   providers_via_litellm:
     raw: 2708
     display: 2700+
@@ -25,7 +25,7 @@ stats:
 sources:
   tests: uv run python -m pytest --collect-only -q
   mem0_stars: gh api repos/mem0ai/mem0 --jq .stargazers_count
-  providers_curated: synthorg.providers.presets.list_presets
+  providers_curated: synthorg.providers.presets.list_featured_presets
   providers_via_litellm: len(litellm.model_cost)
   subagents: glob .claude/agents/*.md
   convention_gates: glob scripts/check_*.py
@@ -103,7 +103,7 @@ Four enforcement strategies are defined behind `QuadraticEnforcementStrategy`:
 | `hard_block` | Reject new connections when `max_agent_connections` exceeded |
 | `disabled` | No detection or enforcement |
 
-Only `alert_only` ships today; the other three are defined in config but not yet implemented. See [Security -> Quadratic Communication Enforcement](security.md#quadratic-communication-enforcement).
+`alert_only` is the shipped enforcement strategy. `soft_throttle`, `hard_block`, and `disabled` are defined in config; the dispatch shape is in place but the per-mode behaviour is not yet wired into `MessageBus.publish`. See [Security -> Quadratic Communication Enforcement](security.md#quadratic-communication-enforcement).
 
 ## Configuration Summary
 

@@ -293,16 +293,25 @@ human decision.
 
 !!! info "Design decisions ([Decision Log](../architecture/decisions.md) D9, D10)"
 
+    Each decision below names the protocol that ships today and the
+    concrete `Initial strategy` that the default factory wires. "Initial
+    strategy" is the shipped default, not aspirational scaffolding;
+    operators replace it by registering an alternative strategy on the
+    relevant factory.
+
     - **D9: Task Reassignment.** Pluggable `TaskReassignmentStrategy` protocol. Initial
-      strategy: queue-return; tasks return to unassigned queue, existing `TaskRoutingService`
-      re-routes with priority boost for reassigned tasks. Future strategies:
-      same-department/lowest-load, manager-decides (LLM), HR agent decides.
+      strategy: queue-return (concrete: `QueueReturnStrategy` in
+      `src/synthorg/hr/queue_return_strategy.py`); tasks return to unassigned queue,
+      existing `TaskRoutingService` re-routes with priority boost for reassigned tasks.
+      Future strategies on the backlog: same-department / lowest-load, manager-decides
+      (LLM), HR agent decides.
     - **D10: Memory Archival.** Pluggable `MemoryArchivalStrategy` protocol. Initial
-      strategy: full snapshot, read-only. Pipeline: retrieve all memories, archive to
-      `ArchivalStore`, selectively promote semantic+procedural memories to
+      strategy: full snapshot, read-only (concrete: `FullSnapshotStrategy` in
+      `src/synthorg/hr/full_snapshot_strategy.py`). Pipeline: retrieve all memories,
+      archive to `ArchivalStore`, selectively promote semantic+procedural memories to
       `OrgMemoryBackend` (rule-based), clean hot store, mark agent TERMINATED. Rehiring
-      restores archived memories into a new `AgentIdentity`. Future strategies: selective
-      discard, full-accessible.
+      restores archived memories into a new `AgentIdentity`. Future strategies on the
+      backlog: selective discard, full-accessible.
 
 ## Performance Tracking
 
@@ -486,6 +495,11 @@ Agents can move between seniority levels based on performance:
 
 !!! info "Design decisions ([Decision Log](../architecture/decisions.md) D13, D14, D15)"
 
+    Each decision below names the protocol that ships today and the
+    concrete `Initial strategy` that the default factory wires. "Initial
+    strategy" is the shipped default; operators substitute via the
+    factory.
+
     - **D13: Promotion Criteria.** Pluggable `PromotionCriteriaStrategy` protocol. Initial
       strategy: configurable threshold gates. `ThresholdEvaluator` with
       `min_criteria_met: int` (N of M) + `required_criteria: list[str]`. Setting `min=total`

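The D13 gate above combines an N-of-M count with a list of mandatory criteria. A minimal sketch of that rule follows, assuming criteria arrive as a name-to-bool mapping. `ThresholdGate`, `passes`, and the criterion names are hypothetical stand-ins, not the shipped `ThresholdEvaluator` API; only `min_criteria_met` and `required_criteria` come from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ThresholdGate:
    """Hypothetical stand-in for the documented threshold-gate config."""

    min_criteria_met: int
    required_criteria: list[str] = field(default_factory=list)

    def passes(self, results: dict[str, bool]) -> bool:
        # Every required criterion must be present and satisfied.
        if not all(results.get(name, False) for name in self.required_criteria):
            return False
        # At least N of the M evaluated criteria must be satisfied.
        return sum(results.values()) >= self.min_criteria_met


# Criterion names below are illustrative only.
results = {"quality": True, "velocity": False, "tenure": True}
gate = ThresholdGate(min_criteria_met=2, required_criteria=["quality"])
print(gate.passes(results))
```

In this sketch, setting `min_criteria_met` to the number of evaluated criteria makes the gate behave as a strict AND, while a value of 1 with an empty `required_criteria` behaves as an OR.
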
@@ -289,11 +289,11 @@ placeholder factories:
   through `safe_error_description(exc)` (SEC-1) and `domain_code` falls back
   to `exc.domain_code` when present.
 - `not_supported(tool_name, reason)`: stable `status="error"` /
-  `domain_code="not_supported"` envelope for tools whose service facade is
-  not yet wired. Emits the `MCP_HANDLER_NOT_IMPLEMENTED` WARNING event so
-  operators can alert on unwired tools. After META-MCP-2 every tool is
-  wired, so this path only fires for tools registered after PR1 that have
-  not yet been given a concrete handler.
+  `domain_code="not_supported"` envelope for tools whose service facade
+  is not wired. Emits the `MCP_HANDLER_NOT_IMPLEMENTED` WARNING event so
+  operators can alert on unwired tools. Every tool registered today is
+  wired; this path fires only for newly registered tools that have not
+  been given a concrete handler.
 - `service_fallback(tool_name, reason)`: helper retained in `common.py`
   for future surgical use. Emits `MCP_HANDLER_SERVICE_FALLBACK`;
   META-MCP-2 removed every call site and the integration sweep at

@@ -134,6 +134,37 @@ intake strategy contracts.
 
 ---
 
+## Order of Operations
+
+The four quality and approval surfaces (verification stage, review
+pipeline, mid-execution `AUTH_REQUIRED` park, post-completion
+`IN_REVIEW` gate) operate at distinct points in the task lifecycle.
+
+| Phase | Surface | Trigger | Task status during | Exit | Where documented |
+|-------|---------|---------|--------------------|------|------------------|
+| Mid-execution | `AUTH_REQUIRED` park | Agent calls a tool that requires approval at runtime (e.g. `deploy`, `db:admin`). Driven by `ApprovalGate` middleware. | `AUTH_REQUIRED` | Approved: returns to `ASSIGNED`. Denied / timeout: `CANCELLED`. | [Security: Approval Workflow](security.md#approval-workflow) |
+| Agent done | Verification stage | Workflow blueprint has a `VERIFICATION` control-flow node. Runs as a separate evaluator agent with its own context. | `IN_PROGRESS` (engine-internal) | Pass: continue to next node. Fail: regenerate. Refer: hand to human via `VERIFICATION_REFER` edge. | This page, [Workflow Node and Edge Types](#workflow-node-and-edge-types) |
+| Agent done | Review pipeline | Task transitions `IN_PROGRESS` to `IN_REVIEW`. Chain of `ReviewStage` instances runs. | `IN_REVIEW` | First-failing stage returns the task to `IN_PROGRESS`; all-pass moves to `COMPLETED`. | This page, [Review Pipeline](#review-pipeline) |
+
+Key invariants:
+
+- `AUTH_REQUIRED` is the mid-execution park reason and uses the
+  `ApprovalGate` middleware in the agent harness. The review pipeline
+  is the post-completion quality gate and uses `ReviewGateService`.
+  The two are independent: a single task can encounter both (e.g.
+  pause for deploy approval mid-task, then enter `IN_REVIEW` once the
+  agent finishes).
+- The verification stage runs BEFORE the review pipeline when both
+  are configured for the same workflow. Verification is a workflow
+  blueprint construct (a node in the graph); the review pipeline
+  fires on the `IN_PROGRESS` to `IN_REVIEW` transition that happens
+  after the workflow's last node completes.
+- The review pipeline does not mint new `TaskStatus` values; the
+  task stays at `IN_REVIEW` throughout, with stage progress in
+  metadata.
+
+---
+
 ## See Also
 
 - [Task & Workflow Engine](engine.md): task dispatch, state coordination

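The status changes listed in the Order of Operations table can be read as a small transition map. The sketch below is illustrative only: the status names come from the table, while the event names, the `advance` helper, the `IN_PROGRESS` to `AUTH_REQUIRED` edge, and the `ASSIGNED` to `IN_PROGRESS` resume edge are assumptions added to make the walk-through complete.

```python
from enum import Enum


class TaskStatus(str, Enum):
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    AUTH_REQUIRED = "auth_required"
    IN_REVIEW = "in_review"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


# (current status, event) -> next status. Event names are hypothetical.
TRANSITIONS = {
    # Assumed edge: a running task parks when a tool needs approval.
    (TaskStatus.IN_PROGRESS, "approval_needed"): TaskStatus.AUTH_REQUIRED,
    (TaskStatus.AUTH_REQUIRED, "approved"): TaskStatus.ASSIGNED,
    (TaskStatus.AUTH_REQUIRED, "denied"): TaskStatus.CANCELLED,
    (TaskStatus.AUTH_REQUIRED, "timeout"): TaskStatus.CANCELLED,
    # Assumed edge: an approved task is picked up again.
    (TaskStatus.ASSIGNED, "started"): TaskStatus.IN_PROGRESS,
    (TaskStatus.IN_PROGRESS, "agent_done"): TaskStatus.IN_REVIEW,
    (TaskStatus.IN_REVIEW, "stage_failed"): TaskStatus.IN_PROGRESS,
    (TaskStatus.IN_REVIEW, "all_stages_passed"): TaskStatus.COMPLETED,
}


def advance(status: TaskStatus, event: str) -> TaskStatus:
    """Return the next status, or raise if the table defines no such edge."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"no transition from {status.name} on {event!r}") from None


# A single task that hits both surfaces: approval park, then review.
status = TaskStatus.IN_PROGRESS
for event in ("approval_needed", "approved", "started", "agent_done", "all_stages_passed"):
    status = advance(status, event)
print(status.name)
```

The map has no per-stage review statuses, which mirrors the invariant that the review pipeline keeps the task at `IN_REVIEW` and records stage progress in metadata.
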
@@ -93,11 +93,11 @@ Current behavior (`delete_agent` in `src/synthorg/api/services/_org_agent_mutati
 2. The agent record is removed from the active org configuration.
 3. A company snapshot is persisted and an `API_AGENT_DELETED` event is logged and broadcast on the `agents` WebSocket channel.
 
-Planned (not yet implemented): automated task reassignment via `TaskReassignmentStrategy`, memory archival via `MemoryArchivalStrategy`, selective promotion to `OrgMemoryBackend`, and an explicit `TERMINATED` lifecycle state. Until those land, fires are best paired with manual task reassignment before the DELETE call.
+Not yet wired into the DELETE flow: automated task reassignment via `TaskReassignmentStrategy` (concrete: `QueueReturnStrategy` in `src/synthorg/hr/queue_return_strategy.py`), memory archival via `MemoryArchivalStrategy` (concrete: `FullSnapshotStrategy` in `src/synthorg/hr/full_snapshot_strategy.py`), selective promotion to `OrgMemoryBackend`, and an explicit `TERMINATED` lifecycle state. The strategies exist as part of the offboarding-service shape but the API DELETE handler does not invoke them; fires are best paired with manual task reassignment before the DELETE call.
 
-## Rehiring from archive (planned)
+## Rehiring from archive
 
-A dedicated `POST /api/v1/agents/{agent_name}/rehire` endpoint (which would restore archived memory into a new identity with a fresh hire date and version chain) is **not yet implemented** in the agents controller. Until it ships, rehiring is a manual two-step: list archived agents via the existing listing, then recreate with `POST /api/v1/agents` using a fresh `CreateAgentOrgRequest` payload; memory restoration is performed out-of-band through the Memory Admin API. This planned surface sits alongside the same lifecycle automation called out in [Firing](#firing).
+A dedicated `POST /api/v1/agents/{agent_name}/rehire` endpoint (which would restore archived memory into a new identity with a fresh hire date and version chain) is not implemented in the agents controller. Rehiring is a manual two-step today: list archived agents via the existing listing, then recreate with `POST /api/v1/agents` using a fresh `CreateAgentOrgRequest` payload; memory restoration is performed out-of-band through the Memory Admin API. The endpoint sits alongside the same lifecycle automation called out in [Firing](#firing); progress is tracked on the GitHub issue tracker.
 
 ## Lifecycle events (WebSocket)
 

@@ -275,7 +275,7 @@ curl "http://localhost:3001/api/v1/budget/records?task_id=${TASK_ID}" \
 
 The response includes `data` (paginated records), `daily_summary` (per-day totals aggregated across ALL matching records, not just the page), and `period_summary` (overall totals + computed `avg_cost`).
 
-Supported query parameters: `agent_id`, `task_id`, `offset`, `limit`. Additional slicing (by provider, model, tag, project, date range) is done client-side from the paginated response today; a dedicated report-generation endpoint is planned but not yet implemented.
+Supported query parameters: `agent_id`, `task_id`, `offset`, `limit`. Additional slicing (by provider, model, tag, project, date range) is done client-side from the paginated response. A dedicated report-generation endpoint with server-side slicing is tracked on the GitHub issue tracker.
 
 ### Budget alert webhook integration
 

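Because slicing by provider, model, tag, project, or date range happens client-side, a caller typically collects the `data` pages and aggregates them locally. A minimal sketch follows, assuming each record is a dict; the `provider`, `model`, and `cost` field names and the `total_cost_by` helper are hypothetical, since the record schema is not shown here.

```python
from collections import defaultdict
from typing import Any


def total_cost_by(records: list[dict[str, Any]], key: str) -> dict[str, float]:
    """Sum the (assumed) `cost` field of each record, grouped by `key`."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[str(record.get(key, "unknown"))] += float(record.get("cost", 0.0))
    return dict(totals)


# Records gathered from every page of the `data` array (illustrative values).
records = [
    {"provider": "primary-cloud", "model": "large", "cost": 0.5},
    {"provider": "primary-cloud", "model": "small", "cost": 0.25},
    {"provider": "secondary-cloud", "model": "large", "cost": 1.0},
]
print(total_cost_by(records, "provider"))
print(total_cost_by(records, "model"))
```

Since `daily_summary` and `period_summary` already cover all matching records, local aggregation like this is only needed for dimensions the query parameters do not filter on, and it is only complete once every page has been fetched.
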
@@ -188,6 +188,41 @@ Each provider lists its available models under the `models` key:
             alias: "medium"
     ```
 
+=== "Cross-provider Fallback"
+
+    Configure a secondary provider that takes over when the primary
+    rejects, rate-limits, or times out a request. The `degradation`
+    block on the primary names the fallback provider; agents resolve
+    the alias on the secondary when degradation fires.
+
+    ```yaml
+    providers:
+      primary-cloud:
+        auth_type: api_key
+        api_key: "sk-..."
+        degradation:
+          fallback_provider: secondary-cloud
+          trigger_on:
+            - rate_limit
+            - provider_timeout
+            - provider_connection
+        models:
+          - id: "example-large-001"
+            alias: "large"
+          - id: "example-small-001"
+            alias: "small"
+      secondary-cloud:
+        auth_type: api_key
+        api_key: "sk-backup-..."
+        models:
+          # Both providers expose the same alias names so the routing
+          # layer can hand off without reconfiguring agents.
+          - id: "alt-large-001"
+            alias: "large"
+          - id: "alt-small-001"
+            alias: "small"
+    ```
+
 ---
 
 ## Model Routing
@@ -207,7 +242,7 @@ The `routing` section controls how models are selected for agent tasks.
 
 ### Routing Rules
 
-Rules are evaluated in order. Each rule matches by `role_level` and/or `task_type`:
+Rules are evaluated in order. Each rule matches by `role_level` and/or `task_type`. **Validation: at least one of `role_level` or `task_type` MUST be set per rule.** A rule with both fields null is rejected at company-load time with a `ConfigValidationError`.
 
 ```yaml
 routing:
@@ -232,10 +267,6 @@ routing:
 | `preferred_model` | string | *(required)* | Preferred model alias or ID |
 | `fallback` | string | `null` | Fallback model |
 
-!!! note
-
-    At least one of `role_level` or `task_type` must be set per rule.
-
 ---
 
 ## Agents
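
The routing-rule constraint and in-order evaluation described in this hunk can be sketched as follows. The real configuration uses Pydantic v2 models; this sketch uses a stdlib dataclass to stay dependency-free, and `ConfigValidationError` here is a local stand-in class. The `select_model` helper, the exact-match semantics, and the sample `role_level` / `task_type` values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


class ConfigValidationError(ValueError):
    """Local stand-in for the framework's exception of the same name."""


@dataclass(frozen=True)
class RoutingRule:
    preferred_model: str
    role_level: Optional[str] = None
    task_type: Optional[str] = None
    fallback: Optional[str] = None

    def __post_init__(self) -> None:
        # A rule that constrains nothing is rejected at load time.
        if self.role_level is None and self.task_type is None:
            raise ConfigValidationError(
                "routing rule must set role_level and/or task_type"
            )

    def matches(self, role_level: str, task_type: str) -> bool:
        # Assumed semantics: unset fields act as wildcards.
        return (self.role_level in (None, role_level)
                and self.task_type in (None, task_type))


def select_model(rules: list[RoutingRule], role_level: str, task_type: str) -> Optional[str]:
    """Return the preferred model of the first matching rule, if any."""
    for rule in rules:
        if rule.matches(role_level, task_type):
            return rule.preferred_model
    return None


rules = [
    RoutingRule(preferred_model="large", role_level="senior", task_type="review"),
    RoutingRule(preferred_model="small", task_type="review"),
]
print(select_model(rules, "junior", "review"))
```

Because the first match wins, more specific rules belong before broader ones in the list.
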
@@ -301,20 +332,27 @@ Operational data persistence (tasks, cost records, messages, workflows, audit en
 persistence:
   backend: "postgres"              # "sqlite" (default) or "postgres"
   sqlite:
-    path: "/data/synthorg.db"    # file path; used when backend == "sqlite"
-  postgres:                        # used when backend == "postgres"
+    path: "/data/synthorg.db"      # file path; used when backend == "sqlite"
+    wal_mode: true                 # enable WAL journal mode (default)
+    journal_size_limit: 67108864   # WAL journal cap in bytes (default 64 MB)
+  postgres:                        # required when backend == "postgres"
     host: "db.internal"
-    port: 5432
+    port: 5432                     # default 5432
     database: "synthorg"
     username: "synthorg_app"
-    password: "${POSTGRES_PASSWORD}"  # SecretStr -- redacted from logs
-    ssl_mode: "verify-full"          # prefer verify-full in production
-    pool_min_size: 1
-    pool_max_size: 10
-    pool_timeout_seconds: 30.0
-    application_name: "synthorg"
-    statement_timeout_ms: 30000
-    connect_timeout_seconds: 10.0
+    password: "${POSTGRES_PASSWORD}"  # SecretStr; redacted from logs
+    ssl_mode: "verify-full"        # default "require"; prefer "verify-full" in prod
+    pool_min_size: 1               # default 1
+    pool_max_size: 10              # default 10; must be >= pool_min_size
+    pool_timeout_seconds: 30.0     # default 30.0
+    application_name: "synthorg"   # appears in pg_stat_activity
+    statement_timeout_ms: 30000    # default 30000 (0 disables)
+    connect_timeout_seconds: 10.0  # default 10.0
+    # TimescaleDB hypertable support (Apache-2.0 features only).
+    # Not available on managed Postgres providers (RDS, Cloud SQL, Azure).
+    enable_timescaledb: false
+    cost_records_chunk_interval: "1 day"
+    audit_entries_chunk_interval: "1 day"
 ```
 
 The Postgres backend requires the optional extra: install with

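The cross-provider fallback tab added in this file relies on both providers exposing the same alias names. A minimal sketch of that hand-off follows, assuming the parsed YAML is available as plain dicts; `resolve_model` and its `failure` argument are hypothetical and do not represent the framework's routing layer.

```python
from typing import Any, Optional

# Mirrors the YAML example from the "Cross-provider Fallback" tab.
PROVIDERS: dict[str, dict[str, Any]] = {
    "primary-cloud": {
        "degradation": {
            "fallback_provider": "secondary-cloud",
            "trigger_on": ["rate_limit", "provider_timeout", "provider_connection"],
        },
        "models": [
            {"id": "example-large-001", "alias": "large"},
            {"id": "example-small-001", "alias": "small"},
        ],
    },
    "secondary-cloud": {
        "models": [
            {"id": "alt-large-001", "alias": "large"},
            {"id": "alt-small-001", "alias": "small"},
        ],
    },
}


def resolve_model(
    providers: dict[str, dict[str, Any]],
    provider: str,
    alias: str,
    failure: Optional[str] = None,
) -> tuple[str, str]:
    """Return (provider, model id) for the next attempt at `alias`."""
    config = providers[provider]
    degradation = config.get("degradation") or {}
    # Only failure kinds listed in trigger_on switch to the fallback provider.
    if failure is not None and failure in degradation.get("trigger_on", []):
        provider = degradation["fallback_provider"]
        config = providers[provider]
    for model in config["models"]:
        if model["alias"] == alias:
            return provider, model["id"]
    raise KeyError(f"alias {alias!r} is not defined on provider {provider!r}")


print(resolve_model(PROVIDERS, "primary-cloud", "large"))
print(resolve_model(PROVIDERS, "primary-cloud", "large", failure="rate_limit"))
```

The final `KeyError` shows why the shared alias names matter: if the secondary provider lacked a `large` alias, the hand-off would have nothing to resolve to.
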
@@ -368,9 +368,9 @@ curl -X DELETE http://localhost:3001/api/v1/admin/memory/fine-tune/checkpoints/$
   -H "Cookie: ${SESSION}"
 ```
 
-### Planned admin endpoints
+### Admin endpoints on the backlog
 
-Consolidation, reindex, procedural-skill management, and organization-memory promotion are described in the [Memory design page](../design/memory.md#consolidation-and-retention) but are not yet exposed as REST endpoints. Track progress against the Memory roadmap; in the meantime these operations happen on agent-lifecycle boundaries (consolidation cycles, startup reindex, procedural-memory auto-generation).
+Consolidation, reindex, procedural-skill management, and organization-memory promotion are described in the [Memory design page](../design/memory.md#consolidation-and-retention) but are not exposed as REST endpoints today. These operations happen on agent-lifecycle boundaries (consolidation cycles, startup reindex, procedural-memory auto-generation); a dedicated admin surface is tracked on the GitHub issue tracker under the memory label.
 
 ---