Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -113,7 +113,7 @@ curl http://localhost:3000/api/v1/health # backend (via web proxy)

```text
src/synthorg/
api/ # Litestar REST + WebSocket API (controllers, guards, channels, JWT + API key auth, approval gate integration, coordination endpoint, collaboration endpoint, RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail))
api/ # Litestar REST + WebSocket API (controllers, guards, channels, JWT + API key auth, approval gate integration, coordination endpoint, collaboration endpoint, RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation))
budget/ # Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
cli/ # Python CLI module (superseded by top-level cli/ Go binary)
communication/ # Message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution
Expand Down Expand Up @@ -191,7 +191,7 @@ site/ # Astro landing page (synthorg.io)
- **Every module** with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
- **Variable name**: always `logger` (not `_logger`, not `log`)
- **Event names**: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `MEETING_SCHEDULER_STARTED` from `events.meeting`, `MEETING_SCHEDULER_ERROR` from `events.meeting`, `MEETING_SCHEDULER_STOPPED` from `events.meeting`, `MEETING_PERIODIC_TRIGGERED` from `events.meeting`, `MEETING_EVENT_TRIGGERED` from `events.meeting`, `MEETING_PARTICIPANTS_RESOLVED` from `events.meeting`, `MEETING_NO_PARTICIPANTS` from `events.meeting`, `MEETING_NOT_FOUND` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_REQUEST_COMPLETED` from `events.api`, `API_REQUEST_ERROR` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `API_HEALTH_CHECK` from `events.api`, `API_COORDINATION_STARTED` from `events.api`, `API_COORDINATION_COMPLETED` from `events.api`, `API_COORDINATION_FAILED` from `events.api`, `API_COORDINATION_AGENT_RESOLVE_FAILED` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `PERF_LLM_SAMPLE_STARTED` from `events.performance`, `PERF_LLM_SAMPLE_COMPLETED` from `events.performance`, `PERF_LLM_SAMPLE_FAILED` from `events.performance`, `PERF_OVERRIDE_SET` from `events.performance`, `PERF_OVERRIDE_CLEARED` from `events.performance`, `PERF_OVERRIDE_APPLIED` from `events.performance`, `PERF_OVERRIDE_EXPIRED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `MEMORY_RRF_FUSION_COMPLETE` from `events.memory`, `MEMORY_RRF_VALIDATION_FAILED` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COORDINATION_FACTORY_BUILT` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_START` from `events.execution`, `CHECKPOINT_SAVED` from `events.checkpoint`, `PERSISTENCE_CHECKPOINT_SAVED` from `events.persistence`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_GROUP_START` from `events.parallel`, `PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from `events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `TOOL_OUTPUT_WITHHELD` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`, `APPROVAL_GATE_ESCALATION_DETECTED` from `events.approval_gate`, `APPROVAL_GATE_ESCALATION_FAILED` from `events.approval_gate`, `APPROVAL_GATE_INITIALIZED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFIED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFY_FAILED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARKED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARK_FAILED` from `events.approval_gate`, `APPROVAL_GATE_PARK_TASKLESS` from `events.approval_gate`, `APPROVAL_GATE_RESUME_STARTED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_RESUMED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_DELETE_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_TRIGGERED` from `events.approval_gate`, `APPROVAL_GATE_NO_PARKED_CONTEXT` from `events.approval_gate`, `APPROVAL_GATE_LOOP_WIRING_WARNING` from `events.approval_gate`, `STAGNATION_CHECK_PERFORMED` from `events.stagnation`, `STAGNATION_DETECTED` from `events.stagnation`, `STAGNATION_CORRECTION_INJECTED` from `events.stagnation`, `STAGNATION_TERMINATED` from `events.stagnation`, `PERSISTENCE_AGENT_STATE_SAVED` from `events.persistence`, `PERSISTENCE_AGENT_STATE_FETCHED` from `events.persistence`, `PERSISTENCE_AGENT_STATE_ACTIVE_QUERIED` from `events.persistence`, `PERSISTENCE_AGENT_STATE_DELETED` from `events.persistence`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
- **Event names**: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `MEETING_SCHEDULER_STARTED` from `events.meeting`, `MEETING_SCHEDULER_ERROR` from `events.meeting`, `MEETING_SCHEDULER_STOPPED` from `events.meeting`, `MEETING_PERIODIC_TRIGGERED` from `events.meeting`, `MEETING_EVENT_TRIGGERED` from `events.meeting`, `MEETING_PARTICIPANTS_RESOLVED` from `events.meeting`, `MEETING_NO_PARTICIPANTS` from `events.meeting`, `MEETING_NOT_FOUND` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_REQUEST_COMPLETED` from `events.api`, `API_REQUEST_ERROR` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `API_HEALTH_CHECK` from `events.api`, `API_COORDINATION_STARTED` from `events.api`, `API_COORDINATION_COMPLETED` from `events.api`, `API_COORDINATION_FAILED` from `events.api`, `API_COORDINATION_AGENT_RESOLVE_FAILED` from `events.api`, `API_CONTENT_NEGOTIATED` from `events.api`, `API_CORRELATION_FALLBACK` from `events.api`, `API_ACCEPT_PARSE_FAILED` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `PERF_LLM_SAMPLE_STARTED` from `events.performance`, `PERF_LLM_SAMPLE_COMPLETED` from `events.performance`, `PERF_LLM_SAMPLE_FAILED` from `events.performance`, `PERF_OVERRIDE_SET` from `events.performance`, `PERF_OVERRIDE_CLEARED` from `events.performance`, `PERF_OVERRIDE_APPLIED` from `events.performance`, `PERF_OVERRIDE_EXPIRED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `MEMORY_RRF_FUSION_COMPLETE` from `events.memory`, `MEMORY_RRF_VALIDATION_FAILED` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COORDINATION_FACTORY_BUILT` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_START` from `events.execution`, `CHECKPOINT_SAVED` from `events.checkpoint`, `PERSISTENCE_CHECKPOINT_SAVED` from `events.persistence`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_GROUP_START` from `events.parallel`, `PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from `events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `TOOL_OUTPUT_WITHHELD` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`, `APPROVAL_GATE_ESCALATION_DETECTED` from `events.approval_gate`, `APPROVAL_GATE_ESCALATION_FAILED` from `events.approval_gate`, `APPROVAL_GATE_INITIALIZED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFIED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFY_FAILED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARKED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARK_FAILED` from `events.approval_gate`, `APPROVAL_GATE_PARK_TASKLESS` from `events.approval_gate`, `APPROVAL_GATE_RESUME_STARTED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_RESUMED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_DELETE_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_TRIGGERED` from `events.approval_gate`, `APPROVAL_GATE_NO_PARKED_CONTEXT` from `events.approval_gate`, `APPROVAL_GATE_LOOP_WIRING_WARNING` from `events.approval_gate`, `STAGNATION_CHECK_PERFORMED` from `events.stagnation`, `STAGNATION_DETECTED` from `events.stagnation`, `STAGNATION_CORRECTION_INJECTED` from `events.stagnation`, `STAGNATION_TERMINATED` from `events.stagnation`, `PERSISTENCE_AGENT_STATE_SAVED` from `events.persistence`, `PERSISTENCE_AGENT_STATE_FETCHED` from `events.persistence`, `PERSISTENCE_AGENT_STATE_ACTIVE_QUERIED` from `events.persistence`, `PERSISTENCE_AGENT_STATE_DELETED` from `events.persistence`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
- **Structured kwargs**: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
- **All error paths** must log at WARNING or ERROR with context before raising
- **All state transitions** must log at INFO
Expand Down
43 changes: 36 additions & 7 deletions docs/design/operations.md
Original file line number Diff line number Diff line change
Expand Up @@ -976,24 +976,53 @@ future CLI tool are thin clients that call the API -- they contain no business l
| `/api/v1/providers` | Model provider status, config |
| `/api/v1/ws` | WebSocket for real-time updates |

### Error Response Format (RFC 9457 Phase 1)
### Error Response Format (RFC 9457)

All error responses include structured metadata for machine consumption alongside
the existing human-readable `error` field. The `error_detail` object is present on
every error response and contains:
All error responses follow [RFC 9457 (Problem Details for HTTP APIs)](https://www.rfc-editor.org/rfc/rfc9457).
The API supports two response formats via content negotiation:

- **Default (`application/json`)**: `ApiResponse` envelope with `error_detail` object
- **RFC 9457 bare (`application/problem+json`)**: Flat `ProblemDetail` body with `Content-Type: application/problem+json`

Clients request bare RFC 9457 responses by sending `Accept: application/problem+json`.

#### ErrorDetail Fields (Envelope Format)

The `error_detail` object in the envelope contains:

| Field | Type | Description |
|-------|------|-------------|
| `message` | `str` | Human-readable error message (matches top-level `error`) |
| `detail` | `str` | Human-readable occurrence-specific explanation |
| `error_code` | `int` | Machine-readable 4-digit code (category-grouped: 1xxx=auth, 2xxx=validation, 3xxx=not_found, 4xxx=conflict, 5xxx=rate_limit, 6xxx=budget_exhausted, 7xxx=provider_error, 8xxx=internal) |
| `error_category` | `str` | High-level category: `auth`, `validation`, `not_found`, `conflict`, `rate_limit`, `budget_exhausted`, `provider_error`, `internal` |
| `retryable` | `bool` | Whether the client should retry the request |
| `retry_after` | `int \| null` | Seconds to wait before retrying (null when not applicable) |
| `instance` | `str` | Request correlation ID for log tracing |
| `title` | `str` | Static per-category title (e.g., "Authentication Error") |
| `type` | `str` | Documentation URI for the error category (e.g., `https://synthorg.io/docs/errors#auth`) |

#### ProblemDetail Fields (RFC 9457 Bare Format)

When `Accept: application/problem+json`, the response body contains:

| Field | Type | Description |
|-------|------|-------------|
| `type` | `str` | Documentation URI for the error category |
| `title` | `str` | Static per-category title |
| `status` | `int` | HTTP status code |
| `detail` | `str` | Human-readable occurrence-specific explanation |
| `instance` | `str` | Request correlation ID for log tracing |
| `error_code` | `int` | Machine-readable 4-digit error code |
| `error_category` | `str` | High-level error category |
| `retryable` | `bool` | Whether the client should retry |
| `retry_after` | `int \| null` | Seconds to wait before retrying |

Agent consumers can use `retryable` and `retry_after` for autonomous retry logic,
and `error_code` / `error_category` for programmatic error handling without parsing
message strings.
`error_code` / `error_category` for programmatic error handling without parsing
message strings, and `type` URIs for documentation lookup.

See the [Error Reference](../errors.md) for the full error taxonomy, code list,
and retry guidance.

### Web UI Features

Expand Down
Loading
Loading