Merged
18 commits
2453285
feat: add approval workflow gates to TaskEngine (#258, #259)
Aureliolo Mar 13, 2026
5c08597
fix: address review findings across approval workflow
Aureliolo Mar 13, 2026
d4c62c5
fix: address 41 review findings across approval workflow
Aureliolo Mar 13, 2026
2454b2a
fix: resolve mypy no-any-return in test fixture
Aureliolo Mar 13, 2026
90877fd
fix: address Greptile findings in resume trigger
Aureliolo Mar 14, 2026
59fb102
fix: address PR #368 reviewer findings and boost test coverage
Aureliolo Mar 14, 2026
e6ff984
fix: address round-2 reviewer findings from PR #368
Aureliolo Mar 14, 2026
24e53df
fix: address Greptile round-3 findings in approval gate
Aureliolo Mar 14, 2026
228fedd
fix: address Copilot round-3 findings
Aureliolo Mar 14, 2026
1d0166b
fix: mypy CI failure + minor defensive improvements
Aureliolo Mar 14, 2026
acd6e17
fix: remove double park-failure log in loop_helpers
Aureliolo Mar 14, 2026
c794c11
fix: prevent ToolExecutionError escape in invoke_all
Aureliolo Mar 14, 2026
cc65e53
fix: update V6 schema version test to V7 after rebase
Aureliolo Mar 14, 2026
d71ab67
fix: restore CLAUDE.md, crash-safe V7 migration, invoker guard
Aureliolo Mar 14, 2026
58dcc2a
fix: address CodeRabbit round-4 findings
Aureliolo Mar 14, 2026
9b0638e
fix: V7 migration data loss window and minor test hardening
Aureliolo Mar 14, 2026
9b97a4a
fix: escalation action_type from metadata + V7 step-4 crash safety
Aureliolo Mar 14, 2026
b36b9ca
fix: strip input normalization, hide internal errors, test hardening
Aureliolo Mar 14, 2026
8 changes: 4 additions & 4 deletions CLAUDE.md
@@ -98,21 +98,21 @@ curl http://localhost:3000/api/v1/health # backend (via web proxy)

```text
src/ai_company/
api/ # Litestar REST + WebSocket API (controllers, guards, channels, JWT + API key auth)
api/ # Litestar REST + WebSocket API (controllers, guards, channels, JWT + API key auth, approval gate integration)
budget/ # Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
cli/ # CLI interface (future — thin API wrapper if needed)
communication/ # Message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution, meeting protocol
config/ # YAML company config loading and validation
core/ # Shared domain models, base classes, and resilience config (RetryConfig, RateLimiterConfig)
engine/ # Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy)
engine/ # Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models)
hr/ # HR engine: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)
memory/ # Persistent agent memory (pluggable MemoryBackend protocol), backends/ (Mem0 adapter: backends/mem0/), retrieval pipeline (ranking, injection, context formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (consolidation/)
persistence/ # Operational data persistence — pluggable PersistenceBackend protocol, SQLite initial (see Memory & Persistence design page)
observability/ # Structured logging, correlation tracking, log sinks
providers/ # LLM provider abstraction (LiteLLM adapter)
security/ # SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
templates/ # Pre-built company templates, personality presets, and builder
tools/ # Tool registry, built-in tools (file_system/, git, sandbox/, code_runner), MCP bridge (mcp/), role-based access
tools/ # Tool registry, built-in tools (file_system/, git, sandbox/, code_runner), MCP bridge (mcp/), role-based access, approval tool (request_human_approval)

web/ # Vue 3 + PrimeVue + Tailwind CSS dashboard
src/
@@ -151,7 +151,7 @@ web/ # Vue 3 + PrimeVue + Tailwind CSS dashboard
- **Every module** with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
- **Variable name**: always `logger` (not `_logger`, not `log`)
- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_START` from `events.execution`, `CHECKPOINT_SAVED` from `events.checkpoint`, `PERSISTENCE_CHECKPOINT_SAVED` from `events.persistence`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_GROUP_START` from `events.parallel`, 
`PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from `events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_START` from `events.execution`, `CHECKPOINT_SAVED` from `events.checkpoint`, `PERSISTENCE_CHECKPOINT_SAVED` from `events.persistence`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_GROUP_START` from `events.parallel`, 
`PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from `events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`, `APPROVAL_GATE_ESCALATION_DETECTED` from `events.approval_gate`, `APPROVAL_GATE_ESCALATION_FAILED` from `events.approval_gate`, `APPROVAL_GATE_INITIALIZED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFIED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFY_FAILED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARKED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARK_FAILED` from `events.approval_gate`, `APPROVAL_GATE_PARK_TASKLESS` from `events.approval_gate`, `APPROVAL_GATE_RESUME_STARTED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_RESUMED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_DELETE_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_TRIGGERED` from `events.approval_gate`, `APPROVAL_GATE_NO_PARKED_CONTEXT` from `events.approval_gate`, `APPROVAL_GATE_LOOP_WIRING_WARNING` from `events.approval_gate`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
⚠️ Potential issue | 🟡 Minor

Fix the punctuation typo in the logging guidance line.

Line 154 has a comma omission in the long sentence (as flagged by LanguageTool). Please add the missing comma to keep docs lint/style clean.

🧰 Tools
🪛 LanguageTool

[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...

(EG_NO_COMMA)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CLAUDE.md` at line 154, fix the punctuation in the Event names guidance
line by adding the comma that LanguageTool's EG_NO_COMMA rule flags: the
sentence currently reads "(e.g. `PROVIDER_CALL_START` from `events.provider`",
and it should read "(e.g., `PROVIDER_CALL_START` from `events.provider`". Leave
the closing "Import directly: `from ai_company.observability.events.<domain>
import EVENT_CONSTANT`" clause unchanged; its colon is correct.

- **Structured kwargs**: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
- **All error paths** must log at WARNING or ERROR with context before raising
- **All state transitions** must log at INFO
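
A minimal sketch of the last three rules above, assuming a recording stand-in for the logger that `get_logger(__name__)` returns. `RecordingLogger`, the two event constants, the `ALLOWED` table, and `transition` are hypothetical names introduced only for illustration; in the project the constants would be imported from `ai_company.observability.events.<domain>`.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical event constants (assumption): real code would import these
# from the matching ai_company.observability.events.<domain> module.
TASK_STATE_CHANGED = "task.state.changed"
TASK_TRANSITION_REJECTED = "task.transition.rejected"


@dataclass
class RecordingLogger:
    """Stand-in for the structured logger; it only records calls."""

    records: list[tuple[str, str, dict[str, Any]]] = field(default_factory=list)

    def info(self, event: str, **kwargs: Any) -> None:
        self.records.append(("info", event, kwargs))

    def warning(self, event: str, **kwargs: Any) -> None:
        self.records.append(("warning", event, kwargs))


# Variable name is always `logger`, per the guidance.
logger = RecordingLogger()

# Hypothetical transition table used only by this sketch.
ALLOWED = {("created", "assigned"), ("assigned", "in_progress")}


def transition(task_id: str, current: str, target: str) -> str:
    """Move a task to ``target``, logging as the rules require."""
    if (current, target) not in ALLOWED:
        # Error path: log with context at WARNING before raising.
        logger.warning(
            TASK_TRANSITION_REJECTED,
            task_id=task_id,
            current=current,
            target=target,
        )
        raise ValueError(f"cannot move {current} -> {target}")
    # State transition: INFO level, event constant plus structured kwargs.
    logger.info(TASK_STATE_CHANGED, task_id=task_id, previous=current, new=target)
    return target
```

The point of the sketch is the call shape: an event constant as the first argument and context as keyword arguments, never a formatted message string.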
2 changes: 1 addition & 1 deletion README.md
@@ -130,7 +130,7 @@ graph TB

## Status

Core framework complete — agent engine, multi-agent coordination, API, security, HR, memory (including Mem0 backend adapter), and budget systems are implemented. Web dashboard (Vue 3 + PrimeVue + Tailwind CSS) is built. Remaining: approval workflow gates, CLI. See the [roadmap](docs/roadmap/index.md) for details.
Core framework complete — agent engine, multi-agent coordination, API, security, HR, memory (including Mem0 backend adapter), budget systems, and approval workflow gates are implemented. Web dashboard (Vue 3 + PrimeVue + Tailwind CSS) is built. Remaining: CLI and approval resume scheduler (approvals can park agents; re-enqueue after decision requires a future scheduler). See the [roadmap](docs/roadmap/index.md) for details.

## License

5 changes: 5 additions & 0 deletions docs/design/engine.md
@@ -447,6 +447,11 @@ async run(
`timeout_seconds` is set, wraps the call in `asyncio.wait`; on expiry
the run returns with `TerminationReason.ERROR` but cost recording and
post-execution processing still occur.
When escalations are detected after tool execution (via
`ToolInvoker.pending_escalations`), the `ApprovalGate` evaluates whether
parking is needed. If so, the context is serialized via `ParkService`
and persisted when a `ParkedContextRepository` is configured; the loop
then returns a `PARKED` result.
9. **Record costs** -- records accumulated `TokenUsage` to `CostTracker` (if
available). Cost recording failures are logged but do not affect the result.
10. **Apply post-execution transitions:**
37 changes: 36 additions & 1 deletion src/ai_company/api/approval_store.py
@@ -122,6 +122,32 @@ async def save(self, item: ApprovalItem) -> ApprovalItem | None:
        self._items[item.id] = item
        return item

    async def save_if_pending(
        self,
        item: ApprovalItem,
    ) -> ApprovalItem | None:
        """Conditionally update an approval item if it is still pending.

        A lazy expiration check is applied before comparing status.

        Returns:
            The saved item on success, or ``None`` if:

            * no item with the given ID exists in the store,
            * the stored item has expired, or
            * the stored item is no longer ``PENDING`` (e.g. a
              concurrent decision was made).
        """
        current = self._items.get(item.id)
        if current is None:
            return None
        # Apply lazy expiration check before comparing status.
        current = self._check_expiration(current)
        if current.status != ApprovalStatus.PENDING:
            return None
        self._items[item.id] = item
        return item

    def _check_expiration(self, item: ApprovalItem) -> ApprovalItem:
        """Lazily expire a pending item past its ``expires_at``.

@@ -148,6 +174,15 @@ def _check_expiration(self, item: ApprovalItem) -> ApprovalItem:
                approval_id=item.id,
            )
            if self._on_expire is not None:
                self._on_expire(expired)
                try:
                    self._on_expire(expired)
                except MemoryError, RecursionError:
critical

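The unparenthesized form except MemoryError, RecursionError: is only valid on Python 3.14 and later (PEP 758); on Python 3.13 and earlier it raises a SyntaxError. Unless the project requires Python 3.14+, use except (MemoryError, RecursionError):, which parses on every Python 3 version.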

Suggested change
except MemoryError, RecursionError:
except (MemoryError, RecursionError):

                    raise
                except Exception:
                    logger.exception(
                        API_APPROVAL_EXPIRED,
                        approval_id=item.id,
                        note="on_expire callback failed",
                    )
Comment on lines +178 to +186
Lazy expiration not written back to _items inside save_if_pending

When _check_expiration determines the stored item has expired, it returns an expired copy but does not update self._items. After save_if_pending returns None, self._items[item.id] still holds the original PENDING item. Any subsequent call to get() (or another save_if_pending) would retrieve the stale PENDING entry and re-run the expiry check, so the on_expire callback would fire a second time.

If get() also applies lazy expiration this is harmless. If it doesn't, callers that check the status after a failed save_if_pending could observe an inconsistent PENDING state.

Consider writing back the expired record when expiration fires:

current = self._check_expiration(current)
if current.status != ApprovalStatus.PENDING:
    # Write back the expired state so subsequent reads are consistent.
    self._items[item.id] = current
    return None
Comment on lines +177 to +186
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if MemoryError is imported or shadowed in this file
rg -n "import.*MemoryError|from.*import.*MemoryError" src/ai_company/api/approval_store.py

# Check if there's a pattern of catching built-in MemoryError elsewhere
rg -nC2 "except.*MemoryError" src/ai_company --type py | head -40

Repository: Aureliolo/synthorg

Length of output: 2521


🏁 Script executed:

#!/bin/bash
# Get the imports at the top of approval_store.py
head -50 src/ai_company/api/approval_store.py

# Check if ai_company.memory.errors is imported anywhere
rg "from ai_company.memory.errors|import.*MemoryError" src/ai_company/api/approval_store.py

# Get the full context of the method containing the suspect code
sed -n '160,190p' src/ai_company/api/approval_store.py

Repository: Aureliolo/synthorg

Length of output: 2396


🏁 Script executed:

#!/bin/bash
# Verify if ai_company.memory.errors.MemoryError exists in codebase
find src -name "*.py" -exec grep -l "class MemoryError" {} \;

# Check ai_company.memory.errors module
fd "memory" src/ai_company --type d

Repository: Aureliolo/synthorg

Length of output: 116


🏁 Script executed:

#!/bin/bash
# Check if MemoryError is actually defined in ai_company.memory.errors
cat src/ai_company/memory/errors.py

Repository: Aureliolo/synthorg

Length of output: 1471


🏁 Script executed:

#!/bin/bash
# Check if approval_store.py imports from ai_company.memory.errors
rg "from ai_company.memory|import.*MemoryError" src/ai_company/api/approval_store.py

# Check the complete import section of approval_store.py
sed -n '1,30p' src/ai_company/api/approval_store.py

Repository: Aureliolo/synthorg

Length of output: 877


Add logging before re-raising MemoryError and RecursionError errors.

The exception handler re-raises MemoryError and RecursionError without logging context first. Per guidelines, all error paths must log at WARNING or ERROR with context before raising.

try:
    self._on_expire(expired)
except MemoryError, RecursionError:
    raise
except Exception:
    logger.exception(
        API_APPROVAL_EXPIRED,
        approval_id=item.id,
        note="on_expire callback failed",
    )

Log before the re-raise on line 179:

except MemoryError, RecursionError:
    logger.error(
        API_APPROVAL_EXPIRED,
        approval_id=item.id,
        note="on_expire callback failed - non-recoverable error",
    )
    raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/api/approval_store.py` around lines 177 - 186, The except
block that re-raises MemoryError and RecursionError should log context before
re-raising: in the try around self._on_expire(expired) catch the non-recoverable
errors as a tuple (except (MemoryError, RecursionError):), call
logger.error(API_APPROVAL_EXPIRED, approval_id=item.id, note="on_expire callback
failed - non-recoverable error") and then re-raise; keep the existing broad
except Exception block that logs via logger.exception unchanged.
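
A runnable sketch of the handler shape this comment asks for, under the assumption that a plain list can stand in for the structured logger. `run_on_expire` and `events` are hypothetical names; the parenthesized tuple is used because it parses on every Python 3 version, while the unparenthesized form in the diff needs Python 3.14 or later (PEP 758).

```python
from collections.abc import Callable

# Stand-in for structured log output: (level, approval_id) pairs.
events: list[tuple[str, str]] = []


def run_on_expire(callback: Callable[[str], None], approval_id: str) -> None:
    """Invoke an expiry callback without letting ordinary failures escape."""
    try:
        callback(approval_id)
    except (MemoryError, RecursionError):
        # Non-recoverable: record context first, then propagate.
        events.append(("error", approval_id))
        raise
    except Exception:
        # Ordinary failure: record it and let the caller carry on.
        events.append(("exception", approval_id))
```

Ordering matters here: `MemoryError` and `RecursionError` are both subclasses of `Exception`, so the narrower clause has to come first or the broad handler would swallow them.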

            return expired
        return item