Merged
64 changes: 46 additions & 18 deletions DESIGN_SPEC.md
@@ -693,29 +693,35 @@ structured_phases:
│ CREATED │
└─────┬─────┘
│ assignment
┌─────▼─────┐
┌──────│ ASSIGNED │
│ └─────┬─────┘
│ │ agent starts
┌─────▼─────┐ ┌──────────┐
┌──────│ ASSIGNED │──────────▶│ FAILED │
│ └─────┬─────┘◀───┐ └────┬─────┘
│ │ starts │ reassign │
│ ┌─────▼─────┐ │ ┌────▼─────┐
│ │IN_PROGRESS │───┼─────▶│ (retry) │
│ └─────┬─────┘ │ └──────────┘
│ │ ◀── (rework)
│ │ agent done
│ ┌─────▼─────┐
│ │IN_PROGRESS │◀──── (rework)
│ └─────┬─────┘ │
│ │ agent done │
│ ┌─────▼─────┐ │
│ │ IN_REVIEW │───────┘
│ │ IN_REVIEW │
│ └─────┬─────┘
│ │ approved
│ ┌─────▼─────┐
│ │ COMPLETED │
│ └────────────┘
│ blocked / cancelled
┌─────▼─────┐
│ BLOCKED / │
│ CANCELLED │
└────────────┘
│ blocked cancelled
┌─────▼─────┐ ┌────────────┐
│ BLOCKED │ │ CANCELLED │
└─────┬─────┘ └────────────┘
│ unblocked (terminal)
└──▶ ASSIGNED
```

> **Non-terminal states:** BLOCKED and FAILED are non-terminal — BLOCKED returns to ASSIGNED when unblocked, FAILED returns to ASSIGNED for retry (see §6.6). COMPLETED and CANCELLED are terminal states with no outgoing transitions.
>
> **Transitions into FAILED:** Both `ASSIGNED → FAILED` (early setup failures) and `IN_PROGRESS → FAILED` (runtime crashes) are valid. `FAILED → ASSIGNED` enables reassignment when `retry_count < max_retries`.
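
The transition rules in the two notes above can be sketched as a lookup table. This is a minimal, self-contained illustration rather than the project's actual module: the names `VALID_TRANSITIONS`, `can_transition`, and `is_terminal` are hypothetical, and `TaskStatus` is re-declared here as a simplified stand-in.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Simplified stand-in for the project's TaskStatus enum."""

    CREATED = "created"
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    IN_REVIEW = "in_review"
    COMPLETED = "completed"
    BLOCKED = "blocked"
    FAILED = "failed"
    CANCELLED = "cancelled"


S = TaskStatus

# Hypothetical table name; the entries mirror the lifecycle described above.
VALID_TRANSITIONS = {
    S.CREATED: frozenset({S.ASSIGNED}),
    S.ASSIGNED: frozenset({S.IN_PROGRESS, S.BLOCKED, S.CANCELLED, S.FAILED}),
    S.IN_PROGRESS: frozenset({S.IN_REVIEW, S.BLOCKED, S.CANCELLED, S.FAILED}),
    S.IN_REVIEW: frozenset({S.COMPLETED, S.IN_PROGRESS, S.BLOCKED, S.CANCELLED}),
    S.BLOCKED: frozenset({S.ASSIGNED}),  # unblocked
    S.FAILED: frozenset({S.ASSIGNED}),  # reassignment for retry
    S.COMPLETED: frozenset(),  # terminal
    S.CANCELLED: frozenset(),  # terminal
}


def can_transition(current: TaskStatus, target: TaskStatus) -> bool:
    """Return True if the lifecycle allows moving from current to target."""
    return target in VALID_TRANSITIONS[current]


def is_terminal(status: TaskStatus) -> bool:
    """A status is terminal when it has no outgoing transitions."""
    return not VALID_TRANSITIONS[status]
```

Under this sketch, `FAILED` can only move back to `ASSIGNED`, so a failed task must be reassigned before it can run again.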

Comment on lines +721 to +724
⚠️ Potential issue | 🟡 Minor

Remove the blank line inside this blockquote.

Line 721 triggers markdownlint MD028 (no-blanks-blockquote). Keep the blockquote contiguous to avoid the lint failure.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 721-721: Blank line inside blockquote

(MD028, no-blanks-blockquote)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DESIGN_SPEC.md` around lines 720 - 721, The blockquote containing
"**Non-terminal states:** BLOCKED and FAILED are non-terminal — BLOCKED returns
to ASSIGNED when unblocked, FAILED returns to ASSIGNED for retry (see §6.6).
COMPLETED and CANCELLED are terminal states with no outgoing transitions."
contains an extra blank line; remove that blank line so the blockquote lines are
contiguous (no empty line inside the quote) to satisfy markdownlint MD028.

> **Runtime wrapper (M3):** During execution, `Task` is wrapped by `TaskExecution` (in `engine/task_execution.py`). `TaskExecution` is a frozen Pydantic model that tracks status transitions via `model_copy(update=...)`, accumulates `TokenUsage` cost, and records a `StatusTransition` audit trail. The original `Task` is preserved unchanged; `to_task_snapshot()` produces a `Task` copy with the current execution status for persistence.
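
The copy-on-update pattern described in this note could look roughly like the sketch below. It uses frozen stdlib dataclasses and `dataclasses.replace` as a stand-in for the frozen Pydantic model and `model_copy(update=...)`. The method `with_transition` and the `StatusTransition` field names are assumptions for illustration, and `TokenUsage` accumulation is left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Task:
    """Simplified stand-in for the real Task model."""

    id: str
    status: str = "assigned"


@dataclass(frozen=True)
class StatusTransition:
    """One audit-trail entry (field names are assumed)."""

    from_status: str
    to_status: str
    reason: str


@dataclass(frozen=True)
class TaskExecution:
    """Immutable wrapper: every change returns a new instance."""

    task: Task
    status: str
    transitions: tuple = ()

    def with_transition(self, target: str, reason: str) -> "TaskExecution":
        # Hypothetical helper: record the move and return an updated copy.
        entry = StatusTransition(self.status, target, reason)
        return replace(self, status=target, transitions=self.transitions + (entry,))

    def to_task_snapshot(self) -> Task:
        # Copy of the original Task carrying the current execution status.
        return replace(self.task, status=self.status)


task = Task(id="t-1")
execution = TaskExecution(task=task, status=task.status)
failed = execution.with_transition("in_progress", "agent starts").with_transition(
    "failed", "runtime crash"
)
snapshot = failed.to_task_snapshot()
```

Neither `task` nor `execution` is mutated; only `failed` and `snapshot` carry the new status, which is the property the frozen wrapper is meant to guarantee.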

### 6.2 Task Definition
@@ -748,6 +754,7 @@ task:
task_structure: "parallel" # sequential, parallel, mixed (M4 — see §6.9)
budget_limit: 2.00 # max USD for this task
deadline: null
max_retries: 1 # max reassignment attempts after failure (0 = no retry)
status: "assigned"
```

@@ -952,11 +959,28 @@ When an agent execution fails unexpectedly (unhandled exception, OOM, process ki

> **MVP: Fail-and-Reassign only (Strategy 1).** Checkpoint Recovery is M4/M5.

**`RecoveryStrategy` protocol:**

| Method | Signature | Description |
|--------|-----------|-------------|
| `recover` | `async def recover(*, task_execution: TaskExecution, error_message: str, context: AgentContext) -> RecoveryResult` | Apply recovery to a failed task execution |
| `get_strategy_type` | `def get_strategy_type() -> str` | Return strategy type identifier (must not be empty) |

**`RecoveryResult` model (frozen):**

| Field | Type | Description |
|-------|------|-------------|
| `task_execution` | `TaskExecution` | Updated execution after recovery (typically `FAILED`) |
| `strategy_type` | `NotBlankStr` | Strategy identifier |
| `context_snapshot` | `AgentContextSnapshot` | Redacted snapshot (turn count, accumulated cost, message count, max turns — no message contents) |
| `error_message` | `NotBlankStr` | Error that triggered recovery |
| `can_reassign` | `bool` (computed) | `retry_count < task.max_retries` |
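
As a rough illustration of the two tables above, the protocol and result model might be shaped like this. The sketch uses `typing.Protocol` and frozen dataclasses instead of Pydantic, omits the `context` argument and the `context_snapshot` field for brevity, and assumes `retry_count` lives on `TaskExecution`. The class `FailAndReassign` and the identifier string `"fail_reassign"` are hypothetical.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Protocol


@dataclass(frozen=True)
class Task:
    max_retries: int = 1


@dataclass(frozen=True)
class TaskExecution:
    task: Task
    status: str = "in_progress"
    retry_count: int = 0  # assumed location of the retry counter


@dataclass(frozen=True)
class RecoveryResult:
    task_execution: TaskExecution
    strategy_type: str
    error_message: str

    @property
    def can_reassign(self) -> bool:
        # Computed, not stored: retry_count < task.max_retries.
        return self.task_execution.retry_count < self.task_execution.task.max_retries


class RecoveryStrategy(Protocol):
    async def recover(
        self, *, task_execution: TaskExecution, error_message: str
    ) -> RecoveryResult: ...

    def get_strategy_type(self) -> str: ...


class FailAndReassign:
    """Hypothetical strategy that satisfies the protocol structurally."""

    async def recover(
        self, *, task_execution: TaskExecution, error_message: str
    ) -> RecoveryResult:
        failed = replace(task_execution, status="failed")
        return RecoveryResult(failed, self.get_strategy_type(), error_message)

    def get_strategy_type(self) -> str:
        return "fail_reassign"  # placeholder identifier


strategy: RecoveryStrategy = FailAndReassign()
result = asyncio.run(
    strategy.recover(task_execution=TaskExecution(Task(max_retries=1)), error_message="boom")
)
```

With `max_retries=1` and `retry_count=0`, `result.can_reassign` is true; a second failure at `retry_count=1` would report false.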

#### Strategy 1: Fail-and-Reassign (Default / MVP)

The engine catches the failure at its outermost boundary, logs a redacted `AgentContext` snapshot (turn count, accumulated cost — excluding message contents to avoid leaking sensitive prompts/tool outputs), transitions the task to `FAILED`, and makes it available for reassignment (manual or automatic via the task router).

> **New non-terminal state:** `FAILED` is a new `TaskStatus` variant to be added alongside `CANCELLED`. The §6.1 lifecycle diagram and `TaskStatus` enum will be updated when crash recovery is implemented in M3. `FAILED` differs from `CANCELLED` (which is terminal) in that failed tasks are eligible for automatic reassignment.
> **Non-terminal state (implemented in M3):** `FAILED` is a `TaskStatus` variant alongside `CANCELLED`. `FAILED` differs from `CANCELLED` (which is terminal) in that failed tasks are eligible for automatic reassignment. Valid transitions: `IN_PROGRESS → FAILED`, `ASSIGNED → FAILED` (early setup failures), `FAILED → ASSIGNED` (reassignment). See the updated §6.1 lifecycle diagram.

```yaml
crash_recovery:
@@ -967,10 +991,12 @@ crash_recovery:
- All progress is lost on crash — acceptable for short single-agent tasks in the MVP

On crash:
1. Catch exception at the engine boundary (outermost `try/except` in the execution loop)
2. Log at ERROR with redacted `AgentContext` snapshot (turn count, accumulated cost, tool call history — message contents excluded)
1. Catch exception at the `AgentEngine` boundary (outermost `try/except` in `AgentEngine.run()`)
2. Log at ERROR with redacted `AgentContextSnapshot` (turn count, accumulated cost, message count, max turns — message contents excluded)
3. Transition `TaskExecution` → `FAILED` with the exception as the failure reason
4. Task becomes available for reassignment via the task router
4. `RecoveryResult.can_reassign` reports whether `retry_count < max_retries`

> **M3 limitation:** The `can_reassign` flag is computed and returned in `RecoveryResult`, but automated reassignment is not yet implemented — the task router (§6.4) will consume this in a later milestone. The caller (task router) is responsible for incrementing `retry_count` when creating the next `TaskExecution`.
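
The crash-handling steps and the caller-side retry bookkeeping might fit together as in the following sketch. It is an assumption-laden outline, not `AgentEngine.run()` itself: `Execution`, `run_guarded`, and `next_attempt` are hypothetical names, and the redacted snapshot is reduced to two counters.

```python
import logging
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple

logger = logging.getLogger("engine_sketch")


@dataclass(frozen=True)
class Execution:
    """Hypothetical, flattened stand-in for TaskExecution."""

    max_retries: int
    retry_count: int = 0
    status: str = "in_progress"
    failure_reason: Optional[str] = None


def run_guarded(
    execution: Execution,
    agent_fn: Callable[[], None],
    *,
    turn_count: int = 0,
    accumulated_cost: float = 0.0,
) -> Tuple[Execution, bool]:
    """Run agent_fn inside the outermost try/except.

    Returns the updated execution and whether it may be reassigned.
    """
    try:
        agent_fn()
    except Exception as exc:
        # Log counters only; message contents are deliberately left out.
        logger.error(
            "agent crashed: turns=%d cost=%.4f error=%s",
            turn_count,
            accumulated_cost,
            type(exc).__name__,
        )
        failed = replace(execution, status="failed", failure_reason=str(exc))
        return failed, failed.retry_count < failed.max_retries
    return replace(execution, status="in_review"), False


def next_attempt(failed: Execution) -> Execution:
    """Caller-side step: a fresh execution with retry_count incremented."""
    return replace(
        failed, status="assigned", retry_count=failed.retry_count + 1, failure_reason=None
    )
```

Keeping the increment in `next_attempt` rather than in `run_guarded` mirrors the note above: the recovery path only reports `can_reassign`, and whoever creates the next execution owns the counter.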

#### Strategy 2: Checkpoint Recovery (Planned — M4/M5)

@@ -2272,6 +2298,8 @@ ai-company/
│ │ ├── loop_protocol.py # ExecutionLoop protocol + result models
│ │ ├── metrics.py # TaskCompletionMetrics proxy overhead model
│ │ ├── react_loop.py # ReAct loop implementation
│ │ ├── recovery.py # Crash recovery strategies (RecoveryStrategy protocol)
│ │ ├── cost_recording.py # Per-turn cost recording helpers
│ │ ├── run_result.py # AgentRunResult outcome model
│ │ ├── agent_engine.py # Agent execution engine
Comment on lines +2301 to 2304
⚠️ Potential issue | 🟡 Minor

Update the run_result.py description to include RecoveryResult.

The new recovery.py entry is documented here, but run_result.py is still described only as AgentRunResult outcome model. With this PR adding RecoveryResult there as well, the project-structure map is now stale.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DESIGN_SPEC.md` around lines 2281 - 2283, Update the project-structure map
entry for run_result.py to reflect the added RecoveryResult type: change the
short description from "AgentRunResult outcome model" to something like
"AgentRunResult and RecoveryResult outcome models" so it documents both
AgentRunResult and RecoveryResult in run_result.py; ensure you reference
run_result.py and the symbols AgentRunResult and RecoveryResult in the updated
line.

│ │ ├── task_engine.py # Task routing & scheduling (M3-M4)
7 changes: 5 additions & 2 deletions src/ai_company/core/enums.py
@@ -127,11 +127,13 @@ class TaskStatus(StrEnum):
Summary for quick reference:

CREATED -> ASSIGNED
ASSIGNED -> IN_PROGRESS | BLOCKED | CANCELLED
IN_PROGRESS -> IN_REVIEW | BLOCKED | CANCELLED
ASSIGNED -> IN_PROGRESS | BLOCKED | CANCELLED | FAILED
IN_PROGRESS -> IN_REVIEW | BLOCKED | CANCELLED | FAILED
IN_REVIEW -> COMPLETED | IN_PROGRESS (rework) | BLOCKED | CANCELLED
BLOCKED -> ASSIGNED (unblocked)
FAILED -> ASSIGNED (reassignment for retry)
COMPLETED and CANCELLED are terminal states.
FAILED is non-terminal (can be reassigned).
"""

CREATED = "created"
@@ -140,6 +142,7 @@ class TaskStatus(StrEnum):
IN_REVIEW = "in_review"
COMPLETED = "completed"
BLOCKED = "blocked"
FAILED = "failed"
CANCELLED = "cancelled"


10 changes: 8 additions & 2 deletions src/ai_company/core/task.py
@@ -58,6 +58,7 @@ class Task(BaseModel):
estimated_complexity: Task complexity estimate.
budget_limit: Maximum USD spend for this task.
deadline: Optional deadline (ISO 8601 string or ``None``).
max_retries: Max reassignment attempts after failure (default 1).
status: Current lifecycle status.
"""

@@ -112,6 +113,11 @@ class Task(BaseModel):
default=None,
description="Optional deadline (ISO 8601 string)",
)
max_retries: int = Field(
default=1,
ge=0,
description="Max reassignment attempts after failure",
)
status: TaskStatus = Field(
default=TaskStatus.CREATED,
description="Current lifecycle status",
@@ -153,8 +159,8 @@ def _validate_assignment_consistency(self) -> Self:

``CREATED`` status must have ``assigned_to=None``. Statuses beyond
``CREATED`` (``ASSIGNED``, ``IN_PROGRESS``, ``IN_REVIEW``,
``COMPLETED``) require ``assigned_to`` to be set. ``BLOCKED``
and ``CANCELLED`` may or may not have an assignee.
``COMPLETED``) require ``assigned_to`` to be set. ``BLOCKED``,
``FAILED``, and ``CANCELLED`` may or may not have an assignee.
"""
requires_assignee = {
TaskStatus.ASSIGNED,
14 changes: 9 additions & 5 deletions src/ai_company/core/task_transitions.py
@@ -1,17 +1,18 @@
"""Task lifecycle state machine transitions.

Defines the valid state transitions for the task lifecycle, based on
DESIGN_SPEC Section 6.1 and extended with BLOCKED and CANCELLED
transitions from IN_PROGRESS and IN_REVIEW for completeness::
DESIGN_SPEC Sections 6.1 and 6.6, extended with BLOCKED, CANCELLED, and
FAILED transitions for completeness::

CREATED -> ASSIGNED
ASSIGNED -> IN_PROGRESS | BLOCKED | CANCELLED
IN_PROGRESS -> IN_REVIEW | BLOCKED | CANCELLED
ASSIGNED -> IN_PROGRESS | BLOCKED | CANCELLED | FAILED
IN_PROGRESS -> IN_REVIEW | BLOCKED | CANCELLED | FAILED
IN_REVIEW -> COMPLETED | IN_PROGRESS (rework) | BLOCKED | CANCELLED
BLOCKED -> ASSIGNED (unblocked)
FAILED -> ASSIGNED (reassignment for retry)

COMPLETED and CANCELLED are terminal states with no outgoing
transitions.
transitions. FAILED is non-terminal (can be reassigned).
"""

from ai_company.core.enums import TaskStatus
@@ -30,13 +31,15 @@
TaskStatus.IN_PROGRESS,
TaskStatus.BLOCKED,
TaskStatus.CANCELLED,
TaskStatus.FAILED,
}
),
TaskStatus.IN_PROGRESS: frozenset(
{
TaskStatus.IN_REVIEW,
TaskStatus.BLOCKED,
TaskStatus.CANCELLED,
TaskStatus.FAILED,
}
),
TaskStatus.IN_REVIEW: frozenset(
@@ -48,6 +51,7 @@
}
),
TaskStatus.BLOCKED: frozenset({TaskStatus.ASSIGNED}),
TaskStatus.FAILED: frozenset({TaskStatus.ASSIGNED}), # reassignment
TaskStatus.COMPLETED: frozenset(), # terminal
TaskStatus.CANCELLED: frozenset(), # terminal
}
8 changes: 8 additions & 0 deletions src/ai_company/engine/__init__.py
@@ -34,6 +34,11 @@
build_system_prompt,
)
from ai_company.engine.react_loop import ReactLoop
from ai_company.engine.recovery import (
FailAndReassignStrategy,
RecoveryResult,
RecoveryStrategy,
)
from ai_company.engine.run_result import AgentRunResult
from ai_company.engine.task_execution import StatusTransition, TaskExecution
from ai_company.providers.models import ZERO_TOKEN_USAGE, add_token_usage
@@ -52,11 +57,14 @@
"ExecutionLoop",
"ExecutionResult",
"ExecutionStateError",
"FailAndReassignStrategy",
"LoopExecutionError",
"MaxTurnsExceededError",
"PromptBuildError",
"PromptTokenEstimator",
"ReactLoop",
"RecoveryResult",
"RecoveryStrategy",
"StatusTransition",
"SystemPrompt",
"TaskCompletionMetrics",