29 changes: 23 additions & 6 deletions DESIGN_SPEC.md
@@ -165,10 +165,21 @@ agent:
hiring_date: "2026-02-27"
status: "active" # active, on_leave, terminated (on config model today)

# --- Planned (M3): Runtime state — AgentRuntimeState (mutable-via-copy) ---
# current_task_id: "task-456"
# turn_count: 12
# accumulated_cost_usd: 1.23
# --- Adopted (M3): Runtime state — engine/ (frozen + model_copy) ---
# TaskExecution wraps Task with evolving execution state:
# status: TaskStatus # evolves via with_transition()
# transition_log: tuple[StatusTransition, ...]
# accumulated_cost: TokenUsage # running totals
# turn_count: int # LLM turns completed
# started_at / completed_at: AwareDatetime | None
#
# AgentContext wraps AgentIdentity + TaskExecution with:
# execution_id: str # uuid4, unique per run
# conversation: tuple[ChatMessage, ...]
# accumulated_cost: TokenUsage # running totals
# turn_count: int # LLM turns completed
# max_turns: int # hard limit (default 20)
# started_at: AwareDatetime
```

### 3.2 Seniority & Authority Levels
@@ -535,6 +546,8 @@ When a loop is detected, the framework:
└────────────┘
```

> **Runtime wrapper (M3):** During execution, `Task` is wrapped by `TaskExecution` (in `engine/task_execution.py`). `TaskExecution` is a frozen Pydantic model that tracks status transitions via `model_copy(update=...)`, accumulates `TokenUsage` cost, and records a `StatusTransition` audit trail. The original `Task` is preserved unchanged; `to_task_snapshot()` produces a `Task` copy with the current execution status for persistence.
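
The wrapper pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: frozen dataclasses and `dataclasses.replace` stand in for frozen Pydantic models and `model_copy(update=...)`, the `TaskStatus` members, the `StatusTransition` field names, and the `from_task` constructor are hypothetical, and cost accumulation is left out. One difference worth keeping in mind: `dataclasses.replace` re-runs `__init__`, whereas `model_copy` does not re-run validators.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum


class TaskStatus(str, Enum):
    # Hypothetical subset of statuses, for illustration only.
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"


@dataclass(frozen=True)
class Task:
    id: str
    status: TaskStatus


@dataclass(frozen=True)
class StatusTransition:
    # Field names are assumptions; the spec only names the type.
    from_status: TaskStatus
    to_status: TaskStatus
    timestamp: datetime


@dataclass(frozen=True)
class TaskExecution:
    task: Task  # the original Task, never modified
    status: TaskStatus
    transition_log: tuple[StatusTransition, ...] = ()

    @classmethod
    def from_task(cls, task: Task) -> TaskExecution:
        # Hypothetical constructor: start from the task's own status.
        return cls(task=task, status=task.status)

    def with_transition(self, target: TaskStatus) -> TaskExecution:
        # Copy-on-write: build a new instance and append to the audit trail.
        entry = StatusTransition(self.status, target, datetime.now(timezone.utc))
        return replace(
            self,
            status=target,
            transition_log=(*self.transition_log, entry),
        )

    def to_task_snapshot(self) -> Task:
        # A copy of the Task carrying the current execution status.
        return replace(self.task, status=self.status)


task = Task(id="task-456", status=TaskStatus.ASSIGNED)
started = TaskExecution.from_task(task).with_transition(TaskStatus.IN_PROGRESS)
done = started.with_transition(TaskStatus.COMPLETED)
snapshot = done.to_task_snapshot()
```

Each call to `with_transition()` returns a new object, so `started` still reports `IN_PROGRESS` after `done` is created, and `task` keeps its original status throughout.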

### 6.2 Task Definition

```yaml
@@ -1288,8 +1301,12 @@ ai-company/
│ │ ├── artifact.py # Produced work items
│ │ ├── role.py # Role model
│ │ └── role_catalog.py # Role catalog
│ ├── engine/ # Core engines (M3+, stubs only)
│ ├── engine/ # Core engines (M3+)
│ │ ├── errors.py # Engine error hierarchy (M3)
│ │ ├── prompt.py # System prompt builder (M3)
│ │ ├── prompt_template.py # System prompt Jinja2 templates (M3)
│ │ ├── task_execution.py # TaskExecution + StatusTransition (M3)
│ │ ├── context.py # AgentContext + AgentContextSnapshot (M3)
│ │ ├── agent_engine.py # Agent execution loop (M3)
│ │ ├── task_engine.py # Task routing & scheduling (M3-M4)
│ │ ├── workflow_engine.py # Workflow orchestration (M4)
@@ -1420,7 +1437,7 @@ These conventions were established during the M0–M2 review cycle. **Adopted**
| Convention | Status | Decision | Rationale |
|------------|--------|----------|-----------|
| **Immutability strategy** | Adopted | `MappingProxyType` at construction for dict fields in registries and collections; `frozen=True` on all config/identity models | MappingProxyType is O(1) and prevents accidental mutation. Pydantic `frozen=True` is confirmed shallow (pydantic#7784). |
| **Config vs runtime split** | Planned (M3) | Frozen models for config/identity; `model_copy(update=...)` for runtime state transitions | Frozen models cannot represent evolving state without serialize/validate round-trips. Separate models keep config immutable while state is explicit. Currently only config layer exists (`AgentIdentity`). |
| **Config vs runtime split** | Adopted (M3) | Frozen models for config/identity; `model_copy(update=...)` for runtime state transitions | `TaskExecution` and `AgentContext` (in `engine/`) are frozen Pydantic models that use `model_copy(update=...)` for copy-on-write state transitions without re-running validators (per Pydantic `model_copy` semantics). Config layer (`AgentIdentity`, `Task`) remains unchanged. |
| **Derived fields** | Planned | `@computed_field` instead of stored + validated | Eliminates redundant storage and impossible-to-fail validators (e.g. `total_tokens = input + output`). Currently `total_tokens` uses stored `Field` + `@model_validator`. |
| **String validation** | Planned | `NotBlankStr` type from `core.types` for all identifiers | Eliminates per-model `@model_validator` boilerplate for whitespace checks. `NotBlankStr` is defined but models still use `Field(min_length=1)` + manual validators. |
| **Shared field groups** | Planned | Extract common field sets into base models (e.g. `_SpendingTotals`) | Prevents field duplication across spending summary models. Not yet implemented — each model independently defines fields. |
2 changes: 1 addition & 1 deletion README.md
@@ -23,7 +23,7 @@ AI Company lets you spin up a virtual organization staffed entirely by AI agents

## Status

**M2: Provider Layer** complete (M0 Tooling, M1 Config & Core, M2 Providers — all done). See [DESIGN_SPEC.md](DESIGN_SPEC.md) for the full high-level specification.
**M3: Single Agent** in progress (M0 Tooling, M1 Config & Core, M2 Providers — all done). See [DESIGN_SPEC.md](DESIGN_SPEC.md) for the full high-level specification.

## Tech Stack

26 changes: 24 additions & 2 deletions src/ai_company/engine/__init__.py
@@ -1,21 +1,43 @@
"""Agent execution engine.

Re-exports the public API for system prompt construction.
Re-exports the public API for system prompt construction,
runtime execution state, and engine errors.
"""

from ai_company.engine.errors import EngineError, PromptBuildError
from ai_company.engine.context import (
DEFAULT_MAX_TURNS,
AgentContext,
AgentContextSnapshot,
)
from ai_company.engine.errors import (
EngineError,
ExecutionStateError,
MaxTurnsExceededError,
PromptBuildError,
)
from ai_company.engine.prompt import (
DefaultTokenEstimator,
PromptTokenEstimator,
SystemPrompt,
build_system_prompt,
)
from ai_company.engine.task_execution import StatusTransition, TaskExecution
from ai_company.providers.models import ZERO_TOKEN_USAGE, add_token_usage

__all__ = [
"DEFAULT_MAX_TURNS",
"ZERO_TOKEN_USAGE",
"AgentContext",
"AgentContextSnapshot",
"DefaultTokenEstimator",
"EngineError",
"ExecutionStateError",
"MaxTurnsExceededError",
"PromptBuildError",
"PromptTokenEstimator",
"StatusTransition",
"SystemPrompt",
"TaskExecution",
"add_token_usage",
"build_system_prompt",
]
Comment on lines 27 to 43
Contributor

medium

For maintainability and readability, it is good practice to keep the `__all__` list sorted alphabetically, which makes exports easier to find as the list grows. The previous `__all__` list in this file was sorted, so the suggested change keeps to the existing project convention.

Suggested change
__all__ = [
"DEFAULT_MAX_TURNS",
"ZERO_TOKEN_USAGE",
"AgentContext",
"AgentContextSnapshot",
"DefaultTokenEstimator",
"EngineError",
"ExecutionStateError",
"MaxTurnsExceededError",
"PromptBuildError",
"PromptTokenEstimator",
"StatusTransition",
"SystemPrompt",
"TaskExecution",
"add_token_usage",
"build_system_prompt",
]
__all__ = [
"AgentContext",
"AgentContextSnapshot",
"DEFAULT_MAX_TURNS",
"DefaultTokenEstimator",
"EngineError",
"ExecutionStateError",
"MaxTurnsExceededError",
"PromptBuildError",
"PromptTokenEstimator",
"StatusTransition",
"SystemPrompt",
"TaskExecution",
"ZERO_TOKEN_USAGE",
"add_token_usage",
"build_system_prompt",
]
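
The two orderings in this suggestion can be compared directly. The sketch below is an illustration rather than part of the review: `casing_group` is a hypothetical helper, and `PREVIOUS_ALL` is inferred from the pre-change imports, not quoted from the old file. It checks that the suggested list is a plain code-point sort, while the list as submitted groups names by casing (constants, then classes, then functions) and sorts within each group, similar to the isort-style ordering some linters apply to `__all__`. Whether the project enforces such a rule is an assumption.

```python
# Names in the order the PR's __all__ lists them.
PR_ORDER = [
    "DEFAULT_MAX_TURNS",
    "ZERO_TOKEN_USAGE",
    "AgentContext",
    "AgentContextSnapshot",
    "DefaultTokenEstimator",
    "EngineError",
    "ExecutionStateError",
    "MaxTurnsExceededError",
    "PromptBuildError",
    "PromptTokenEstimator",
    "StatusTransition",
    "SystemPrompt",
    "TaskExecution",
    "add_token_usage",
    "build_system_prompt",
]

# Assumed contents of the pre-change __all__, inferred from the old imports.
PREVIOUS_ALL = [
    "DefaultTokenEstimator",
    "EngineError",
    "PromptBuildError",
    "PromptTokenEstimator",
    "SystemPrompt",
    "build_system_prompt",
]


def casing_group(name: str) -> int:
    """Hypothetical helper: 0 for SCREAMING_SNAKE_CASE, 1 for CamelCase, 2 otherwise."""
    if name.isupper():
        return 0
    if name[0].isupper():
        return 1
    return 2


def grouped_sort(names: list[str]) -> list[str]:
    """Sort by casing group first, then by code point within each group."""
    return sorted(names, key=lambda n: (casing_group(n), n))


# Plain code-point sort: uppercase letters order before lowercase ones.
codepoint_order = sorted(PR_ORDER)

submitted_is_grouped = grouped_sort(PR_ORDER) == PR_ORDER
previous_fits_both = (
    sorted(PREVIOUS_ALL) == PREVIOUS_ALL
    and grouped_sort(PREVIOUS_ALL) == PREVIOUS_ALL
)
```

Under these assumptions the previous six-name list satisfies both orderings, so it does not by itself show which convention the project follows; the lint configuration would settle that.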
