81 changes: 80 additions & 1 deletion DESIGN_SPEC.md
@@ -971,7 +971,7 @@ Pipeline steps:

1. **Validate inputs** — agent must be `ACTIVE`, task must be `ASSIGNED` or `IN_PROGRESS`. Raises `ExecutionStateError` on violation.
2. **Pre-flight budget enforcement** — if `BudgetEnforcer` is provided, check monthly hard stop and daily limit via `check_can_execute()`, then apply auto-downgrade via `resolve_model()`. Raises `BudgetExhaustedError` or `DailyLimitExceededError` on violation.
3. **Build system prompt** — calls `build_system_prompt()` with agent identity, task, and available tool definitions. Follows the **non-inferable-only principle**: system prompts include only information the agent cannot discover by reading the codebase or environment (role constraints, custom conventions, organizational policies). Generic architecture overviews and file structure descriptions are excluded — [research](https://arxiv.org/abs/2602.11988) shows they reduce success rates while increasing costs 20%+.
3. **Build system prompt** — calls `build_system_prompt()` with agent identity, task, and available tool definitions. Follows the **non-inferable-only principle**: system prompts include only information the agent cannot discover by reading the codebase or environment (role constraints, custom conventions, organizational policies). Generic architecture overviews and file structure descriptions are excluded — [research](https://arxiv.org/abs/2602.11988) shows they reduce success rates while increasing costs 20%+. **Decision ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D22):** Do NOT list available tools in the system prompt — the API's `tools` parameter already injects richer tool definitions including JSON schemas. The system prompt listing is strictly inferior (no schemas) and wastes 200-400+ tokens per call. Behavioral guidance ("when to use tool X vs Y") may be added later as non-redundant value.
4. **Create context** — `AgentContext.from_identity()` with the configured `max_turns`.
5. **Seed conversation** — injects system prompt, optional memory messages, and formatted task instruction as initial messages.
6. **Transition task** — `ASSIGNED` → `IN_PROGRESS` (pass-through if already `IN_PROGRESS`).
@@ -1600,6 +1600,8 @@ injected between system prompt and task instruction. Agent passively
receives memories.

> **Non-inferable filter:** Retrieved memories should be filtered before injection to exclude content the agent can discover by reading the codebase or environment. Only inject memories containing non-inferable information: prior decisions, learned conventions, interpersonal context, historical outcomes. [Research](https://arxiv.org/abs/2602.11988) shows generic context increases cost 20%+ with minimal success improvement; LLM-generated context can actually reduce success rates.
>
> **Decision ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D23):** Pluggable `MemoryFilterStrategy` protocol. Initial: tag-based at write time. Define `non-inferable` tag convention enforced at `MemoryBackend.store()` boundary. System prompt instructs agents what qualifies: design rationale, team decisions, "why not X", cross-repo knowledge = non-inferable; code structure, API signatures, file contents = inferable. Uses existing `MemoryMetadata.tags` and `MemoryQuery.tags` — zero new models needed. Future strategies: LLM classification at retrieval, keyword/pattern heuristic.
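
As a rough illustration of the tag-based strategy in D23, the sketch below filters retrieved memories by a `non-inferable` tag. `MemoryEntry` and `TagBasedMemoryFilter` are hypothetical stand-ins invented for this example; the spec only names the `MemoryFilterStrategy` protocol and the existing `tags` fields, so the method name and signature are assumptions.

```python
from dataclasses import dataclass, field
from typing import Protocol

NON_INFERABLE_TAG = "non-inferable"  # tag convention named in D23


@dataclass(frozen=True)
class MemoryEntry:
    """Hypothetical stand-in for a stored memory carrying MemoryMetadata.tags."""
    content: str
    tags: frozenset[str] = field(default_factory=frozenset)


class MemoryFilterStrategy(Protocol):
    # Assumed method name and signature; the spec does not define them.
    def filter(self, memories: list[MemoryEntry]) -> list[MemoryEntry]: ...


class TagBasedMemoryFilter:
    """Keeps only memories that were tagged as non-inferable at write time."""

    def __init__(self, required_tag: str = NON_INFERABLE_TAG) -> None:
        self._required_tag = required_tag

    def filter(self, memories: list[MemoryEntry]) -> list[MemoryEntry]:
        return [m for m in memories if self._required_tag in m.tags]


memories = [
    MemoryEntry("We rejected Redis because of ops overhead", frozenset({NON_INFERABLE_TAG})),
    MemoryEntry("utils.py defines parse_config()", frozenset({"code-structure"})),
]
kept = TagBasedMemoryFilter().filter(memories)
```

Because the filter only reads tags, an LLM-classification or heuristic strategy could later replace it behind the same protocol without touching callers.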

Pipeline: `MemoryBackend.retrieve()` -> rank by relevance+recency ->
filter by min_relevance -> greedy token-budget packing -> format as
@@ -1660,13 +1662,24 @@ The HR system manages the agent workforce dynamically:
4. Approved candidates are instantiated and onboarded
5. Onboarding includes: company context, project briefing, team introductions.

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D8):**
>
> - **D8.1 — Source:** Templates + LLM customization. Templates for common roles (reuses existing template system §14.1). LLM generates config for novel roles not covered by templates. Approval gate catches invalid/bad configs before instantiation.
> - **D8.2 — Persistence:** Operational store via `PersistenceBackend` (§7.6). YAML stays as bootstrap seed — operational store wins for runtime state. Enables rehiring, auditable history.
> - **D8.3 — Hot-plug:** Agents are hot-pluggable at runtime via `AgentEngine.add_agent()`/`remove_agent()`. Thread-safe registry, wired into message bus + tools + budget.
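
One possible shape for the thread-safe registry behind D8.3 is sketched below. `AgentRegistry` is a hypothetical class invented for this example; the spec only states that `AgentEngine.add_agent()`/`remove_agent()` exist and that the registry is thread-safe. Wiring into the message bus, tools, and budget is omitted.

```python
import threading
from typing import Any


class AgentRegistry:
    """Hypothetical thread-safe registry; not the project's actual class."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._agents: dict[str, Any] = {}

    def add_agent(self, agent_id: str, agent: Any) -> None:
        with self._lock:
            if agent_id in self._agents:
                raise ValueError(f"agent {agent_id!r} is already registered")
            self._agents[agent_id] = agent

    def remove_agent(self, agent_id: str) -> Any:
        with self._lock:
            # Raises KeyError if the agent was never registered.
            return self._agents.pop(agent_id)

    def agent_ids(self) -> list[str]:
        with self._lock:
            return sorted(self._agents)


# Register agents from several threads at once, then remove one.
registry = AgentRegistry()
workers = [
    threading.Thread(target=registry.add_agent, args=(f"agent-{i}", object()))
    for i in range(20)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
registry.remove_agent("agent-0")
```

A single lock keeps the sketch simple; whether the real engine needs finer-grained locking depends on how often agents are added or removed at runtime.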

### 8.2 Firing / Offboarding

1. Triggered by: budget cuts, poor performance metrics, project completion, human decision
2. Agent's memory is archived (not deleted)
3. Active tasks are reassigned
4. Team is notified

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D9, D10):**
>
> - **D9 — Task Reassignment:** Pluggable `TaskReassignmentStrategy` protocol. Initial: queue-return — tasks return to unassigned queue, existing `TaskRoutingService` (§6.4) re-routes with priority boost for reassigned tasks. Future strategies: same-department/lowest-load, manager-decides (LLM), HR agent decides.
> - **D10 — Memory Archival:** Pluggable `MemoryArchivalStrategy` protocol. Initial: full snapshot, read-only. Pipeline: retrieve all → archive to `ArchivalStore` → selectively promote semantic+procedural to `OrgMemoryBackend` (rule-based) → clean hot store → mark TERMINATED. Rehiring = restore archived memories into new `AgentIdentity`. Future strategies: selective discard, full-accessible.
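
The queue-return behaviour in D9 could look roughly like the sketch below. The `Task` fields, the `reassign` signature, and the size of the priority boost are assumptions made for illustration, as is the convention that a larger number means higher priority.

```python
from dataclasses import dataclass, replace
from typing import Optional, Protocol


@dataclass(frozen=True)
class Task:
    """Hypothetical minimal task model for this example."""
    task_id: str
    assignee: Optional[str]
    priority: int  # assumption: larger value = more urgent


class TaskReassignmentStrategy(Protocol):
    # Assumed signature; the spec only names the protocol.
    def reassign(self, tasks: list[Task], departing_agent: str) -> list[Task]: ...


class QueueReturnStrategy:
    """Returns a departing agent's tasks to the unassigned queue with a boost."""

    def __init__(self, priority_boost: int = 1) -> None:
        self._priority_boost = priority_boost

    def reassign(self, tasks: list[Task], departing_agent: str) -> list[Task]:
        return [
            replace(t, assignee=None, priority=t.priority + self._priority_boost)
            if t.assignee == departing_agent
            else t
            for t in tasks
        ]


tasks = [Task("T-1", "alice", 2), Task("T-2", "bob", 5)]
after = QueueReturnStrategy().reassign(tasks, departing_agent="alice")
```

Tasks come back with `assignee=None`, so the existing routing service would pick them up like any other unassigned work.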

### 8.3 Performance Tracking

```yaml
@@ -1680,6 +1693,13 @@ agent_metrics:
last_review_date: "2026-02-20"
```

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D2, D3, D11, D12):**
>
> - **D2 — Quality Scoring:** Pluggable `QualityScoringStrategy` protocol. Initial: layered combination — (1) FREE: objective CI signals (test pass/fail, lint, coverage delta), (2) ~$1/day: small-model LLM judge (different family than agent) evaluates output vs acceptance criteria, (3) on-demand: human override via API, highest weight. Start with Layer 1 only; add layers incrementally. Future strategies: CI-only, LLM-only, human-only.
> - **D3 — Collaboration Scoring:** Pluggable `CollaborationScoringStrategy` protocol. Initial: automated behavioral telemetry — `collaboration_score = weighted_average(delegation_success_rate, delegation_response_latency, conflict_resolution_constructiveness, meeting_contribution_rate, loop_prevention_score, handoff_completeness)`. Weights configurable per-role. Optional: periodic LLM sampling (1%) for calibration + human override via API. Future strategies: LLM evaluation, peer ratings, human-provided.
> - **D11 — Rolling Windows:** Pluggable `MetricsWindowStrategy` protocol. Initial: multiple simultaneous windows — 7d (acute regressions), 30d (sustained patterns), 90d (baseline/drift). Min 5 data points per window; below that, report "insufficient data." Future strategies: fixed single window, per-metric configurable.
> - **D12 — Trend Detection:** Pluggable `TrendDetectionStrategy` protocol. Initial: Theil-Sen regression slope per window + configurable thresholds classify as improving/stable/declining. Theil-Sen has a breakdown point of about 29.3% (the slope estimate stays bounded with up to roughly 29% outliers, i.e. a little under 1 in 3 bad data points). Min 5 data points. Future strategies: period-over-period, OLS regression, threshold-only.
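
To make D11 and D12 concrete, here is a minimal sketch of a Theil-Sen slope (the median of all pairwise slopes) with the 5-point minimum and a threshold classifier. It assumes evenly spaced observations, so the list index serves as the time axis, and the `0.01` threshold is a made-up default rather than a value from the spec.

```python
from itertools import combinations
from statistics import median
from typing import Optional

MIN_POINTS = 5  # minimum data points per window, from D11/D12


def theil_sen_slope(values: list[float]) -> Optional[float]:
    """Median of pairwise slopes; None when the window has too little data."""
    if len(values) < MIN_POINTS:
        return None
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i, j in combinations(range(len(values)), 2)
    ]
    return median(slopes)


def classify_trend(values: list[float], threshold: float = 0.01) -> str:
    """Maps the slope to improving/stable/declining (threshold is an assumption)."""
    slope = theil_sen_slope(values)
    if slope is None:
        return "insufficient data"
    if slope > threshold:
        return "improving"
    if slope < -threshold:
        return "declining"
    return "stable"


# One corrupted score (0.0) does not flip the trend of a steadily rising series.
scores = [0.5, 0.6, 0.7, 0.0, 0.9, 1.0, 1.1]
trend = classify_trend(scores)
```

Running the same function over the 7d, 30d, and 90d slices of a metric would give one classification per window.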

### 8.4 Promotions & Demotions

Agents can move between seniority levels based on performance:
@@ -1688,6 +1708,12 @@ Agents can move between seniority levels based on performance:
- Promotions can unlock higher tool access levels (see Progressive Trust)
- Model upgrades/downgrades may accompany level changes (configurable)

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D13, D14, D15):**
>
> - **D13 — Promotion Criteria:** Pluggable `PromotionCriteriaStrategy` protocol. Initial: configurable threshold gates. `ThresholdEvaluator` with `min_criteria_met: int` (N of M) + `required_criteria: list[str]`. Setting min=total gives AND; min=1 gives OR. Default: junior→mid = 2 of 3 criteria, mid→senior = all. Future strategies: pure AND, pure OR.
> - **D14 — Promotion Approval:** Pluggable `PromotionApprovalStrategy` protocol. Initial: senior+ requires human approval. Junior→mid auto-promotes (low cost impact: small→medium ~4x). Demotions: auto-apply for cost-saving (model downgrade), human approval for authority-reducing demotions. Future strategies: all-human, configurable-per-level.
> - **D15 — Model Mapping:** Pluggable `ModelMappingStrategy` protocol. Initial: default ON — `hr.promotions.model_follows_seniority: true`. Model changes at task boundaries only (never mid-execution, consistent with auto-downgrade §10.4). Per-agent `preferred_model` overrides seniority default. Smart routing (§9.4) still uses cheap models for simple tasks regardless of seniority. Future strategies: always-applied, opt-in-only.
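
The N-of-M gate in D13 can be sketched as follows. The field names `min_criteria_met` and `required_criteria` come from the decision text; the `evaluate` method and the criterion names used in the example are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ThresholdEvaluator:
    """N-of-M promotion gate with optional must-pass criteria."""
    min_criteria_met: int
    required_criteria: list[str] = field(default_factory=list)

    def evaluate(self, results: dict[str, bool]) -> bool:
        # Every required criterion must pass, regardless of the overall count.
        if any(not results.get(name, False) for name in self.required_criteria):
            return False
        return sum(results.values()) >= self.min_criteria_met


# Hypothetical criterion names, for illustration only.
results = {"quality": True, "throughput": True, "collaboration": False}

junior_to_mid = ThresholdEvaluator(min_criteria_met=2)   # 2 of 3
mid_to_senior = ThresholdEvaluator(min_criteria_met=3)   # all 3 (AND)
any_one = ThresholdEvaluator(min_criteria_met=1)         # OR
```

With these results the junior-to-mid gate passes and the mid-to-senior gate does not, which matches the default thresholds described above.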

---

## 9. Model Provider Layer
@@ -2063,6 +2089,8 @@ The `ToolPermissionChecker` resolves permissions using a priority-based system:
Tool execution requires safety boundaries proportional to the risk of each tool category. The framework uses a **layered sandboxing strategy** with a pluggable `SandboxBackend` protocol — new backends can be added without modifying existing ones. The default configuration uses lighter isolation for low-risk tools and stronger isolation for high-risk tools.

> **MVP: Subprocess sandbox for file/git tools. Docker optional for code execution.** K8s is future.
>
> **Decision ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D16):** Docker MVP only via `aiodocker` (async-native, Python 3.14 support). Pre-built image (Python 3.14 + Node.js LTS + basic utils, <500MB) + user-configurable via `docker.image` config. **Fail with clear error** if Docker unavailable — no unsafe subprocess fallback for code execution (file/git tools already use `SubprocessSandbox`). gVisor (`--runtime=runsc`) as free config-level hardening upgrade. Evaluate WASM/Firecracker post-M7. `SandboxBackend` protocol makes adding backends trivial.

#### Sandbox Backends

@@ -2115,6 +2143,39 @@ sandboxing:

> **Scaling path:** In a future Kubernetes deployment (§18.2 Phase 3-4), each agent can run in its own pod via `K8sSandbox`. At that point, the layered configuration becomes less relevant — all tools execute within the agent's isolated pod. The `SandboxBackend` protocol makes this transition seamless.

### 11.1.3 MCP Integration

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D17, D18):**
>
> - **D17 — MCP SDK:** Official `mcp` Python SDK, pinned `>=1.25,<2`. Thin `MCPBridgeTool` adapter layer isolates the rest of the codebase from SDK API changes. Support **stdio** (local/dev) and **Streamable HTTP** (remote/production) transports. Skip deprecated SSE. v2 migration planned — pin range prevents accidental breaking upgrade.
> - **D18 — MCP Result Mapping:** Adapter in `MCPBridgeTool` keeps `ToolResult` as-is. Mapping: text blocks → concatenate to `content: str`; image/audio → `[image: {mimeType}]` placeholder + base64 in `metadata["attachments"]`; `structuredContent` → `metadata["structured_content"]`; `isError` → `is_error` (1:1). Future: extend `ToolResult` with optional `attachments` when multi-modal LLM tool results are needed.
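
The D18 mapping might be implemented along these lines. To avoid guessing at SDK classes, the sketch operates on a plain dict shaped like an MCP tool-call result (`content`, `structuredContent`, `isError`); the `ToolResult` dataclass here is a simplified stand-in, and joining text blocks with newlines and using an `[audio: ...]` placeholder for audio are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolResult:
    """Simplified stand-in for the framework's ToolResult."""
    content: str
    is_error: bool = False
    metadata: dict[str, Any] = field(default_factory=dict)


def map_mcp_result(raw: dict[str, Any]) -> ToolResult:
    """Maps a dict-shaped MCP tool result onto ToolResult per D18."""
    parts: list[str] = []
    attachments: list[dict[str, str]] = []
    for block in raw.get("content", []):
        kind = block.get("type")
        if kind == "text":
            parts.append(block["text"])
        elif kind in ("image", "audio"):
            # Placeholder in the text, base64 payload kept in metadata.
            parts.append(f"[{kind}: {block['mimeType']}]")
            attachments.append(
                {"type": kind, "mimeType": block["mimeType"], "data": block["data"]}
            )
    metadata: dict[str, Any] = {}
    if attachments:
        metadata["attachments"] = attachments
    if raw.get("structuredContent") is not None:
        metadata["structured_content"] = raw["structuredContent"]
    return ToolResult(
        content="\n".join(parts),
        is_error=bool(raw.get("isError", False)),
        metadata=metadata,
    )


mapped = map_mcp_result({
    "content": [
        {"type": "text", "text": "Chart rendered."},
        {"type": "image", "mimeType": "image/png", "data": "aGVsbG8="},
    ],
    "structuredContent": {"points": 3},
    "isError": False,
})
```

Keeping the base64 payload out of `content` means the text sent back to the LLM stays small while the attachment is still available to callers.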

### 11.1.4 Action Type System

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D1):**
>
> Action types classify agent actions for use by autonomy presets (§12.2), SecOps validation (§12.3), tiered timeout policies (§12.4), and progressive trust (§11.3). Three sub-decisions:
>
> - **D1.1 — Registry:** `StrEnum` for ~20 built-in action types (type safety, autocomplete, typos caught by static type checkers instead of surfacing at runtime) + `ActionTypeRegistry` for custom types via explicit registration. Unknown strings rejected at config load time. Critical for security: a typo in the `human_approval` list would otherwise silently mean "skip approval."

Copilot AI Mar 9, 2026

Line 2159 states "~20 built-in action types" but the proposed taxonomy listed just below (and in the ADR at lines 46–57) contains 24 distinct leaf types. This count should be updated to "~25" (consistent with the taxonomy header at line 2162) or the actual number of 24.

> - **D1.2 — Granularity:** Two-level `category:action` hierarchy. Category shortcuts: `auto_approve: ["code"]` expands to all `code:*` actions. Fine-grained: `human_approval: ["code:create"]`.
>
> **Proposed taxonomy (~25 leaf types):**
>
> ```text
> code:read, code:write, code:create, code:delete, code:refactor
> test:write, test:run
> docs:write
> vcs:commit, vcs:push, vcs:branch
> deploy:staging, deploy:production
> comms:internal, comms:external
> budget:spend, budget:exceed
> org:hire, org:fire, org:promote
> db:query, db:mutate, db:admin
> arch:decide
> ```
>
> - **D1.3 — Classification:** Static tool metadata. Each `BaseTool` declares its `action_type`. Default mapping from `ToolCategory` → action type. Non-tool actions (`org:hire`, `budget:spend`) triggered by engine-level operations. No LLM in the security classification path.

### 11.2 Tool Access Levels

```yaml
@@ -2331,6 +2392,13 @@ autonomy:
security_agent: true # still runs for audit logging, but human is approval authority
```

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D6, D7):**
>
> - **D6 — Autonomy Scope:** Three-level resolution chain: per-agent → per-department → company default. Optional `autonomy_level` on `AgentIdentity` and department config. Resolution: `agent.autonomy_level or department.autonomy_level or company.autonomy.level`. Seniority validation: Juniors/Interns cannot be set to `full`.
> - **D7 — Autonomy Changes at Runtime:** Pluggable `AutonomyChangeStrategy` protocol. Initial: human-only promotion via REST API. No agent (including CEO) can escalate privileges. Future strategies: human-only + auto-downgrade (on high error rate → one level down, budget exhausted → supervised, security incident → locked; recovery from auto-downgrade: human-only). Precedent: no real-world security system automatically grants higher privileges.

Copilot AI Mar 9, 2026

The D7 annotation here lists the auto-downgrade rules (on high error rate → one level down, budget exhausted → supervised, security incident → locked) as "Future strategies", but the ADR-002 D7 decision body (lines 198–202) explicitly marks (a+c hybrid) (human-only promotion + automatic downgrade) as CHOSEN and describes the auto-downgrade triggers as part of the decided implementation. This inconsistency between the ADR body and the DESIGN_SPEC annotation needs to be resolved: if auto-downgrade is intended for the initial implementation, this annotation should list it as the initial impl; if it's deferred, the ADR decision body should be corrected to reflect that.

>
> **Note:** The `auto_approve` / `human_approval` lists in presets above should use the `category:action` taxonomy (§11.1.4) — e.g. `auto_approve: ["code", "test", "docs", "comms:internal"]` instead of informal strings.
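
The D6 resolution chain reduces to a short fallback expression. The sketch below uses plain strings for levels and seniorities; `resolve_autonomy` is a hypothetical helper, and applying the seniority check to the resolved level (rather than only to an explicit per-agent setting) is an assumption.

```python
from typing import Optional

# Seniorities that may not run at full autonomy, per D6.
RESTRICTED_SENIORITIES = frozenset({"junior", "intern"})


def resolve_autonomy(
    agent_level: Optional[str],
    department_level: Optional[str],
    company_level: str,
    seniority: str,
) -> str:
    """Resolves per-agent -> per-department -> company default."""
    level = agent_level or department_level or company_level
    if level == "full" and seniority in RESTRICTED_SENIORITIES:
        raise ValueError(f"{seniority} agents cannot be set to 'full' autonomy")
    return level


# No agent override, so the department setting wins over the company default.
level = resolve_autonomy(None, "supervised", "full", seniority="senior")
```

Raising an error keeps misconfiguration visible; silently clamping a junior agent to a lower level would be an alternative design, but it hides the mistake.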

### 12.3 Security Operations Agent

A special meta-agent that reviews all actions before execution:
@@ -2342,6 +2410,11 @@ A special meta-agent that reviews all actions before execution:
- Escalates uncertain cases to human queue with explanation
- **Cannot be overridden by other agents** (only human can override)

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D4, D5):**
>
> - **D4 — LLM vs Rule-based:** Hybrid approach. Rule engine for known patterns (credentials, path traversal, destructive ops) — sub-ms, covers ~95% of cases. LLM fallback only for uncertain cases (~5%). Full autonomy mode: rules + audit logging only, no LLM path. Hard safety rules (credential exposure, data destruction) **never bypass** regardless of autonomy level. Precedent: AWS GuardDuty, LlamaFirewall, NeMo Guardrails all use hybrid.
> - **D5 — Integration Point:** Pluggable `SecurityInterceptionStrategy` protocol. Initial: before every tool invocation — slots into existing `ToolInvoker` between permission check and tool execution. Policy strictness (not interception point) configurable per autonomy level. Add post-tool-call scanning for sensitive data in outputs. Performance: sub-ms rule check is invisible against seconds of LLM inference. Future strategies: batch-level (before task step), assignment-only.
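
The hybrid flow in D4 might be approximated as below: hard deny rules run first, a known-safe list short-circuits the common case, and anything else is escalated (to the LLM fallback or a human queue). The regexes, tool names, and `Verdict` enum are illustrative assumptions and are far too coarse for production use.

```python
import re
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"  # hand off to the LLM fallback / human queue


# Illustrative patterns only; a real rule set would be much more thorough.
DENY_RULES = [
    ("credential exposure", re.compile(r"(?i)(api[_-]?key|secret|password)\s*[=:]\s*\S+")),
    ("path traversal", re.compile(r"\.\./")),
    ("destructive operation", re.compile(r"\brm\s+-rf\b")),
]
KNOWN_SAFE_TOOLS = frozenset({"read_file", "list_directory"})  # hypothetical names


def evaluate(tool_name: str, arguments: str) -> tuple[Verdict, str]:
    """Rule check that runs before every tool invocation."""
    # Hard safety rules are checked first so nothing can bypass them.
    for reason, pattern in DENY_RULES:
        if pattern.search(arguments):
            return Verdict.DENY, reason
    if tool_name in KNOWN_SAFE_TOOLS:
        return Verdict.ALLOW, "known-safe tool"
    return Verdict.ESCALATE, "no rule matched"


verdict, reason = evaluate("run_shell", "rm -rf /var/data")
```

In full autonomy mode, the escalation branch would presumably be replaced by allow-plus-audit-log, since the decision text says that mode has no LLM path.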

### 12.4 Approval Timeout Policy

When an action requires human approval (per autonomy level in §12.2), the agent must wait. The framework provides configurable timeout policies that determine what happens when a human doesn't respond. All policies implement a `TimeoutPolicy` protocol. The policy is configurable per autonomy level and per action risk tier.
@@ -2422,6 +2495,12 @@ approval_timeout:

> **Task Suspension and Resumption:** The park/resume mechanism relies on `AgentContext` snapshots (frozen Pydantic models). When a task is parked, the full context is persisted. When approval arrives, the framework loads the snapshot, restores the agent's conversation and state, and resumes execution from the exact point of suspension. This works naturally with the `model_copy(update=...)` immutability pattern — the snapshot is a complete, self-contained state.

> **Decisions ([ADR-002](docs/decisions/ADR-002-design-decisions-batch-1.md) D19, D20, D21):**
>
> - **D19 — Risk Tier Classification:** Pluggable `RiskTierClassifier` protocol. Initial: configurable YAML mapping — `RiskTierMapping` config model with `dict[str, ApprovalRiskLevel]`. Sensible defaults matching examples above (e.g. `code:write` → low, `deploy:production` → critical). Unknown action types default to HIGH (fail-safe). Hot-reloadable. Leaves door open for SecOps override in M7. Future strategies: SecOps-assigned, fixed-per-type.
> - **D20 — Context Serialization:** Pydantic JSON via persistence backend. `ParkedContext` model with metadata columns (`execution_id`, `agent_id`, `task_id`, `parked_at`) + `context_json` blob. `ParkedContextRepository` protocol via existing `PersistenceBackend` (§7.6). Conversation stored **verbatim** — summarization is a context window management concern at resume time, not a persistence concern.
> - **D21 — Resume Injection:** Tool result injection. Approval requests modeled as tool calls (`request_human_approval`). Approval decision returned as `ToolResult` — semantically correct (approval IS the tool's return value). LLM conversation protocol requires a tool result after a tool call. Fallback: system message injection for engine-initiated parking (exception path).
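
The fail-safe default in D19 is easy to show in isolation. In this sketch the `ApprovalRiskLevel` members are limited to the tiers named in the text, and `MappingRiskTierClassifier` is a hypothetical implementation; loading the mapping from YAML and hot-reloading it are left out.

```python
from enum import Enum


class ApprovalRiskLevel(Enum):
    """Only the tiers mentioned in the decision text are modelled here."""
    LOW = "low"
    HIGH = "high"
    CRITICAL = "critical"


class MappingRiskTierClassifier:
    """Hypothetical mapping-backed classifier with a fail-safe default."""

    def __init__(self, mapping: dict[str, ApprovalRiskLevel]) -> None:
        self._mapping = dict(mapping)

    def classify(self, action_type: str) -> ApprovalRiskLevel:
        # Unknown action types are treated as HIGH rather than waved through.
        return self._mapping.get(action_type, ApprovalRiskLevel.HIGH)


classifier = MappingRiskTierClassifier({
    "code:write": ApprovalRiskLevel.LOW,
    "deploy:production": ApprovalRiskLevel.CRITICAL,
})
tier = classifier.classify("db:admin")  # not in the mapping
```

Copying the mapping in the constructor means a later reload can swap in a new classifier instance without mutating one that is in use.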

---

## 13. Human Interaction Layer