diff --git a/.agents/scripts/commands/full-loop.md b/.agents/scripts/commands/full-loop.md
index 5da11d61..39ec8f00 100644
--- a/.agents/scripts/commands/full-loop.md
+++ b/.agents/scripts/commands/full-loop.md
@@ -172,7 +172,7 @@ The AI will iterate on the task until outputting:
 
 When running as a headless worker (dispatched by the supervisor via `opencode run` or `claude -p`), the `--headless` flag is passed automatically. The full-loop-helper.sh script enforces these rules:
 
-1. **NEVER prompt for user input** - There is no human at the terminal. If you encounter ambiguity, make a reasonable decision and document it in a commit message. If truly blocked, exit cleanly so the supervisor can evaluate and retry.
+1. **NEVER prompt for user input** - There is no human at the terminal. Use the uncertainty decision framework (rule 7) to decide whether to proceed or exit.
 
 2. **Do NOT edit TODO.md** - Put notes in commit messages or PR body instead. See `workflows/plans.md` "Worker TODO.md Restriction".
 
@@ -184,6 +184,26 @@ When running as a headless worker (dispatched by the supervisor via `opencode ru
 
 6. **git pull --rebase before push** (t174) - The PR create phase automatically runs `git pull --rebase` to sync with any remote changes before pushing, avoiding push rejections.
 
+7. **Uncertainty decision framework** (t176) - When facing ambiguity, use this decision tree:
+
+   **PROCEED autonomously** (document decision in commit message):
+   - Multiple valid approaches exist but all achieve the goal — pick the simplest
+   - Style/naming choices are ambiguous — follow existing codebase conventions
+   - Task description is slightly vague but intent is clear from context
+   - Choosing between equivalent libraries/patterns — match project precedent
+   - Minor scope questions (e.g., fix adjacent issue?) — stay focused on assigned task
+
+   **EXIT cleanly** (include clear explanation in output):
+   - Task description contradicts what you find in the codebase
+   - Completing the task requires breaking changes to public APIs or shared interfaces
+   - The task is already done or obsolete
+   - Required dependencies, credentials, or services are missing and cannot be inferred
+   - The task requires architectural decisions that affect other tasks
+   - Unsure whether to create vs modify a file, and getting it wrong risks data loss
+
+   When proceeding, document the choice: `feat: add retry logic (chose exponential backoff — matches existing patterns)`
+   When exiting, be specific: `BLOCKED: Task says 'update auth endpoint' but 3 exist (JWT, OAuth, API key). Need clarification.`
+
 **README gate (MANDATORY - do NOT skip):**
 
 Before emitting `TASK_COMPLETE`, answer this decision tree:
diff --git a/.agents/scripts/supervisor-helper.sh b/.agents/scripts/supervisor-helper.sh
index a8d297b9..c7de0a12 100755
--- a/.agents/scripts/supervisor-helper.sh
+++ b/.agents/scripts/supervisor-helper.sh
@@ -2581,13 +2581,38 @@ build_dispatch_cmd() {
     fi
 
     # t173: Explicit worker restriction — prevents TODO.md race condition
+    # t176: Uncertainty decision framework for headless workers
     prompt="$prompt
 
 ## MANDATORY Worker Restrictions (t173)
 - Do NOT edit, commit, or push TODO.md — the supervisor owns all TODO.md updates.
 - Do NOT edit todo/PLANS.md or todo/tasks/* — these are supervisor-managed.
 - Report status via exit code, log output, and PR creation only.
-- Put task notes in commit messages or PR body, never in TODO.md."
+- Put task notes in commit messages or PR body, never in TODO.md.
+
+## Uncertainty Decision Framework (t176)
+You are a headless worker with no human at the terminal. Use this framework when uncertain:
+
+**PROCEED autonomously when:**
+- Multiple valid approaches exist but all achieve the goal (pick the simplest)
+- Style/naming choices are ambiguous (follow existing conventions in the codebase)
+- Task description is slightly vague but intent is clear from context
+- You need to choose between equivalent libraries/patterns (match project precedent)
+- Minor scope questions (e.g., should I also fix this adjacent issue?) — stay focused on the assigned task
+
+**FLAG uncertainty and exit cleanly when:**
+- The task description contradicts what you find in the codebase
+- Completing the task would require breaking changes to public APIs or shared interfaces
+- You discover the task is already done or obsolete
+- Required dependencies, credentials, or services are missing and cannot be inferred
+- The task requires decisions that would significantly affect architecture or other tasks
+- You are unsure whether a file should be created vs modified, and getting it wrong would cause data loss
+
+**When you proceed autonomously**, document your decision in the commit message:
+\`feat: add retry logic (chose exponential backoff over linear — matches existing patterns in src/utils/retry.ts)\`
+
+**When you exit due to uncertainty**, include a clear explanation in your final output:
+\`BLOCKED: Task says 'update the auth endpoint' but there are 3 auth endpoints (JWT, OAuth, API key). Need clarification on which one.\`"
 
     if [[ -n "$memory_context" ]]; then
         prompt="$prompt
diff --git a/.agents/tools/ai-assistants/headless-dispatch.md b/.agents/tools/ai-assistants/headless-dispatch.md
index 6b873733..5da09ea2 100644
--- a/.agents/tools/ai-assistants/headless-dispatch.md
+++ b/.agents/tools/ai-assistants/headless-dispatch.md
@@ -36,7 +36,7 @@ tools:
 **When NOT to use**:
 
 - Interactive development (use TUI directly)
-- Tasks requiring human-in-the-loop decisions mid-execution
+- Tasks requiring frequent human-in-the-loop decisions (see [Worker Uncertainty Framework](#worker-uncertainty-framework) for what workers can handle autonomously)
 - Single quick questions (just use `opencode run` without server overhead)
 
 **Draft agents for reusable context**: When parallel workers share domain-specific instructions, create a draft agent in `~/.aidevops/agents/draft/` instead of duplicating prompts. Subsequent dispatches can reference the draft. See `tools/build-agent/build-agent.md` "Agent Lifecycle Tiers" for details.
@@ -457,6 +457,76 @@ export OPENAI_API_KEY="sk-..."
 OPENCODE_PERMISSION='{"*":"allow"}' opencode run "Fix the failing tests"
 ```
 
+## Worker Uncertainty Framework
+
+Headless workers have no human to ask when they encounter ambiguity. This framework defines when workers should make autonomous decisions vs flag uncertainty and exit.
+
+### Decision Tree
+
+```text
+Encounter ambiguity
+├── Can I infer intent from context + codebase conventions?
+│   ├── YES → Proceed, document decision in commit message
+│   └── NO ↓
+├── Would getting this wrong cause irreversible damage?
+│   ├── YES → Exit cleanly with specific explanation
+│   └── NO ↓
+├── Does this affect only my task scope?
+│   ├── YES → Proceed with simplest valid approach
+│   └── NO → Exit (cross-task architectural decisions need human input)
+```
+
+### Proceed Autonomously
+
+Workers should make their own call and keep going when:
+
+| Situation | Action |
+|-----------|--------|
+| Multiple valid approaches, all achieve the goal | Pick the simplest |
+| Style/naming ambiguity | Follow existing codebase conventions |
+| Slightly vague task description, clear intent | Interpret reasonably, document in commit |
+| Choosing between equivalent patterns/libraries | Match project precedent |
+| Minor adjacent issue discovered | Stay focused on assigned task, note in PR body |
+| Unclear test coverage expectations | Match coverage level of neighboring files |
+
+**Always document**: Include the decision rationale in the commit message so the supervisor and reviewers understand why.
+
+```text
+feat: add retry logic (chose exponential backoff over linear — matches existing patterns in src/utils/retry.ts)
+```
+
+### Flag Uncertainty and Exit
+
+Workers should exit cleanly (allowing supervisor evaluation and retry) when:
+
+| Situation | Why exit |
+|-----------|----------|
+| Task contradicts codebase state | May be stale or misdirected |
+| Requires breaking public API changes | Cross-cutting impact needs human judgment |
+| Task appears already done or obsolete | Avoid duplicate/conflicting work |
+| Missing dependencies, credentials, or services | Cannot be inferred safely |
+| Architectural decisions affecting other tasks | Supervisor coordinates cross-task concerns |
+| Create vs modify ambiguity with data loss risk | Irreversible — needs confirmation |
+| Multiple interpretations with very different outcomes | Wrong guess wastes compute and creates cleanup work |
+
+**Always explain**: Include a specific, actionable description of the blocker so the supervisor can resolve it.
+
+```text
+BLOCKED: Task says 'update the auth endpoint' but there are 3 auth endpoints
+(JWT in src/auth/jwt.ts, OAuth in src/auth/oauth.ts, API key in src/auth/apikey.ts).
+Need clarification on which one(s) to update.
+```
+
+### Integration with Supervisor
+
+The supervisor uses worker exit behavior to drive the self-improvement loop:
+
+- **Worker proceeds + documents** → Supervisor reviews PR normally
+- **Worker exits with BLOCKED** → Supervisor reads explanation, either clarifies and retries, or creates a prerequisite task
+- **Worker exits with unclear error** → Supervisor dispatches a diagnostic worker (`-diag-N` suffix)
+
+This framework reduces wasted retries by giving workers clear criteria for when to attempt vs when to bail. Over time, task descriptions improve because the supervisor learns which ambiguities cause exits.
+
 ## CI/CD Integration
 
 ### GitHub Actions
diff --git a/TODO.md b/TODO.md
index b4adc3d6..24e68112 100644
--- a/TODO.md
+++ b/TODO.md
@@ -127,7 +127,7 @@ Tasks with no open blockers - ready to work on. Use `/ready` to refresh this lis
   - Notes: PR #642 merged. Added `--headless` flag to full-loop for autonomous worker operation.
 - [ ] t175 Fix `ambiguous_skipped_ai` evaluation — add better heuristic signals #bug #supervisor ~1h (ai:40m) ref:GH#644 assignee:marcusquinn started:2026-02-08T19:38:50Z logged:2026-02-08
-  - Notes: Recurring evaluation outcome across batches. Evaluator can't determine success/failure, skips AI eval, defaults to retry. Add heuristics: check for commits on branch, check for uncommitted changes in worktree.
+  - Notes: Recurring evaluation outcome across batches. Evaluator can't determine success/failure, skips AI eval, defaults to retry. Add heuristics: check for commits on branch, check for uncommitted changes in worktree. BLOCKED: Re-prompt dispatch failed: ambiguous_skipped_ai
 - [ ] t176 Add uncertainty guidance to worker dispatch prompt #feature #supervisor ~30m (ai:20m) ref:GH#645 assignee:marcusquinn started:2026-02-08T19:38:55Z logged:2026-02-08
   - Notes: Workers don't know when to make autonomous decisions vs flag uncertainty. Add decision framework to dispatch prompt for headless workers.
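
Reviewer note: the decision tree this diff adds to headless-dispatch.md is evaluated top to bottom, so a worker that can infer intent proceeds before the irreversibility question is ever asked. A minimal sketch of that ordering follows, assuming a hypothetical helper named `decide_on_ambiguity` that takes three yes/no answers. Neither the function nor its arguments exist in supervisor-helper.sh; in practice the worker model answers these questions itself from the prompt.

```bash
# Hypothetical illustration only: not part of supervisor-helper.sh.
# Usage: decide_on_ambiguity <can_infer_intent> <irreversible_if_wrong> <within_task_scope>
# Each argument is "yes" or "no". Prints "proceed" or "exit".
decide_on_ambiguity() {
    # 1. Intent is clear from context and codebase conventions:
    #    proceed and document the decision in the commit message.
    if [ "$1" = "yes" ]; then
        echo "proceed"
        return 0
    fi

    # 2. A wrong guess would cause irreversible damage:
    #    exit cleanly with a specific explanation.
    if [ "$2" = "yes" ]; then
        echo "exit"
        return 0
    fi

    # 3. The decision stays within this task's scope: proceed with the
    #    simplest valid approach. Otherwise exit, since cross-task
    #    architectural decisions need human input.
    if [ "$3" = "yes" ]; then
        echo "proceed"
    else
        echo "exit"
    fi
}
```

For example, `decide_on_ambiguity no no yes` follows the third branch of the tree. Whether inferable intent should really take priority over irreversible damage is a design question the tree leaves implicit, and the sketch mirrors the order as written.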