60 changes: 60 additions & 0 deletions .agents/scripts/commands/full-loop.md
@@ -122,9 +122,69 @@ fi
| `status:done` | PR merged | sync-on-pr-merge workflow (automated) |
| `status:verify-failed` | Post-merge verification failed | Worker (contextual) |
| `status:needs-testing` | Code merged, needs manual testing | Worker (contextual) |
| `dispatched:{model}` | Worker started on task | **Worker (Step 0.7)** |

Only `status:available`, `status:claimed`, and `status:done` are fully automated. All other transitions are set contextually by the agent that best understands the current state. When setting a new status label, always remove the prior status labels to keep exactly one active.
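
As a rough illustration, a helper along these lines could enforce the one-active-status invariant. This is a dry-run sketch that prints the `gh` commands instead of running them; the function name is illustrative, and the status list mirrors the table above.

```bash
# Dry-run sketch: print the gh commands that would swap an issue to a new status,
# removing every other status:* label so exactly one stays active.
set_status() {
  local issue="$1" new="$2" s
  local statuses="available claimed in-progress done verify-failed needs-testing"
  for s in $statuses; do
    [[ "$s" == "$new" ]] && continue
    echo "gh issue edit ${issue} --remove-label status:${s}"
  done
  echo "gh issue edit ${issue} --add-label status:${new}"
}
```

For example, `set_status 42 in-progress` prints five removals followed by a single add.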

### Step 0.7: Label Dispatch Model — `dispatched:{model}`

After setting `status:in-progress`, tag the issue with the model running this worker. This provides observability into which model solved each task — essential for cost/quality analysis.

**Detect the current model** from the system prompt or environment. The model name appears in the system prompt as "You are powered by the model named X" or via `ANTHROPIC_MODEL` / `CLAUDE_MODEL` environment variables. Map to a short label:

| Model contains | Label |
|----------------|-------|
| `opus` | `dispatched:opus` |
| `sonnet` | `dispatched:sonnet` |
| `haiku` | `dispatched:haiku` |
| unknown | skip labeling |

```bash
# Detect model — check env vars first, fall back to known model identity
# (use ${VAR:-} so the loop is safe under `set -u` when the vars are unset)
MODEL_SHORT=""
for VAR in "${ANTHROPIC_MODEL:-}" "${CLAUDE_MODEL:-}"; do
  case "$VAR" in
    *opus*) MODEL_SHORT="opus" ;;
    *sonnet*) MODEL_SHORT="sonnet" ;;
    *haiku*) MODEL_SHORT="haiku" ;;
  esac
  [[ -n "$MODEL_SHORT" ]] && break
done

# Fallback: the agent knows its own model from the system prompt.
# If env vars are empty, set MODEL_SHORT based on your model identity.
# Example: if you are claude-opus-4-6, set MODEL_SHORT="opus"

if [[ -n "$MODEL_SHORT" && -n "${ISSUE_NUM:-}" && "$ISSUE_NUM" != "null" ]]; then
  REPO=$(gh repo view --json nameWithOwner -q .nameWithOwner)

  # Remove stale dispatched:* labels so attribution is unambiguous
  for OLD in "dispatched:opus" "dispatched:sonnet" "dispatched:haiku"; do
    if [[ "$OLD" != "dispatched:${MODEL_SHORT}" ]]; then
      if ! gh issue edit "$ISSUE_NUM" --repo "$REPO" --remove-label "$OLD" 2>/dev/null; then
        : # Label not present — expected, not an error
      fi
    fi
  done

  # Create the label if it doesn't exist yet
  if ! LABEL_ERR=$(gh label create "dispatched:${MODEL_SHORT}" --repo "$REPO" \
      --description "Task dispatched to ${MODEL_SHORT} model" --color "1D76DB" 2>&1); then
    # "already exists" is expected — only warn on other failures
    if [[ "$LABEL_ERR" != *"already exists"* ]]; then
      echo "[dispatch-label] Warning: label create failed for dispatched:${MODEL_SHORT} on ${REPO}: ${LABEL_ERR}" >&2
    fi
  fi

  if ! EDIT_ERR=$(gh issue edit "$ISSUE_NUM" --repo "$REPO" \
      --add-label "dispatched:${MODEL_SHORT}" 2>&1); then
    echo "[dispatch-label] Warning: could not add dispatched:${MODEL_SHORT} to issue #${ISSUE_NUM} on ${REPO}: ${EDIT_ERR}" >&2
  fi
fi
```

**For interactive sessions** (not headless dispatch): If you are working on a task interactively and the issue exists, apply the label based on your own model identity. This ensures all task work is attributed, not just headless dispatches.

### Step 1: Auto-Branch Setup

The loop automatically handles branch setup when on main/master:
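The details of that setup are collapsed in this view, but conceptually it amounts to something like the following dry-run sketch, which prints the git commands it would run. The `issue-<num>-<slug>` naming scheme and both function names are assumptions, not part of this file.

```bash
# Turn an issue title into a lowercase hyphenated slug
slugify() {
  tr '[:upper:]' '[:lower:]' <<<"$1" | tr -cs 'a-z0-9' '-' | sed 's/^-*//; s/-*$//'
}

# Dry-run sketch: on main/master, derive a work branch and print the git commands
auto_branch() {
  local issue="$1" title="$2" current="$3"
  if [[ "$current" == "main" || "$current" == "master" ]]; then
    local branch="issue-${issue}-$(slugify "$title")"
    echo "git checkout -b ${branch}"
    echo "git push -u origin ${branch}"
  fi
}
```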
40 changes: 26 additions & 14 deletions .agents/scripts/commands/pulse.md
@@ -8,12 +8,12 @@ You are the supervisor pulse. You run every 2 minutes via launchd — **there is

**AUTONOMOUS EXECUTION REQUIRED:** You MUST execute every step including dispatching workers. NEVER present a summary and stop. NEVER ask "what would you like to action/do/work on?" — there is nobody to answer. Your output is a log of actions you ALREADY TOOK (past tense), not a menu of options. If you finish without having run `opencode run` or `gh pr merge` commands, you have failed.

**TARGET: fill all available worker slots.** The max worker count is calculated dynamically by `pulse-wrapper.sh` based on available RAM (1 GB per worker, 8 GB reserved for OS + user apps, capped at 8). Read it at the start of each pulse — do not hardcode a number.
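
The wrapper's arithmetic is presumably along these lines. This is a sketch, not the actual `pulse-wrapper.sh` code; the macOS memory probe and the exact rounding are assumptions.

```bash
# RAM-based worker cap: 1 GB per worker after an 8 GB reserve, capped at 8
max_workers() {
  local avail_gb="$1" reserve=8 cap=8 n
  n=$(( avail_gb - reserve ))   # 1 GB per worker beyond the reserve
  (( n < 0 )) && n=0
  (( n > cap )) && n=$cap
  echo "$n"
}

# Assumed macOS memory probe: free pages * page size, in whole GB
free_gb() {
  local page_size free_pages
  page_size=$(sysctl -n hw.pagesize)
  free_pages=$(vm_stat | awk '/Pages free/ { gsub(/\./, "", $3); print $3 }')
  echo $(( page_size * free_pages / 1073741824 ))
}
```

So a 16 GB machine with everything free would run `max_workers 16` = 8, while a 10 GB-free machine gets 2 slots.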

Your job is simple:

1. Check the circuit breaker. If tripped, exit immediately.
2. Read the dynamic max workers limit and count running workers. If all slots are full, continue to Step 2 (you can still merge ready PRs and observe outcomes).
3. Fetch open issues and PRs from the managed repos.
4. **Observe outcomes** — check for stuck or failed work and file improvement issues.
5. Pick the highest-value items to fill available worker slots.
@@ -22,7 +22,7 @@ Your job is simple:

That's it. Minimal state (circuit breaker only). No databases. GitHub is the state DB.

**Max concurrency is dynamic** — determined by available RAM at pulse time. See Step 1.

## Step 0: Circuit Breaker Check (t1331)

@@ -36,16 +36,28 @@

The circuit breaker trips after 3 consecutive task failures (configurable via `SUPERVISOR_CIRCUIT_BREAKER_THRESHOLD`). It auto-resets after 30 minutes or on manual reset (`circuit-breaker-helper.sh reset`). Any task success resets the counter to 0.
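
The failure-counting behavior amounts to the following sketch. The real state lives in `circuit-breaker-helper.sh`'s own files; the function name here is illustrative.

```bash
# Behavioral sketch: trip at THRESHOLD consecutive failures, reset on any success
THRESHOLD="${SUPERVISOR_CIRCUIT_BREAKER_THRESHOLD:-3}"
FAILS=0

record_result() {
  # $1 is "pass" or "fail"; prints the breaker state after recording it
  if [[ "$1" == "fail" ]]; then
    FAILS=$(( FAILS + 1 ))
  else
    FAILS=0   # any task success resets the counter to 0
  fi
  if (( FAILS >= THRESHOLD )); then
    echo "tripped"
  else
    echo "closed"
  fi
}
```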

## Step 1: Read Max Workers and Count Running Workers

```bash
# Read dynamic max workers (calculated by pulse-wrapper.sh from available RAM).
# Falls back to 4 if the file doesn't exist (conservative default).
MAX_WORKERS=$(cat ~/.aidevops/logs/pulse-max-workers 2>/dev/null || echo 4)

# Count running full-loop workers — IMPORTANT: each opencode run spawns TWO
# processes (a node launcher + the .opencode binary). Only count the .opencode
# binaries to avoid 2x inflation. Filter by the binary path, not just '/full-loop'.
WORKER_COUNT=0
while IFS= read -r pid; do
  cmd=$(ps -p "$pid" -o command= 2>/dev/null || true)
  if echo "$cmd" | grep -q '\.opencode'; then
    WORKER_COUNT=$((WORKER_COUNT + 1))
  fi
done < <(pgrep -f '/full-loop' 2>/dev/null || true)
echo "Running workers: $WORKER_COUNT / $MAX_WORKERS"
```

- If `WORKER_COUNT >= MAX_WORKERS`: set `AVAILABLE=0` — no new workers, but continue to Step 2 (merges and outcome observation don't need slots).
- Otherwise: calculate `AVAILABLE=$((MAX_WORKERS - WORKER_COUNT))` — this is how many workers you can dispatch.
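
The bullet logic, as a small function that keeps the arithmetic in one place (the function name is illustrative):

```bash
# Slot arithmetic from the bullets above
available_slots() {
  local count="$1" max="$2"
  if (( count >= max )); then
    echo 0
  else
    echo $(( max - count ))
  fi
}
```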

## Step 2: Fetch GitHub State

@@ -87,7 +99,7 @@ If you see a pattern (same type of failure, same error), create an improvement i

**Duplicate work:** If two open PRs target the same issue or have very similar titles, flag it by commenting on the newer one.

**Long-running workers:** Check the runtime of each running worker process with `ps axo pid,etime,command | grep '/full-loop' | grep '\.opencode'` (filter to `.opencode` binaries only — each worker has a `node` launcher + `.opencode` binary; only check the binary to avoid double-counting). The `etime` column shows elapsed time (format: `HH:MM` or `D-HH:MM:SS`). Parse it to get hours.
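
One way to get whole hours out of `etime` (a sketch; note that `etime` is actually `MM:SS` under an hour, `HH:MM:SS` under a day, and `D-HH:MM:SS` beyond that):

```bash
# Parse ps etime (MM:SS, HH:MM:SS, or D-HH:MM:SS) into whole hours
parse_etime_hours() {
  local e="$1" days=0 rest h=0
  if [[ "$e" == *-* ]]; then
    days="${e%%-*}"
    rest="${e#*-}"
  else
    rest="$e"
  fi
  local IFS=':'
  set -- $rest
  if [[ $# -eq 3 ]]; then
    h=$(( 10#$1 ))   # force base 10 so "08" is not read as octal
  fi
  echo $(( 10#$days * 24 + h ))
}
```

A worker showing `2-08:15:00` has been up 56 hours; one showing `45:12` is only 45 minutes in.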

Workers now have a self-imposed 2-hour time budget (see full-loop.md rule 8), but the supervisor enforces a safety net. For any worker running 2+ hours, **assess whether it's making progress** before deciding to kill:

@@ -233,7 +245,7 @@ This turns blocked issues from a dead end into an actively managed queue.

**Skip issues that already have an open PR:** If an issue number appears in the title or branch name of an open PR, a worker has already produced output for it. Do not dispatch another worker for the same issue. Check the PR list you already fetched — if any PR's `headRefName` or `title` contains the issue number, skip that issue.

**Deduplication — check running processes:** Before dispatching, check `ps axo command | grep '/full-loop' | grep '\.opencode'` for any running worker whose command line contains the issue/PR number you're about to dispatch (filter to `.opencode` binaries to avoid double-counting node launchers). Different pulse runs may have used different title formats for the same work (e.g., "issue-2300-simplify-infra-scripts" vs "Issue #2300: t1337 Simplify Tier 3"). Extract the canonical number (e.g., `2300`, `t1337`) and check if ANY running worker references it. If so, skip — do not dispatch a duplicate.
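
A sketch of that check, factored so the matching logic is testable in isolation (the process-snapshot argument stands in for `ps axo command` output; the function name is illustrative):

```bash
# Success if any running .opencode full-loop worker references the canonical number
worker_running_for() {
  local num="$1" procs="$2"
  grep '/full-loop' <<<"$procs" | grep '\.opencode' | grep -q -- "$num"
}

# Usage against the live process table:
#   PROCS=$(ps axo command)
#   worker_running_for 2300 "$PROCS" && echo "skip: issue 2300 already has a worker"
```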

**Blocker-chain validation (MANDATORY before dispatch):** Before dispatching a worker for any issue, validate that its entire dependency chain is resolved — not just the immediate `status:blocked` label. This prevents the #1 cause of workers running 3-9 hours without producing PRs: they start work on tasks whose prerequisites aren't merged yet, then spin trying to work around missing schemas, APIs, or migrations.

@@ -276,7 +288,7 @@ If you're unsure whether it needs decomposition, dispatch the worker — but pre

## Step 4: Execute Dispatches NOW

**CRITICAL: Do not stop after Step 3. Do not present a summary and wait. Execute the commands below for every item you selected in Step 3. The goal is MAX_WORKERS concurrent workers at all times — if you have available slots, fill them. An idle slot is wasted capacity.**

### For PRs that just need merging (CI green, approved):

@@ -496,15 +508,15 @@ fi

The strategic review does what sonnet cannot: meta-reasoning about queue health, resource utilisation, stuck chains, stale state, and systemic issues. It can take corrective actions (merge ready PRs, file issues, clean worktrees, dispatch high-value work).

This does NOT count against the MAX_WORKERS concurrency limit — it's a supervisor function, not a task worker.

See `scripts/commands/strategic-review.md` for the full review prompt.

## What You Must NOT Do

- Do NOT maintain state files, databases, or logs (the circuit breaker, stuck detection, and opus review helpers manage their own state files — those are the only exceptions)
- Do NOT auto-kill workers based on stuck detection alone — stuck detection (Step 2b) is advisory only. The kill decision is separate (Step 2a) and requires your judgment
- Do NOT dispatch more workers than available slots (max MAX_WORKERS total, read from Step 1)
- Do NOT try to implement anything yourself — you are the supervisor, not a worker
- Do NOT read source code, run tests, or do any task work
- Do NOT retry failed workers — the next pulse will pick up where things left off