feat: add worker dispatch with worktree isolation to supervisor (t128.2) #377

marcusquinn · 2026-02-06T04:39:07Z

Summary

Adds 4 new commands to supervisor-helper.sh for autonomous worker dispatch: dispatch, pulse, worker-status, cleanup
Each task gets its own worktree (wt switch -c feature/tXXX) with AI worker dispatched via opencode run --format json "/full-loop tXXX"
Concurrency semaphore (default 4 workers) enforced at both batch and global levels

Details

New Commands

Command	Description
`dispatch <task_id>`	Creates worktree, starts AI worker in background, tracks PID
`pulse [--batch id]`	Stateless supervisor cycle: evaluate completed workers, dispatch queued tasks
`worker-status <task_id>`	Check worker process liveness, log signals (FULL_LOOP_COMPLETE), PR URLs
`cleanup [--dry-run]`	Remove worktrees for terminal tasks, clean stale PID files

Key Features

Worktree isolation: Each task gets ~/Git/{repo}.feature-{tXXX}/ via wt or git worktree
Concurrency control: SUPERVISOR_MAX_CONCURRENCY env var or batch --concurrency flag
Tabby tab detection: Auto-detects TERM_PROGRAM=Tabby for visual dispatch mode
Log-based evaluation: Parses worker logs for completion signals, error patterns, PR URLs
Automatic retry/block/fail: Rate limits → retry, auth errors → blocked, max retries → failed
Mail escalation: Blocked tasks trigger mail-helper.sh notification
AI CLI auto-detection: Prefers opencode, falls back to claude
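
As an illustration of the concurrency control listed above, here is a minimal sketch of a limit check. The function name `check_concurrency`, its arguments, and the precedence rule (a batch limit overriding the env var) are assumptions made for the example. Only `SUPERVISOR_MAX_CONCURRENCY`, the default of 4, and exit code 2 come from this description.

```shell
# Hypothetical helper: decide whether another worker may be dispatched.
# Returns 0 when a slot is free and 2 when the limit is reached.
check_concurrency() {
    local running_count="$1"
    # Assumed: the value of a batch --concurrency flag, empty when not given.
    local batch_limit="${2:-}"
    local limit="${batch_limit:-${SUPERVISOR_MAX_CONCURRENCY:-4}}"

    if [ "$running_count" -ge "$limit" ]; then
        echo "Concurrency limit reached (${running_count}/${limit})" >&2
        return 2
    fi
    return 0
}
```

A caller would pass the current number of running workers, for example `check_concurrency "$running" || exit $?`, so that the limit surfaces as exit code 2.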

Quality

Zero ShellCheck violations
local var="$1" pattern throughout
Explicit returns in all functions
All new functions follow existing code conventions

Testing

Tested manually:

help - shows all new commands and options
init + add + dispatch - full lifecycle
pulse - evaluates workers and dispatches queued tasks
worker-status - shows process state and log signals
cleanup --dry-run - lists worktrees to clean
Concurrency limit enforcement (exit code 2)
Error handling for missing tasks, invalid states

Summary by CodeRabbit

New Features
- Task dispatching with support for concurrent execution and environment-based dispatch modes
- Real-time worker status monitoring with detailed reporting including logs and signal analysis
- Periodic supervisor cycles to automatically schedule tasks within concurrency limits
- Automatic cleanup and removal of completed task worktrees

Add 4 new commands to supervisor-helper.sh for autonomous worker dispatch:
- dispatch: Creates worktree per task (wt/git), starts AI worker in background, tracks PID for monitoring
- pulse: Stateless supervisor cycle - evaluates completed workers, dispatches queued tasks up to concurrency limit
- worker-status: Checks worker process liveness, log signals, PR URLs
- cleanup: Removes worktrees for terminal tasks, cleans stale PIDs

Key features:
- Concurrency semaphore (default 4, configurable via env/batch)
- Tabby tab detection for visual dispatch mode
- Log-based outcome evaluation (FULL_LOOP_COMPLETE, error patterns)
- Automatic retry/block/fail classification
- Mail escalation for blocked tasks
- opencode/claude CLI auto-detection

Zero ShellCheck violations.

coderabbitai · 2026-02-06T04:39:36Z

Walkthrough

Introduces comprehensive task dispatch and lifecycle management to a supervisor helper script, enabling task worktree creation, parallel worker execution, state transitions, status monitoring, periodic evaluation cycles, and automated cleanup.

Changes

Cohort / File(s)	Summary
Task Dispatch & Lifecycle Management `.agent/scripts/supervisor-helper.sh`	Added 10 functions implementing end-to-end task dispatch orchestration: dispatch mode detection (headless/tabby), AI CLI resolution (opencode/claude fallback), dispatch command construction, Git worktree management (creation/cleanup with wt/git fallback), dispatch orchestration with concurrency limits and state transitions, worker status reporting with PID liveness checks and log analysis, worker outcome evaluation (complete/retry/blocked/failed), periodic supervisor pulse cycles, and completed task cleanup with optional dry-run mode. Expanded usage documentation for new commands and options.

Sequence Diagram(s)

sequenceDiagram
    actor Supervisor
    participant DispatchCmd as cmd_dispatch
    participant WorktreeOps as Worktree Mgmt
    participant AICliCmd as AI CLI
    participant Worker as Background Worker
    participant Database as State DB

    Supervisor->>DispatchCmd: dispatch task
    DispatchCmd->>Database: validate input & check concurrency
    DispatchCmd->>WorktreeOps: create/reuse worktree
    WorktreeOps-->>DispatchCmd: worktree ready
    DispatchCmd->>Database: transition state
    DispatchCmd->>AICliCmd: build dispatch command
    AICliCmd-->>DispatchCmd: command constructed
    DispatchCmd->>Worker: spawn background worker (fork)
    Worker->>Database: store PID
    Worker-->>Supervisor: backgrounded
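
The spawn step at the end of the diagram above could look roughly like the following sketch. The name `spawn_worker` and its argument order are hypothetical. The `EXIT:<code>` log marker mirrors the snippet quoted in the review comments further down, and the PID file layout is an assumption.

```shell
# Hypothetical sketch: start a worker in the background and record its PID.
spawn_worker() {
    local work_dir="$1"
    local log_file="$2"
    local pid_file="$3"
    shift 3

    (
        # Run the worker command inside the worktree and append its exit
        # code to the log as an EXIT:<code> marker.
        cd "$work_dir" || { echo "EXIT:1" >>"$log_file"; exit 1; }
        rc=0
        "$@" >"$log_file" 2>&1 || rc=$?
        echo "EXIT:${rc}" >>"$log_file"
    ) &

    # $! is the PID of the backgrounded subshell, not of the AI CLI itself.
    echo "$!" >"$pid_file"
    return 0
}
```

In this shape the state transition to "dispatched" can be issued after `spawn_worker` returns, once the PID file exists.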

sequenceDiagram
    actor Supervisor
    participant Pulse as cmd_pulse
    participant StatusCmd as cmd_worker_status
    participant EvalCmd as evaluate_worker
    participant Database as State DB
    participant WorkerLog as Worker Logs
    participant CleanupCmd as cmd_cleanup

    Supervisor->>Pulse: periodic pulse cycle
    Pulse->>StatusCmd: query active workers
    StatusCmd->>Database: fetch PID & metadata
    StatusCmd->>WorkerLog: read logs (signals, exit code)
    StatusCmd-->>Pulse: worker status report
    Pulse->>EvalCmd: evaluate worker outcomes
    EvalCmd->>WorkerLog: analyze signals & output
    EvalCmd-->>Pulse: outcome (complete/retry/blocked/failed)
    Pulse->>CleanupCmd: dispatch new tasks (up to limits)
    Pulse->>CleanupCmd: cleanup completed worktrees
    CleanupCmd->>Database: purge stale PIDs
    CleanupCmd-->>Pulse: cleanup summary
    Pulse-->>Supervisor: pulse report
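
The PID liveness check that the status step relies on can be sketched as below. The function name `is_worker_alive` and the one-PID-per-file layout are assumptions. `kill -0` is the standard way to probe a process without signalling it.

```shell
# Hypothetical sketch: report whether the worker recorded in a PID file is running.
is_worker_alive() {
    local pid_file="$1"
    local pid

    [ -f "$pid_file" ] || return 1
    pid="$(cat "$pid_file")"

    # Reject empty or non-numeric content, e.g. a truncated PID file.
    case "$pid" in
        '' | *[!0-9]*) return 1 ;;
    esac

    # Signal 0 delivers nothing; it only checks that the process exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null || return 1
    return 0
}
```

A PID can be reused by an unrelated process after the worker exits, so a check like this is best combined with the log signals (for example the exit marker) rather than trusted on its own.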

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

chore: implement agent-review session improvements #27: Enhances Git worktree cleanup detection and removal, directly complementing the new create_task_worktree and cleanup_task_worktree functions.
feat(tools): add worktrunk as default worktree tool with fallback #109: Introduces Worktrunk (wt) CLI integration that is explicitly preferred and used as the primary worktree management tool in the new dispatch lifecycle functions.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding worker dispatch functionality with worktree isolation to the supervisor script, including the task reference (t128.2).
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

github-actions · 2026-02-06T04:39:52Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 7 code smells

[INFO] Recent monitoring activity:
Fri Feb 6 04:39:45 UTC 2026: Code review monitoring started
Fri Feb 6 04:39:46 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 7
Fri Feb 6 04:39:46 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 04:39:48 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 7
VULNERABILITIES: 0

Generated on: Fri Feb 6 04:39:51 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-06T04:40:07Z

Quality Gate passed

Issues
7 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

coderabbitai

Actionable comments posted: 7

🤖 Fix all issues with AI agents

In @.agent/scripts/supervisor-helper.sh:
- Around line 1287-1292: The task is transitioned to "dispatched" via
cmd_transition "$task_id" "dispatched" before the worker is actually started,
which can leave tasks stuck if subsequent commands fail; move the cmd_transition
call so it happens only after the worker is successfully backgrounded and the
PID file (and any worktree setup like mkdir -p) is created, or alternatively add
a rollback that calls cmd_transition back to the previous state on any failure
after the transition; update references to cmd_transition, "$task_id",
"$worktree_path", "$branch_name", and "$log_file" and ensure the PID/daemon
startup block is the gate for the successful transition.
- Around line 1049-1068: The comment for detect_dispatch_mode mentions
"interactive" but the function never returns it; update detect_dispatch_mode to
handle an "interactive" path by first checking if SUPERVISOR_DISPATCH_MODE ==
"interactive" and returning "interactive", and then if not explicitly set,
detect an interactive tty (e.g., using the shell test -t 1 or [[ -t 1 ]]) and
return "interactive" when stdout is a TTY; keep the existing checks for
"headless" and "tabby" and preserve the default "headless" fallback.
- Around line 1614-1626: The loop silences all cmd_dispatch stderr and ignores
non-concurrency failures; update the dispatch logic in the while reading
next_tasks so that you do not redirect stderr to /dev/null, capture the exit
code from cmd_dispatch (dispatch_exit), and then: if dispatch_exit == 2 keep the
existing log_info "Concurrency limit reached, stopping dispatch" and break;
otherwise increment a failed_count (e.g., failed_count=$((failed_count+1))) and
emit a clear log (use log_error or log_info) including the tid and dispatch_exit
to surface the failure; keep incrementing dispatched_count only on success.
Ensure you reference cmd_dispatch, next_tasks, dispatched_count, failed_count
and dispatch_exit in the change.
- Around line 1555-1559: The task state change is being swallowed because tasks
in 'dispatched' are never allowed to move to 'evaluating' and the failing
cmd_transition call is suppressed; update VALID_TRANSITIONS to include the
'dispatched:evaluating' pair (or, alternatively, before invoking cmd_transition
"$tid" "evaluating" add an explicit cmd_transition "$tid" "running" to normalize
state) so the cmd_transition call for evaluating can succeed; ensure the change
touches the VALID_TRANSITIONS array/variable and leave the cmd_transition "$tid"
"evaluating" call as-is (remove the need for the "|| true" suppression if the
transition will succeed).
- Around line 1574-1583: The retry path currently re-transitions tasks to
"dispatched" causing them to be re-evaluated without a new worker; update the
valid transitions map to include a "retrying:queued" transition (add
"retrying:queued" to the transitions array/definition near where transitions are
declared) and change the retry block that calls cmd_transition "$tid"
"dispatched" to instead call cmd_transition "$tid" "queued" so Phase 2 will
re-dispatch the task and create a fresh worker/log; keep the existing retrying
transition to increment counters and preserve the error handling/log_error
behavior for failed re-queue attempts.
- Around line 1456-1463: The handler that detects FULL_LOOP_COMPLETE currently
emits the literal sentinel "no_pr" into outcome_detail via the pr_url variable
(see variable pr_url and the echo "complete:${pr_url:-no_pr}" in
supervisor-helper.sh), which causes the string "no_pr" to be stored as a PR URL;
change the emission to leave the detail empty when no PR is found (e.g., emit
"complete:" or an empty pr field instead of "no_pr") and update the cmd_pulse
invocation (the call that adds --pr-url) to only include the --pr-url flag when
the pr_url value is non-empty so nothing like "no_pr" is ever passed/stored.
Ensure you update both the pr_url assignment/echo in supervisor-helper.sh and
the conditional around the --pr-url argument in cmd_pulse.
- Around line 1310-1321: The Tabby path currently always starts a background
worker because the printf escape is forced to succeed with "|| true", causing
dual execution; remove the unconditional truthy fallback and make the fallback
conditional on the OSC 1337 escape result: call printf '\e]1337;NewTab=%s\a'
"$tab_cmd" without "|| true", capture its exit status (or use a boolean like
opened_tab) and only execute the background subshell "(cd "$worktree_path" &&
"${cmd_parts[@]}" > "$log_file" 2>&1; echo "EXIT:$?" >> "$log_file") &" when the
printf/Tabby open failed; keep variables dispatch_mode, tab_cmd, cmd_parts,
worktree_path and log_file as the referenced symbols to locate and update the
logic.
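
To make the two transition-related items above concrete, here is a minimal sketch of a "from:to" transition table and its lookup. Only the pairs `dispatched:evaluating` and `retrying:queued` are named in the comments above. The remaining pairs, the space-separated string form, and the function name `is_valid_transition` are assumptions for illustration and may not match the real VALID_TRANSITIONS definition.

```shell
# Assumed transition table as "from:to" pairs. Only dispatched:evaluating and
# retrying:queued come from the review; the other pairs are illustrative.
VALID_TRANSITIONS="queued:dispatched dispatched:running dispatched:evaluating running:evaluating evaluating:complete evaluating:retrying evaluating:blocked evaluating:failed retrying:queued"

# Hypothetical lookup: succeed only if the requested pair is in the table.
is_valid_transition() {
    local from_state="$1"
    local to_state="$2"
    local pair

    for pair in $VALID_TRANSITIONS; do
        if [ "$pair" = "${from_state}:${to_state}" ]; then
            return 0
        fi
    done
    return 1
}
```

With a table like this, a retry goes back through `queued` and is picked up by the next dispatch phase, while `retrying` to `dispatched` is rejected instead of silently re-evaluating a task that has no new worker.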

🧹 Nitpick comments (3)

.agent/scripts/supervisor-helper.sh (3)
1091-1115: worktree_path and log_file parameters are accepted but never used.

Parameters $2 and $3 are passed by cmd_dispatch (Line 1305) but are not referenced inside this function. This is misleading—future contributors may assume the command incorporates them. Either use them or drop them from the signature.
♻️ Proposed fix — remove unused params
 build_dispatch_cmd() {
     local task_id="$1"
-    local worktree_path="$2"
-    local log_file="$3"
-    local ai_cli="$4"
+    local ai_cli="$2"
And update the call site at Line 1305:
-    done < <(build_dispatch_cmd "$task_id" "$worktree_path" "$log_file" "$ai_cli")
+    done < <(build_dispatch_cmd "$task_id" "$ai_cli")
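
For context, a two-parameter version of the function might look like the sketch below. The body is an assumption: it emits one argument per line so that a call site like the one in the diff can read them back, and it only covers the `opencode run --format json "/full-loop tXXX"` form quoted in the PR summary. The claude fallback is left out because its exact invocation is not shown here.

```shell
# Hypothetical sketch: print the worker command, one argument per line.
build_dispatch_cmd() {
    local task_id="$1"
    local ai_cli="$2"

    case "$ai_cli" in
        opencode)
            printf '%s\n' opencode run --format json "/full-loop ${task_id}"
            ;;
        *)
            # Other CLIs are deliberately not modelled in this sketch.
            echo "unsupported AI CLI: ${ai_cli}" >&2
            return 1
            ;;
    esac
    return 0
}
```

Emitting one argument per line keeps the prompt `/full-loop tXXX` together as a single argument when the caller rebuilds the command.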
1480-1494: Error-pattern grep scans the entire log — risk of false-positive classification.

Patterns like rate.limit|429 and permission denied|401 are matched anywhere in the log. A transient 429 that was recovered from mid-session would still trigger retry:rate_limited if the worker later exited non-zero without FULL_LOOP_COMPLETE. Consider narrowing the search to the tail of the log (e.g., last 50 lines) to reduce false positives.
♻️ Proposed narrowing
-    if grep -qi 'rate.limit\|429\|too many requests' "$tlog" 2>/dev/null; then
+    if tail -n 50 "$tlog" 2>/dev/null | grep -qi 'rate.limit\|429\|too many requests'; then
         echo "retry:rate_limited"
         return 0
     fi
 
-    if grep -qi 'permission denied\|unauthorized\|403\|401' "$tlog" 2>/dev/null; then
+    if tail -n 50 "$tlog" 2>/dev/null | grep -qi 'permission denied\|unauthorized\|403\|401'; then
         echo "blocked:auth_error"
         return 0
     fi
 
-    if grep -qi 'merge conflict\|CONFLICT' "$tlog" 2>/dev/null; then
+    if tail -n 50 "$tlog" 2>/dev/null | grep -qi 'merge conflict\|CONFLICT'; then
         echo "blocked:merge_conflict"
         return 0
     fi
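
Pulled together, the narrowed classification could be sketched as a single function. The name `classify_error_tail`, the configurable line count, and the `failed:unclassified` fallback label are assumptions. The three outcome strings and their patterns are taken from the diff above.

```shell
# Hypothetical sketch: classify a failed worker by the tail of its log only.
classify_error_tail() {
    local log_file="$1"
    local tail_lines="${2:-50}"
    local recent

    # Read the last N lines once; an unreadable log yields an empty string.
    recent="$(tail -n "$tail_lines" "$log_file" 2>/dev/null)"

    if printf '%s\n' "$recent" | grep -Eqi 'rate.limit|429|too many requests'; then
        echo "retry:rate_limited"
    elif printf '%s\n' "$recent" | grep -Eqi 'permission denied|unauthorized|403|401'; then
        echo "blocked:auth_error"
    elif printf '%s\n' "$recent" | grep -Eqi 'merge conflict|CONFLICT'; then
        echo "blocked:merge_conflict"
    else
        # Assumed fallback label for errors that match no known pattern.
        echo "failed:unclassified"
    fi
    return 0
}
```

A 429 that appears early in a long log and was recovered from no longer affects the result, because only the last lines are searched.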
1236-1257: TOCTOU window in concurrency enforcement.

The concurrency check (cmd_running_count) and the subsequent dispatch + state transition are not atomic. Two concurrent invocations (e.g., parallel cmd_pulse runs or manual + cron overlap) could both pass the check and exceed the limit. Given that the pulse is typically cron-based and single-threaded, this is low risk, but worth noting.

A lightweight mitigation would be a filesystem lock (e.g., flock) around the pulse's dispatch phase.
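
A minimal sketch of that mitigation, assuming the util-linux `flock` utility is available (it is not part of a default macOS install). The function name `with_dispatch_lock`, the use of file descriptor 9, and the exit code 75 for "lock busy" are choices made for this example.

```shell
# Hypothetical sketch: run a command while holding an exclusive lock.
with_dispatch_lock() {
    local lock_file="$1"
    shift

    (
        # -n makes flock fail immediately instead of waiting when another
        # process already holds the lock on this file.
        flock -n 9 || exit 75
        "$@"
    ) 9>"$lock_file"
    return $?
}
```

Wrapping the dispatch phase this way, for example `with_dispatch_lock "$lock_file" run_dispatch_phase` with a hypothetical `run_dispatch_phase`, means an overlapping pulse skips dispatch instead of racing past the concurrency check. The lock is released automatically when the subshell exits.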

coderabbitai · 2026-02-06T04:46:52Z