Skip to content

Conversation

@marcusquinn
Copy link
Owner

@marcusquinn marcusquinn commented Feb 6, 2026

Summary

  • Fix 10+ bugs in supervisor-helper.sh found during integration testing with t083-t094 batch (12 SEO subagent tasks, concurrency 4)
  • Add worker process cleanup to prevent orphaned opencode/node processes
  • Add kill-workers command for emergency orphan cleanup
  • Add pre-dispatch model health check with 5-minute cache
  • Add tiered model resolution (coding/eval/health)
  • Add macOS audio alerts (afplay) for task completion/failure
  • Add periodic process hygiene in pulse Phase 4

Integration Test Results: 12/12 COMPLETE

Metric Result
Tasks dispatched 12 (t083-t094)
Completed with PR 12 (100%)
PRs created #382-#391
Total retries 3 (backend infrastructure)
Orphans found/killed 80 processes (first run)
Concurrency enforced Yes (4 max)
TODO.md auto-updated Yes (complete + blocked)
Notifications sent Yes (mail + audio)
Memory patterns stored Yes (success + failure)
Batch retrospective Automatic on completion

Bugs Fixed (10)

  1. bash 3.2 compatibility: Replace local -A associative arrays with grep-based parsing
  2. Unbound variable: dispatched_count referenced before declaration in retry handler
  3. False positive evaluation: Heuristic error patterns matched generated content
  4. PR URL priority: Tasks with PR + clean exit now complete regardless of log patterns
  5. Heuristic gating: Error patterns only checked on non-zero exit code
  6. Stuck evaluating: Pulse includes evaluating state, re-evaluates without AI on timeout
  7. Backend error detection: Antigravity/quota failures detected as transient retries
  8. Backend error ordering: Success signals checked before backend errors (task can recover)
  9. Orphaned processes: 80 accumulated from workers whose parent shells exited
  10. TTS routing: say needs Accessibility perms; simplified to afplay system sounds

New Features

  • cleanup_worker_processes() - kills process trees on task completion
  • _kill_descendants() / _list_descendants() - recursive process tree management
  • kill-workers command - emergency orphan cleanup with safety guards
  • check_model_health() - pre-dispatch probe with 5-minute cache
  • resolve_model() - tiered model selection (coding=opus, eval=sonnet, health=sonnet)
  • notify_batch_progress() - macOS audio alerts on progress milestones
  • Pulse Phase 4 - periodic stale process cleanup every cycle
  • Model passthrough via -m flag to opencode (prevents router picking exhausted providers)

Follow-up Tasks Created

  • t128.8: Post-PR lifecycle (merge, postflight, deploy)
  • t128.9: Agent-review and session-review integration
  • t128.10: Automatic release at batch milestones

Closes t128.7

Summary by CodeRabbit

Release Notes

  • New Features

    • Added kill-workers command for emergency cleanup of orphaned background processes
    • Implemented model health checking and intelligent model selection system
  • Improvements

    • Enhanced error detection with backend error tracking
    • Improved user notifications for cleanup operations and batch progress updates

@gemini-code-assist
Copy link

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 6, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

Adds emergency worker-process cleanup, model selection and health-checks, and threads model hints through dispatch/pulse/evaluation flows; introduces new helper functions and a kill-workers command; enhances log metadata extraction and evaluation logic to handle backend errors and nuanced exit-code outcomes.

Changes

Cohort / File(s) Summary
Supervisor helper (single file)
.agent/scripts/supervisor-helper.sh
Large feature additions and updates within the supervisor helper: emergency cleanup functions (cleanup_worker_processes, _kill_descendants, _list_descendants, cmd_kill_workers) and kill-workers command; model resolution and health checks (resolve_model, check_model_health); build_dispatch_cmd() signature now accepts model; dispatch/pulse/evaluation flows updated to pass and use model hints; enhanced extract_log_metadata and evaluation logic to count backend errors and handle exit codes with retry/blocked logic.

Sequence Diagram

sequenceDiagram
    participant Client as Client/CLI
    participant Supervisor as supervisor-helper.sh
    participant Dispatcher as Dispatch Flow
    participant Pulse as Pulse Phase
    participant Evaluator as evaluate_worker()
    participant LogParser as extract_log_metadata()
    participant HealthSvc as check_model_health()
    participant Worker as Worker Process Tree

    Client->>Supervisor: cmd_dispatch / cmd_pulse / kill-workers
    Supervisor->>HealthSvc: resolve_model() & check_model_health(model)
    HealthSvc-->>Supervisor: model health (ok|unhealthy)
    Supervisor->>Dispatcher: build_dispatch_cmd(model) -> start Worker
    Dispatcher->>Worker: spawn PID
    Pulse->>Evaluator: evaluate_worker(PID, model)
    Evaluator->>LogParser: tail logs (last 20 lines)
    LogParser-->>Evaluator: metadata + backend_error_count + exit_code
    alt backend infra or retryable
        Evaluator->>Supervisor: decide RETRY
        Supervisor->>Worker: cleanup_worker_processes(PID)
        Supervisor->>Worker: _list_descendants / _kill_descendants
    else deterministic complete
        Evaluator-->>Pulse: TASK_COMPLETE
    else unrecoverable
        Evaluator-->>Pulse: TASK_FAILED
    end
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🧹 Small scripts hum where worker trees once grew,
Models checked, health probed, and orphans bid adieu.
PIDs fall gently, logs give whispered clues,
Dispatch marches on with a wiser, calmer muse. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title directly addresses the main change: supervisor integration testing fixes with process cleanup and retry logic improvements discovered during testing.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/t128-7-integration-testing

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 20 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 05:59:26 UTC 2026: Code review monitoring started
Fri Feb 6 05:59:26 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 20
Fri Feb 6 05:59:26 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 05:59:28 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 20
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 05:59:31 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.agent/scripts/supervisor-helper.sh (1)

3446-3464: ⚠️ Potential issue | 🟡 Minor

show_usage missing kill-workers command and updated cleanup description.

The file header (lines 16-17) documents both cleanup ... + processes and kill-workers, but show_usage() still has the old cleanup description and omits kill-workers entirely.

📝 Proposed fix
-  supervisor-helper.sh cleanup [--dry-run]           Clean up completed worktrees
+  supervisor-helper.sh cleanup [--dry-run]           Clean up completed worktrees + processes
+  supervisor-helper.sh kill-workers [--dry-run]      Kill orphaned worker processes (emergency)

Also add an Options for 'kill-workers' section lower in the help text:

Options for 'kill-workers':
  --dry-run              Show orphaned processes without killing them

As per coding guidelines, .agent/scripts/*.sh: "Clear logging and feedback" — the help output is the primary discoverability mechanism for the new emergency command.

🤖 Fix all issues with AI agents
In @.agent/scripts/supervisor-helper.sh:
- Around line 1699-1710: The script deletes "$log_tail_file" before computing
backend_error_count, so the grep against "$log_tail_file" always fails and
backend_error_count is 0; fix by moving the rm -f "$log_tail_file" to after the
backend_error_count assignment (or add an existence check such as [ -f
"$log_tail_file" ] before grepping) so the grep command that sets
backend_error_count (the line using grep -ci 'endpoints
failed\|Antigravity\|gateway.*error\|service unavailable\|503' "$log_tail_file"
...) runs against the actual file and does not get silenced by 2>/dev/null ||
echo 0. Ensure the cleanup removal of "$log_tail_file" happens only after all
counts (rate_limit_count, auth_error_count, conflict_count, timeout_count,
oom_count, backend_error_count) are computed.
🧹 Nitpick comments (2)
.agent/scripts/supervisor-helper.sh (2)

1258-1271: Descendants only receive SIGTERM — no SIGKILL fallback.

cleanup_worker_processes escalates to kill -9 for the parent PID, but _kill_descendants only sends SIGTERM. A stubborn node/opencode child process that traps or ignores SIGTERM will survive cleanup — exactly the orphan scenario this PR aims to eliminate.

♻️ Add SIGKILL fallback for descendants
 _kill_descendants() {
     local parent_pid="$1"
     local children
     children=$(pgrep -P "$parent_pid" 2>/dev/null) || true
 
     if [[ -n "$children" ]]; then
         for child in $children; do
             _kill_descendants "$child"
             kill "$child" 2>/dev/null || true
         done
+        # Brief wait then force-kill any survivors
+        sleep 0.5
+        for child in $children; do
+            kill -0 "$child" 2>/dev/null && kill -9 "$child" 2>/dev/null || true
+        done
     fi
 
     return 0
 }

1688-1690: Consider a trap for temp file cleanup in extract_log_metadata.

If the function exits abnormally between mktemp and rm -f, the temp file leaks. Under set -e, an unguarded failure could trigger this. A local trap would make this bulletproof.

♻️ Trap-based temp file cleanup
     local log_tail_file
     log_tail_file=$(mktemp)
+    trap "rm -f '$log_tail_file'" RETURN
     tail -20 "$log_file" > "$log_tail_file" 2>/dev/null || true

With this, the rm -f "$log_tail_file" on line 1699 can be removed (the trap handles it), which also naturally fixes the ordering bug with backend_error_count.

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 20 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 06:08:26 UTC 2026: Code review monitoring started
Fri Feb 6 06:08:26 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 20
Fri Feb 6 06:08:26 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 06:08:28 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 20
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 06:08:31 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 20 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 06:12:16 UTC 2026: Code review monitoring started
Fri Feb 6 06:12:17 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 20
Fri Feb 6 06:12:17 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 06:12:19 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 20
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 06:12:21 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 20 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 06:17:22 UTC 2026: Code review monitoring started
Fri Feb 6 06:17:23 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 20
Fri Feb 6 06:17:23 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 06:17:25 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 20
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 06:17:28 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 20 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 06:18:27 UTC 2026: Code review monitoring started
Fri Feb 6 06:18:28 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 20
Fri Feb 6 06:18:28 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 06:18:30 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 20
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 06:18:33 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 20 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 06:31:13 UTC 2026: Code review monitoring started
Fri Feb 6 06:31:13 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 20
Fri Feb 6 06:31:13 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 06:31:15 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 20
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 06:31:18 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 20 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 06:38:50 UTC 2026: Code review monitoring started
Fri Feb 6 06:38:50 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 20
Fri Feb 6 06:38:50 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 06:38:52 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 20
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 06:38:55 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

…racy

- Replace associative arrays (bash 4+) with grep-based metadata parsing
- Move dispatched_count declaration before Phase 1 (unbound variable fix)
- Add cleanup_worker_processes() to kill process trees on task completion
- Add kill-workers command for emergency orphan cleanup
- Add _list_descendants() for safe process tree protection
- Integrate process cleanup into pulse evaluation and cleanup command
- Prioritize PR URL over heuristic error patterns (prevents false positives)
- Gate heuristic error matching on non-zero exit code only
- Reduce error pattern search to last 20 log lines

Found during t128.7 integration testing: 80 orphaned opencode processes
accumulated from workers whose parent shells exited but child processes
(opencode binary, node MCP servers) kept running as PPID=1 orphans.
Heuristic evaluation produced false positives when task content discussed
authentication/errors (e.g., Bing Webmaster Tools API documentation).
Tasks could get stuck in 'evaluating' state when the AI eval (Tier 3)
timed out. The pulse now includes 'evaluating' in its Phase 1 query
and re-evaluates stuck tasks with --no-ai to avoid repeated timeouts.
Add backend_error_count to extract_log_metadata() for detecting
Antigravity/API gateway failures. These are always transient (provider
overloaded) and should retry, not block. Found during t128.7 when
concurrency 4 overwhelmed the opencode backend.
- Task notifications: sound + banner on complete (Glass), blocked (Basso), failed (Sosumi)
- Batch progress: notification on each pulse with progress (Hero sound on all-complete)
- Integrated into send_task_notification() and pulse summary
- Only fires on Darwin (macOS), no-op on Linux
- afplay needs direct backgrounding (not subshell) to produce sound
- Backend error grep was reading deleted temp file; now searches full log
- say for spoken status, afplay for attention sounds
Add Quota protection, over.*usage, quota reset patterns to backend
error detection. Opencode rotates providers (Antigravity, Gemini) and
all can hit quota limits during high-concurrency batch dispatch.
- Add check_model_health() that probes the model with a trivial prompt
  before creating worktrees or burning retries (15s timeout, 5m cache)
- Pass task model to build_dispatch_cmd() so opencode uses -m flag
  (prevents Antigravity router from picking exhausted providers)
- Return code 3 from dispatch = provider unavailable, stops pulse
  dispatch loop until next cycle (avoids cascading failures)
- Detects: quota exhaustion, endpoint failures, timeouts, empty responses
- Add resolve_model() with tiers: coding (opus), eval (sonnet), health (sonnet)
- Health check uses cheap sonnet probe instead of opus
- AI eval uses resolve_model('eval') instead of hardcoded model
- Pulse Phase 4: periodic cleanup of stale worker processes every cycle
- Model priority: Anthropic SOTA via opencode > claude CLI > zen free
A task can hit a backend error early, recover, and complete with a PR.
Backend error check was running first, causing false retries on
completed tasks. Now ordered: signals > PR URL > backend errors.
Background say/afplay were getting killed when the calling function
returned. nohup detaches them from the process group.
…rms)

say/TTS requires the speech synthesis daemon which macOS Sequoia won't
start for processes without Accessibility permissions. afplay works
reliably for all process contexts. TTS can be re-enabled once Tabby
is added to System Settings > Privacy & Security > Accessibility.
@marcusquinn marcusquinn force-pushed the feature/t128-7-integration-testing branch from 027ec21 to da1193c Compare February 6, 2026 16:49
@marcusquinn marcusquinn merged commit a635c85 into main Feb 6, 2026
7 of 10 checks passed
@marcusquinn marcusquinn deleted the feature/t128-7-integration-testing branch February 6, 2026 16:49
@github-actions
Copy link

github-actions bot commented Feb 6, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 23 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Fri Feb 6 16:49:34 UTC 2026: Code review monitoring started
Fri Feb 6 16:49:34 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 23
Fri Feb 6 16:49:34 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Fri Feb 6 16:49:36 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 23
  • VULNERABILITIES: 0

Generated on: Fri Feb 6 16:49:39 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud
Copy link

sonarqubecloud bot commented Feb 6, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant