
t1336: Supervisor self-diagnosis — pipeline health, schema validation, issue tag drift#2276

Merged
marcusquinn merged 1 commit into main from feature/supervisor-self-diagnosis
Feb 25, 2026

@marcusquinn marcusquinn commented Feb 25, 2026

Summary

  • Adds three new sanity check sections (6-8) to detect systemic pipeline failures that previously went unnoticed for days
  • Adds Phase 3 throughput metrics to AI health context so the reasoner can see whether lifecycle processing is working
  • Updates sanity prompt to instruct AI about new diagnostic sections with appropriate action recommendations

Changes

sanity-check.sh: _build_sanity_state_snapshot()

  • Section 6 (Pipeline Phase Health): Parses supervisor log for Phase 3 evaluated/actioned counts, zero-eval streaks, "could not gather state" error counts, dispatch stalls, and underutilisation events
  • Section 7 (Schema Validation): Runs PRAGMA table_info(tasks) and verifies all required columns exist. This catches the exact class of bug fixed in PR #2275 (Phase 3 AI lifecycle completely broken because gather_task_state referenced a non-existent worker_pid column), where worker_pid was still referenced after the column had been removed
  • Section 8 (Cross-Repo Issue Tag Truthfulness): Compares GitHub issue labels (status:*) against DB task state for all managed repos, flagging drift
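The Section 7 check can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `missing_columns` and the column names in the usage line are hypothetical. It only assumes the default `sqlite3` list output for `PRAGMA table_info`, where the column name is the second `|`-separated field.

```shell
#!/usr/bin/env bash
# Sketch only: helper name and required-column list are hypothetical.

# Reads `PRAGMA table_info(<table>)` output on stdin (sqlite3 list mode:
# cid|name|type|notnull|dflt_value|pk) and prints every required column,
# passed as arguments, that is absent. Returns 1 if any column is missing.
missing_columns() {
	local present col rc=0
	# Field 2 of each row is the column name.
	present="$(cut -d'|' -f2)"
	for col in "$@"; do
		# -x: whole-line match, -F: fixed string, so "id" does not match "task_id".
		if ! grep -qxF -- "$col" <<<"$present"; then
			echo "MISSING: $col"
			rc=1
		fi
	done
	return "$rc"
}
```

A call might look like `sqlite3 "$db" "PRAGMA table_info(tasks);" | missing_columns id status`, where `$db` and the column list stand in for whatever the supervisor actually requires. Reading the PRAGMA output from stdin keeps the check testable without a live database.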

sanity-check.sh: _build_sanity_prompt()

  • Added items 6-8 to the "What to Look For" section with guidance on appropriate actions (log_only for pipeline/schema issues, label sync triggers for tag drift)
  • Added example log_only action for pipeline failures

ai-context.sh: build_health_context()

  • Added Phase 3 throughput metrics: last evaluated count, last actioned count, zero-eval rate over last 50 lifecycle entries
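As a rough sketch of how such metrics could be derived, the function below scans log text for lifecycle entries and summarises the last 50. The log line shape (`ai-lifecycle ... evaluated N ... actioned M`), the function name `phase3_throughput`, and the output format are assumptions for illustration; the real supervisor.log format may differ.

```shell
#!/usr/bin/env bash
# Sketch only: the log line format and output format are assumed, not taken from the PR.

# Reads supervisor log text on stdin. Prints the last evaluated/actioned
# counts and the zero-eval rate over the last N lifecycle entries (default 50).
phase3_throughput() {
	local window="${1:-50}"
	grep -E 'ai-lifecycle.*evaluated [0-9]+.*actioned [0-9]+' |
		tail -n "$window" |
		sed -E 's/.*evaluated ([0-9]+).*actioned ([0-9]+).*/\1 \2/' |
		awk '
			{ total++; if ($1 == 0) zero++; last_eval = $1; last_act = $2 }
			END {
				if (total == 0) { print "no lifecycle entries found"; exit }
				printf "last_evaluated=%d last_actioned=%d zero_eval_rate=%d/%d\n",
					last_eval, last_act, zero, total
			}'
}
```

To handle a missing log gracefully, a caller could guard the invocation, for example `[ -f "$log" ] && phase3_throughput 50 <"$log"`, with `$log` being a placeholder for the supervisor log path.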

Motivation

Phase 3 (ai-lifecycle) was silently broken for days because gather_task_state() referenced a non-existent worker_pid column (fixed in PR #2275). The supervisor had no way to detect this — it just saw 0 tasks evaluated and moved on. These diagnostics ensure such failures are surfaced to the AI sanity checker and reasoner.

Testing

  • ShellCheck clean (zero new violations)
  • Verified against current supervisor.log format

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Phase 3 throughput metrics and performance diagnostics to system health reporting
    • Introduced comprehensive pipeline health validation checks and cross-repository issue verification
  • Improvements

    • Enhanced error handling for missing diagnostic data with clearer reporting
    • Improved organization and readability of diagnostic output sections

… issue tag drift detection (t1336)

Add three new sanity check sections to detect systemic pipeline failures:
- Section 6: Pipeline phase health — detects Phase 3 zero-eval streaks,
  gather_task_state errors, dispatch stalls, underutilisation
- Section 7: Schema validation — verifies required columns exist in tasks
  table via PRAGMA (catches the exact worker_pid bug from PR #2275)
- Section 8: Cross-repo issue tag truthfulness — compares GH issue labels
  against DB state for all managed repos
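The Section 8 comparison might look something like the sketch below. It assumes both sides have already been reduced to `<issue_number> <status>` pairs, one per line; how DB task states map onto `status:*` label names is not stated here, so that mapping is left to the caller. The function name `report_label_drift` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: input format and function name are assumptions.

# $1: file of "<issue_number> <status>" pairs derived from the task DB.
# $2: file of "<issue_number> <status>" pairs derived from status:* labels.
# Prints one line per issue whose label disagrees with the DB state.
report_label_drift() {
	awk '
		# First file: remember the DB state per issue number.
		FILENAME == ARGV[1] { db[$1] = $2; next }
		# Second file: flag issues known to the DB whose label differs.
		($1 in db) && db[$1] != $2 {
			printf "DRIFT issue #%s: label=%s db=%s\n", $1, $2, db[$1]
		}
	' "$1" "$2"
}
```

The label side could plausibly be produced with the GitHub CLI, for example `gh issue list --repo OWNER/REPO --json number,labels` followed by a filter that keeps only names beginning with `status:`. `OWNER/REPO` is a placeholder, and the exact query the supervisor uses is not shown in this PR.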

Update sanity prompt to instruct AI about new sections 6-8 with
appropriate action recommendations.

Add Phase 3 throughput metrics (last eval/actioned, zero-eval rate) to
build_health_context() so the AI reasoner has visibility into whether
the lifecycle phase is actually working.

@marcusquinn marcusquinn merged commit 3a2e814 into main Feb 25, 2026
4 of 5 checks passed
@marcusquinn marcusquinn deleted the feature/supervisor-self-diagnosis branch February 25, 2026 03:46
coderabbitai bot commented Feb 25, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bb85ed2 and 331c521.

📒 Files selected for processing (2)
  • .agents/scripts/supervisor/ai-context.sh
  • .agents/scripts/supervisor/sanity-check.sh

Walkthrough

Two shell scripts in the supervisor automation toolkit were enhanced with expanded diagnostics. The ai-context.sh script now extracts Phase 3 throughput metrics from supervisor logs, while sanity-check.sh was significantly restructured to include pipeline health validation, database schema verification, and cross-repository GitHub issue tag consistency checks.
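One of the pipeline health signals, the zero-eval streak, can be illustrated with a short sketch. It counts how many of the most recent lifecycle entries in a row evaluated zero tasks. As before, the log line shape (`ai-lifecycle ... evaluated N`) and the function name `zero_eval_streak` are assumptions, not the PR's code.

```shell
#!/usr/bin/env bash
# Sketch only: the log line format is assumed.

# Reads supervisor log text on stdin and prints the number of consecutive
# zero-evaluation lifecycle entries at the end of the log (0 if none).
zero_eval_streak() {
	grep -E 'ai-lifecycle.*evaluated [0-9]+' |
		sed -E 's/.*evaluated ([0-9]+).*/\1/' |
		awk '
			# Any non-zero evaluation resets the running streak.
			{ if ($1 == 0) streak++; else streak = 0 }
			END { print streak + 0 }'
}
```

A long streak is the pattern that went unnoticed before this change: each individual run reporting zero evaluated tasks looks harmless, while many in a row suggests the phase itself is failing.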

Changes

| Cohort / File(s) | Summary |
|---|---|
| Phase 3 Metrics Extraction: .agents/scripts/supervisor/ai-context.sh | Added block to parse supervisor.log for "ai-lifecycle...evaluated...actioned" entries, extracting last evaluation and actioned timestamps, computing zero-eval rate from recent lifecycle entries, with graceful handling when log file is absent. |
| Diagnostic Pipeline Refactoring: .agents/scripts/supervisor/sanity-check.sh | Replaced Identity context block with multi-section validation scaffold: Pipeline Phase Health (Phase 3 ai-lifecycle metrics, stall/underutilization warnings), Schema Validation (checks DB task schema columns), and Cross-Repo Issue Tag Truthfulness (validates GitHub label drift). Reordered sections and expanded error handling for missing configuration data. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🔍 Logs now whisper their secrets true,
Phase 3 metrics emerging into view,
Schema checked, GitHub labels aligned,
Pipeline health metrics, clearly defined,
Diagnostics sharp—no drift remains behind! ✨

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 56 code smells

[INFO] Recent monitoring activity:
Wed Feb 25 03:46:41 UTC 2026: Code review monitoring started
Wed Feb 25 03:46:42 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 56

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 56
  • VULNERABILITIES: 0

Generated on: Wed Feb 25 03:46:44 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring
