fix(supervisor): detect gh auth failures to prevent clean_exit_no_signal retry waste (t198) by marcusquinn · Pull Request #924 · marcusquinn/aidevops

marcusquinn · 2026-02-10T13:19:19Z

Summary

Root cause analysis of 560 worker logs identified that 79% of clean_exit_no_signal cases are workers that completed work but couldn't push/create PR due to expired GitHub auth
These burned 3 retries each at Opus cost (~$15-30 per task) with identical failures
Added blocked:gh_auth_expired detection so auth failures block immediately instead of wasting retries

Root Cause Analysis

Category	Count	%	Status
Backend errors (short logs)	169	30%	Already handled
PR created, no signal	127	23%	Already handled (t192)
Success with signal	92	16%	Working correctly
No EXIT line (killed)	65	12%	Edge case
clean_exit_no_signal candidates	48	9%	This PR
Short logs (other)	24	4%	Edge case
Non-zero exit	22	4%	Already handled

Of the 48 clean_exit_no_signal candidates:

38 (79%): GitHub auth expired — worker completed work, couldn't push → NOW BLOCKED
7: Context exhaustion / short logs — legitimate retries
2: Task already done — handled by task_obsolete detection
1: Shell unavailable
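
The 79% share and the number of wasted retries follow directly from the counts above. A quick sanity check in shell integer arithmetic, as a sketch (the variable names are illustrative, not taken from the repository):

```shell
#!/usr/bin/env bash
# Counts taken from the breakdown above.
auth_failures=38
candidates=48
retries_per_task=3

# Integer percentage, rounded to nearest by adding half the divisor first.
pct=$(( (auth_failures * 100 + candidates / 2) / candidates ))

# Each auth-failure task was retried 3 times with the same outcome.
wasted_retries=$(( auth_failures * retries_per_task ))

echo "auth failures: ${pct}% of candidates, ${wasted_retries} wasted retries"
```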

Changes

extract_log_metadata(): Added gh_auth_failure detection scanning full log for specific gh auth patterns
evaluate_worker(): Added blocked:gh_auth_expired check before clean_exit_no_signal fallback
Tests: 4 new test cases in both test suites (all passing, zero regressions)
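
The ordering described above (auth check before the clean_exit_no_signal fallback) can be sketched as follows. This is a minimal illustration, not the code in supervisor-helper.sh: the function name `decide_verdict`, its arguments, and the `complete` and `retry:` verdict strings are assumptions made for the example; only `blocked:gh_auth_expired` and `clean_exit_no_signal` come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick a verdict from already-extracted metadata.
# Arguments: exit_code, has_signal ("true"/"false"), gh_auth_failure ("true"/"false").
decide_verdict() {
    local exit_code="$1" has_signal="$2" gh_auth_failure="$3"

    if [[ "$has_signal" == "true" ]]; then
        echo "complete"                      # placeholder verdict name
        return 0
    fi
    if [[ "$exit_code" -ne 0 ]]; then
        echo "retry:nonzero_exit"            # placeholder verdict name
        return 0
    fi
    # A retry cannot repair expired credentials, so block before the fallback.
    if [[ "$gh_auth_failure" == "true" ]]; then
        echo "blocked:gh_auth_expired"
        return 0
    fi
    echo "retry:clean_exit_no_signal"        # "retry:" prefix is an assumption
}

decide_verdict 0 false true
decide_verdict 0 false false
```

Placing the auth check after the signal and exit-code checks keeps it scoped to the clean-exit, no-signal case that the analysis identified.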

Testing

ShellCheck: zero new violations
test-supervisor-state-machine.sh: 4 new tests pass, 13 pre-existing failures unchanged
test-dispatch-worktree-evaluate.sh: 1 new test passes, 12 pre-existing failures unchanged

Closes #817

Summary by CodeRabbit

Release Notes

New Features
- Added detection for GitHub authentication failures in task logs
- Task evaluation now recognizes expired or invalid GitHub credentials and prevents unnecessary retries
Tests
- Added test coverage for GitHub authentication failure detection and blocking behavior

…nal retry waste (t198)

Root cause analysis of 560 worker logs found 79% (38/48) of clean_exit_no_signal cases are workers that completed work but couldn't push/create PR due to expired GitHub auth. These burned 3 retries each at Opus cost with identical failures.

Changes:
- extract_log_metadata(): add gh_auth_failure detection scanning full log for specific gh auth patterns (safe from false positives; patterns are specific)
- evaluate_worker(): add blocked:gh_auth_expired check before clean_exit_no_signal fallback, so auth failures block immediately instead of wasting retries
- Tests: 4 new test cases covering gh auth failure detection in both test suites

Log analysis breakdown (560 total logs):
- 169 backend errors (short logs, already handled)
- 127 PR created but no signal (already handled by t192)
- 92 success with signal
- 57 clean_exit_no_signal candidates (48 after filtering)
  - 38 gh auth failures (NOW BLOCKED instead of retried)
  - 2 task already done (handled by task_obsolete detection)
  - 7 context exhaustion / short logs (legitimate retries)
  - 1 shell unavailable

coderabbitai · 2026-02-10T13:19:37Z

Walkthrough

This PR adds GitHub authentication failure detection to supervisor task evaluation logic. The extract_log_metadata function now scans task logs for authentication-related error phrases and exposes a gh_auth_failure flag. The evaluate_worker function uses this flag to immediately return blocked:gh_auth_expired verdict when authentication is invalid, preventing unnecessary retries. Supporting integration and state machine tests validate the detection and blocking behavior.
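
The diff in this PR shows extract_log_metadata emitting the flag as a `gh_auth_failure=<value>` line. How the caller reads it back is not shown here, so the following is only a sketch of one way to consume key=value output; `emit_metadata`, `get_meta`, and the `exit_code` key are hypothetical names used for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the metadata emitter: prints key=value lines.
emit_metadata() {
    echo "exit_code=0"
    echo "gh_auth_failure=true"
}

# Read key=value lines on stdin and print the value of the requested key.
# Returns non-zero if the key is absent.
get_meta() {
    local key="$1" line
    while IFS= read -r line; do
        if [[ "$line" == "$key="* ]]; then
            echo "${line#*=}"   # strip everything up to the first "="
            return 0
        fi
    done
    return 1
}

meta="$(emit_metadata)"
flag="$(get_meta gh_auth_failure <<<"$meta")"
echo "gh_auth_failure is: $flag"
```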

Changes

Cohort / File(s)	Summary
Supervisor Helper Script `.agents/scripts/supervisor-helper.sh`	Added GitHub authentication failure detection in `extract_log_metadata` to scan logs for auth-related phrases and expose `gh_auth_failure` metadata flag. Modified `evaluate_worker` to check this flag and return `blocked:gh_auth_expired` verdict when authentication fails, short-circuiting further evaluation and preventing task retries.
Dispatch Worktree Test `tests/test-dispatch-worktree-evaluate.sh`	Added integration test section simulating worker completion with expired GitHub authentication. Test validates that evaluation produces `blocked:gh_auth_expired` verdict and includes assertion logic for auth failure detection.
Supervisor State Machine Test `tests/test-supervisor-state-machine.sh`	Added two test scenarios (t198e, t198f) to verify behavior when GitHub authentication fails during task evaluation. Both test cases assert that authentication failures trigger `blocked:gh_auth_expired` verdict rather than retry logic.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

t198: Investigate clean_exit_no_signal retry pattern — root cause analysis #817: This PR directly addresses the root cause analysis for the clean_exit_no_signal retry pattern by implementing GitHub authentication failure detection and blocking, preventing spurious retries when tasks fail due to expired credentials rather than legitimate worker issues.

Possibly related PRs

feat(supervisor): eager orphaned PR scan after worker evaluation (t216) #917: Modifies evaluation flow in supervisor-helper.sh by integrating an orphan-PR scan immediately after evaluate_worker, complementary to this PR's auth-failure detection logic.
fix: validate PR title/branch contains task ID before attribution (t195) #826: Also modifies evaluate_worker in supervisor-helper.sh to add PR-ownership validation, touching the same evaluation decision logic.
fix: supervisor self-healing -- macOS timeout, PR detection, model names, stale PID cleanup #429: Extends evaluate_worker with PR-URL/branch lookup functionality, sharing the same evaluation decision surface as this PR's auth-failure detection.

Poem

🔐 A shield guards the gates, where credentials once fell,
Auth whispers expose their expired tales to tell,
No retry shall follow what GitHub denied,
Blocked verdicts rise swift—let stale tokens subside. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: detecting GitHub auth failures to prevent wasteful retries in the clean_exit_no_signal pattern.
Linked Issues check	✅ Passed	All coding requirements from issue `#817` are met: gh_auth_failure detection added to extract_log_metadata(), blocked:gh_auth_expired check added to evaluate_worker(), and comprehensive test coverage provided.
Out of Scope Changes check	✅ Passed	All changes directly address issue `#817` objectives; no out-of-scope modifications detected across supervisor helper, dispatch tests, and state-machine tests.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


gemini-code-assist · 2026-02-10T13:19:41Z

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves resource efficiency by addressing a common cause of wasted computational cycles: tasks failing due to expired GitHub authentication. By implementing proactive detection of these authentication issues, the system now immediately blocks such tasks, preventing unnecessary retries and associated costs. This change allows for more efficient resolution of credential problems and optimizes overall task processing.

Highlights

GitHub Auth Failure Detection: Implemented logic to detect GitHub authentication failures in worker logs by scanning for specific patterns.
Prevented Wasted Retries: Introduced a mechanism to immediately block tasks with GitHub authentication failures, preventing costly and repetitive retries.
New Test Cases: Added four new test cases across two test suites to validate the new detection and blocking logic for GitHub authentication issues.

Changelog

.agents/scripts/supervisor-helper.sh
- Added gh_auth_failure variable and grep logic within extract_log_metadata() to identify GitHub authentication failure patterns in worker logs.
- Introduced a conditional check in evaluate_worker() to set the task status to blocked:gh_auth_expired when a clean exit occurs with no signal but GitHub authentication failure is detected.
tests/test-dispatch-worktree-evaluate.sh
- Added a new integration test case (integ-t198b) to simulate a worker completing its work but failing to push due to expired GitHub authentication, verifying the blocked:gh_auth_expired status.
tests/test-supervisor-state-machine.sh
- Added two new test cases (eval-t198e and eval-t198f) to confirm that tasks exhibiting various GitHub authentication failure messages are correctly identified and blocked as blocked:gh_auth_expired.

Activity

No human activity (comments, reviews) has been recorded on this pull request yet.


github-actions · 2026-02-10T13:19:57Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 34 code smells

[INFO] Recent monitoring activity:
Tue Feb 10 13:19:52 UTC 2026: Code review monitoring started
Tue Feb 10 13:19:53 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 34

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 34
VULNERABILITIES: 0

Generated on: Tue Feb 10 13:19:56 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-10T13:20:30Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @.agents/scripts/supervisor-helper.sh:
- Around line 4769-4780: The current full-log grep that sets
gh_auth_failure=true is too broad and matches generic phrases; update the grep
in the block that defines gh_auth_failure to restrict matches to explicit auth
failure lines only (references: variable gh_auth_failure, the grep invocation
using "$log_file"): match patterns like '^.*gh auth (status|login).*fail',
'(^|[[:space:]])gh:', 'fatal: Authentication failed', 'remote: Permission denied',
and 'authentication token.*expired' (case-insensitive), and remove or
de-prioritize generic phrases such as "not logged in" or "failed to
authenticate" unless they appear in those auth-specific contexts; keep the
2>/dev/null and echo "gh_auth_failure=$gh_auth_failure" behavior.

coderabbitai · 2026-02-10T13:22:34Z

.agents/scripts/supervisor-helper.sh

+    # GitHub auth failure detection (t198): workers that complete their work but
+    # can't push/create a PR because gh auth is expired. This is the #1 cause of
+    # clean_exit_no_signal (79% of cases in production logs). Without this check,
+    # the supervisor retries them 3x at Opus cost, each failing identically.
+    # Search the FULL log (not just tail) because auth failures appear in tool
+    # output mid-log, not at the end. This is safe because gh auth patterns are
+    # specific enough to avoid false positives from documentation content.
+    local gh_auth_failure="false"
+    if grep -qiE 'gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate' "$log_file" 2>/dev/null; then
+        gh_auth_failure="true"
+    fi
+    echo "gh_auth_failure=$gh_auth_failure"


⚠️ Potential issue | 🟠 Major

Tighten auth-failure regex to avoid false positives in full-log scan.

Generic phrases like “not logged in” can appear in docs or code samples within logs, which would incorrectly block a task. Scope matches to gh/git auth error lines (e.g., gh auth, gh:, fatal: Authentication failed, remote: Permission denied) before setting gh_auth_failure=true.

Suggested refinement

-    if grep -qiE 'gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate' "$log_file" 2>/dev/null; then
+    if grep -qiE '(gh auth (status|login).*(fail|not logged in|authenticate)|gh:.*(not logged in|authenticate)|GitHub CLI.*(not logged in|authenticate)|fatal:.*authentication failed|remote:.*(authentication|permission) denied)' "$log_file" 2>/dev/null; then
         gh_auth_failure="true"
     fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

     # GitHub auth failure detection (t198): workers that complete their work but
     # can't push/create a PR because gh auth is expired. This is the #1 cause of
     # clean_exit_no_signal (79% of cases in production logs). Without this check,
     # the supervisor retries them 3x at Opus cost, each failing identically.
     # Search the FULL log (not just tail) because auth failures appear in tool
     # output mid-log, not at the end. This is safe because gh auth patterns are
     # specific enough to avoid false positives from documentation content.
     local gh_auth_failure="false"
-    if grep -qiE 'gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate' "$log_file" 2>/dev/null; then
+    if grep -qiE '(gh auth (status|login).*(fail|not logged in|authenticate)|gh:.*(not logged in|authenticate)|GitHub CLI.*(not logged in|authenticate)|fatal:.*authentication failed|remote:.*(authentication|permission) denied)' "$log_file" 2>/dev/null; then
         gh_auth_failure="true"
     fi
     echo "gh_auth_failure=$gh_auth_failure"
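
To make the false-positive concern concrete, the sketch below runs both patterns from this review against two log lines. The patterns are copied from the diff and the suggestion above; the two sample lines and the `matches` helper are invented for illustration and are not taken from real worker logs.

```shell
#!/usr/bin/env bash
# Pattern from the PR as submitted.
broad='gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate'
# Pattern from the suggested refinement.
tight='(gh auth (status|login).*(fail|not logged in|authenticate)|gh:.*(not logged in|authenticate)|GitHub CLI.*(not logged in|authenticate)|fatal:.*authentication failed|remote:.*(authentication|permission) denied)'

# Hypothetical log lines, invented for this example.
doc_line='README: users who are not logged in will see a sign-in page'
err_line='fatal: Authentication failed for https://github.com/example/repo.git'

# Succeeds when the line matches the pattern (case-insensitive ERE).
matches() { grep -qiE "$1" <<<"$2"; }

for line in "$doc_line" "$err_line"; do
    if matches "$broad" "$line"; then b=yes; else b=no; fi
    if matches "$tight" "$line"; then t=yes; else t=no; fi
    echo "broad=$b tight=$t :: $line"
done
```

With these sample lines, the submitted pattern flags the documentation sentence but not the git error, while the refinement does the opposite. Note that the refinement also drops the `authentication token.*expired`, `try authenticating`, and `non-interactive context` alternatives, so it would be worth checking it against the 38 real auth-failure logs before adopting it.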


gemini-code-assist

Code Review

This pull request effectively addresses a significant source of wasted retries by detecting GitHub authentication failures early and blocking the task. The logic is sound and the implementation is clean. The addition of new tests to cover this case is also a great improvement. I have one minor suggestion to better align with the repository's shell scripting style guide.

gemini-code-assist · 2026-02-10T13:23:05Z

.agents/scripts/supervisor-helper.sh

+    if grep -qiE 'gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate' "$log_file" 2>/dev/null; then
+        gh_auth_failure="true"
+    fi


This use of 2>/dev/null to suppress grep's error output violates the repository's style guide (line 50), which states that 2>/dev/null is only acceptable when redirecting to log files. A safer and more idiomatic way to handle cases where the log file might not exist is to check for the file's existence before calling grep.

Suggested change

-    if grep -qiE 'gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate' "$log_file" 2>/dev/null; then
+    if [[ -f "$log_file" ]] && grep -qiE 'gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate' "$log_file"; then
         gh_auth_failure="true"
     fi

References

The style guide at line 50 bans blanket suppression of stderr with 2>/dev/null, allowing it only when redirecting to log files. The current code uses it to hide potential 'file not found' errors from grep.
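
As a sketch of how the guarded form behaves, the example below wraps the suggested check in a small function and calls it on an existing log and on a missing path. The function name `detect_gh_auth_failure` and the log content are assumptions for illustration; the pattern is the one from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the guarded check suggested above.
detect_gh_auth_failure() {
    local log_file="$1"
    local pattern='gh auth (status|login).*fail|authentication token.*expired|not logged in|try authenticating|gh: To use .* in a non-interactive context|could not authenticate|failed to authenticate'
    # The -f test short-circuits, so grep never runs on a missing file
    # and there is no "No such file" message to suppress.
    if [[ -f "$log_file" ]] && grep -qiE "$pattern" "$log_file"; then
        echo "gh_auth_failure=true"
    else
        echo "gh_auth_failure=false"
    fi
}

tmp_log="$(mktemp)"
trap 'rm -f "$tmp_log"' EXIT
# Invented log content for illustration only.
printf '%s\n' 'step 1 ok' 'error: failed to authenticate with GitHub' >"$tmp_log"

detect_gh_auth_failure "$tmp_log"
detect_gh_auth_failure "$tmp_log.missing"
```

One trade-off of dropping `2>/dev/null`: other grep errors, such as an unreadable file, would now surface on stderr, which is arguably what the style guide intends.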

marcusquinn · 2026-02-10T13:36:26Z

Closing: t198 was already completed in PR #834 (merged 2026-02-10). This PR was created by a re-dispatch of an already-completed task.

coderabbitai bot requested changes Feb 10, 2026

gemini-code-assist bot reviewed Feb 10, 2026

marcusquinn closed this Feb 10, 2026

marcusquinn wants to merge 1 commit into main from feature/t198
