fix: resolve clean_exit_no_signal false retry loop (t198) #834
marcusquinn merged 1 commit into main
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Tue Feb 10 00:02:58 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Root cause 1: Backend error detection used log_lines, which includes the REPROMPT METADATA headers (8 lines). Retry logs with backend errors had 12 total lines, exceeding the < 10 threshold. Fix: add a content_lines metadata field that excludes metadata headers.

Root cause 2: Workers that determine a task is already done or obsolete exit cleanly (EXIT:0) with no signal and no PR. The supervisor had no way to distinguish this from 'worker ran out of context'. Fix: detect 'already done' / 'no changes needed' language in the worker's final text output and classify as complete:task_obsolete.

Evidence: t135.3/t135.4/t135.5 all showed repeated clean_exit_no_signal retries. Retry logs contained 'All Antigravity endpoints failed' (a backend error) but were misclassified. Initial runs showed workers saying 'TASK ALREADY DONE' but being retried.
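The line-count mismatch behind root cause 1 can be sketched roughly as follows. This is an illustration only, not the code in supervisor-helper.sh: the function name count_content_lines, the marker string matched on the first line, and the assumption that the header is one contiguous block at the top of the log are all hypothetical. The default of 8 header lines is taken from the commit message above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: count log lines while skipping a metadata header.
# Assumptions (not confirmed by the PR): the header starts on line 1 with
# a line containing "REPROMPT METADATA" and spans a fixed number of lines.

count_content_lines() {
    local log_file="$1"
    local header_lines="${2:-8}"   # 8 comes from the commit message
    local total

    # Total newline-terminated lines in the log.
    total=$(wc -l < "$log_file" | tr -d ' ')

    # Subtract the header only when the log actually begins with one.
    if head -n 1 "$log_file" | grep -q 'REPROMPT METADATA'; then
        total=$(( total - header_lines ))
        if [ "$total" -lt 0 ]; then
            total=0
        fi
    fi

    echo "$total"
}
```

For a 12-line retry log whose first 8 lines are the header, this prints 4, which passes a < 10 check that the raw count of 12 would fail.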
be07048 to 1f1bee3
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Tue Feb 10 01:34:34 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring



Summary
Fixes the clean_exit_no_signal false retry loop that wasted retry attempts on tasks that either hit backend errors during retries or were already done/obsolete.

Root Causes Found
1. Backend error detection threshold too strict
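- evaluate_worker() used log_lines < 10 to detect short backend-error-only logs.
- log_lines includes the REPROMPT METADATA headers (8 lines), so retry logs with backend errors had 12 total lines, failed that check, and were classified as clean_exit_no_signal.
- Fix: added a content_lines metadata field that excludes REPROMPT METADATA headers.

2. No detection path for "task already done"
- Workers that determine a task is already done or obsolete exit cleanly (EXIT:0) with no signal and no PR, which the supervisor could not distinguish from a worker that ran out of context.
- Fix: added task_obsolete detection in extract_log_metadata that checks the worker's final text output for explicit "already done" / "no changes needed" language, classified as complete:task_obsolete.

Evidence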
From production supervisor logs:
Changes
- .agents/scripts/supervisor-helper.sh: Added content_lines and task_obsolete metadata fields in extract_log_metadata(), updated evaluate_worker() to use content_lines for the backend error threshold and to detect task obsolescence before falling through to clean_exit_no_signal
- tests/test-supervisor-state-machine.sh: 4 new tests for t198 scenarios
- tests/test-dispatch-worktree-evaluate.sh: 1 new integration test for backend error with REPROMPT METADATA

Test Results
All 4 core tests pass:
- retry:backend_quota_error (was clean_exit_no_signal)
- complete:task_obsolete (was retry:clean_exit_no_signal)
- complete:task_obsolete (was retry:clean_exit_no_signal)
- retry:clean_exit_no_signal (unchanged, regression safe)
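
The decision order these results exercise might look something like the sketch below. It is a simplified stand-in for evaluate_worker(), not its real implementation: the function name classify_clean_exit, its three arguments, and the 'endpoints failed' match pattern are assumptions made for illustration. The outcome strings, the < 10 threshold, and the "already done" / "no changes needed" phrases come from the PR text.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of classifying a clean exit (EXIT:0) that produced
# no signal and no PR. Names and arguments are illustrative assumptions.

classify_clean_exit() {
    local content_lines="$1"   # log lines excluding metadata headers
    local log_text="$2"        # body of the worker log
    local final_text="$3"      # worker's final text output

    # 1. A short log that contains a backend error is a backend failure,
    #    not a silent worker exit.
    if [ "$content_lines" -lt 10 ] &&
        printf '%s\n' "$log_text" | grep -qi 'endpoints failed'; then
        echo "retry:backend_quota_error"
        return 0
    fi

    # 2. The worker explicitly reported that nothing needs doing.
    if printf '%s\n' "$final_text" |
        grep -qiE 'already done|no changes needed'; then
        echo "complete:task_obsolete"
        return 0
    fi

    # 3. Otherwise keep the previous behaviour and retry.
    echo "retry:clean_exit_no_signal"
}
```

Checking the obsolescence phrases before the fallthrough is what keeps a worker that says 'TASK ALREADY DONE' from being retried, while logs with neither a backend error nor such a phrase still end up as retry:clean_exit_no_signal.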