fix(supervisor): evaluator misclassifies EXIT:0 as backend error + diagnostic subtask newline parsing by marcusquinn · Pull Request #490 · marcusquinn/aidevops

marcusquinn · 2026-02-07T23:31:27Z

Summary

Bug 1: evaluate_worker() checked backend_error_count > 0 before the exit_code == 0 clean-exit path, so tasks that exited cleanly but had backend error strings in their log content (e.g., documenting APIs, discussing error handling) were misclassified as retry:backend_infrastructure_error. Moved the backend error check inside the non-zero exit code guard (Tier 2 heuristics). Also changed the backend_error_count grep from a full-log search to a tail-only search with an anchored 503 pattern (consistent with the health check fix in PR #488, fix(supervisor): health check 503 false positive on JSON timestamps).
Bug 2: create_diagnostic_subtask() embedded the raw log tail (containing newlines) into the task description stored in SQLite. When cmd_next returned this as tab-separated output parsed with while read, the embedded newlines caused lines such as EXIT:0 and DIAGNOSTIC_CONTEXT_END to be parsed as separate task rows with malformed IDs. Fixed by replacing newlines/tabs with spaces before storage.
Defense: Added task ID validation guard in both pulse dispatch loops to skip malformed IDs (empty, containing spaces/colons, non-alphanumeric characters).
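The evaluation order described in Bug 1 can be sketched roughly as follows. This is a minimal illustration, not the actual evaluate_worker(): the function name classify_worker, the "complete" and "failed" outcome strings, the 20-line tail window, and the grep pattern are all assumptions made for the example. Only retry:backend_infrastructure_error comes from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the check ordering; not the supervisor's real code.

# classify_worker EXIT_CODE LOG_FILE
# Prints one of: complete, retry:backend_infrastructure_error, failed.
classify_worker() {
    local exit_code="$1" log_file="$2"

    # Clean exit wins outright: log content is never inspected here, so
    # error strings quoted in the worker's own output cannot trigger a retry.
    if [[ "$exit_code" -eq 0 ]]; then
        echo "complete"
        return 0
    fi

    # Heuristics run only for non-zero exits, and only on the log tail.
    # The pattern is anchored to the line start so that "503" inside a
    # timestamp such as 23:31:27.503Z does not count (assumed pattern).
    local backend_error_count
    backend_error_count=$(tail -n 20 "$log_file" 2>/dev/null |
        grep -cE '^(HTTP )?503 |^Service Unavailable' || true)

    if [[ "$backend_error_count" -gt 0 ]]; then
        echo "retry:backend_infrastructure_error"
    else
        echo "failed"
    fi
}

# Usage: the same log is classified differently depending on the exit code.
log=$(mktemp)
printf 'Docs mention "503 Service Unavailable" handling\nHTTP 503 Service Unavailable\n' >"$log"
classify_worker 0 "$log"   # clean exit: log content is ignored
classify_worker 1 "$log"   # non-zero exit: tail heuristics apply
rm -f "$log"
```

Putting the exit-code check first means the log heuristics can only ever downgrade a run that has already failed, which is the property the fix relies on.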

Testing

All 53 state machine tests pass
55/56 batch quality tests pass (1 expected failure: deployed version differs from worktree)
ShellCheck zero violations
bash -n syntax check passes

Fixes bugs discovered during batch-20260207 supervisor run (memories: mem_20260207230227_00095302).

…agnostic subtask newline parsing

Bug 1: evaluate_worker() checked backend_error_count before exit_code==0, causing tasks with clean exits but backend error strings in log content to be classified as retry:backend_infrastructure_error. Moved backend error check inside the non-zero exit code guard (Tier 2 heuristics). Also moved backend_error_count grep to tail-only search with anchored 503 pattern (consistent with health check fix in PR #488).

Bug 2: create_diagnostic_subtask() embedded raw log tail (with newlines) into the task description stored in SQLite. When cmd_next returned this as tab-separated output, embedded newlines caused EXIT:0 and DIAGNOSTIC_CONTEXT_END to be parsed as separate task IDs. Fixed by replacing newlines/tabs with spaces in failure_context before storage.

Defense: Added task ID validation guard in both pulse dispatch loops to skip malformed IDs (empty, containing spaces/colons, non-alphanumeric).
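The newline problem in Bug 2 and the ID guard can be illustrated with a small sketch. The helper names flatten_context, is_valid_task_id, and list_dispatchable_ids are hypothetical, and the accepted character set (letters, digits, and dots, so that IDs such as t135.1 pass) is an assumption; the actual guard in the pulse dispatch loops may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the two fixes; names are not from the PR.

# flatten_context TEXT: replace newlines and tabs with spaces so the text
# occupies a single field in a one-row-per-line, tab-separated listing.
flatten_context() {
    printf '%s' "$1" | tr '\n\t' '  '
}

# is_valid_task_id ID: accept only non-empty IDs made of letters, digits,
# and dots (assumed character set); rejects spaces, colons, underscores.
is_valid_task_id() {
    [[ "$1" =~ ^[A-Za-z0-9.]+$ ]]
}

# Dispatch-style loop: read "id<TAB>description" rows from stdin and print
# the IDs that pass validation, silently skipping malformed rows.
list_dispatchable_ids() {
    local id desc
    while IFS=$'\t' read -r id desc; do
        is_valid_task_id "$id" || continue
        printf '%s\n' "$id"
    done
}

raw=$'worker log tail\nEXIT:0\nDIAGNOSTIC_CONTEXT_END'

# Unsanitized description: the embedded newlines become extra rows whose
# first field is not a real task ID; the guard skips them.
printf 't200\t%s\n' "$raw" | list_dispatchable_ids

# Sanitized description: the row stays on one line, so no stray rows appear.
printf 't200\t%s\n' "$(flatten_context "$raw")" | list_dispatchable_ids
```

Sanitizing at storage time removes the cause, while the ID guard limits the damage if a multi-line description reaches the listing by some other path.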

gemini-code-assist · 2026-02-07T23:31:31Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-02-07T23:31:35Z

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 10 seconds before requesting another review.

github-actions · 2026-02-07T23:32:01Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 31 code smells

[INFO] Recent monitoring activity:
Sat Feb 7 23:31:57 UTC 2026: Code review monitoring started
Sat Feb 7 23:31:57 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 31
Sat Feb 7 23:31:57 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 23:31:59 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 31
VULNERABILITIES: 0

Generated on: Sat Feb 7 23:32:00 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-07T23:32:36Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

marcusquinn merged commit 9567f01 into main Feb 7, 2026
11 checks passed

marcusquinn added a commit that referenced this pull request Feb 8, 2026

chore: mark t135.1, t135.2, t135.8 complete in TODO.md (PRs #490-#493)

6a07ac7

marcusquinn merged 1 commit into main from bugfix/supervisor-eval-and-diag-parsing
