fix(supervisor): evaluator misclassifies EXIT:0 as backend error + diagnostic subtask newline parsing #490

Merged
marcusquinn merged 1 commit into main from bugfix/supervisor-eval-and-diag-parsing on Feb 7, 2026

@marcusquinn
Owner

Summary

  • Bug 1: evaluate_worker() checked backend_error_count > 0 before the exit_code == 0 clean-exit path, so tasks that exited cleanly but had backend error strings in their log content (e.g., tasks documenting APIs or discussing error handling) were misclassified as retry:backend_infrastructure_error. Moved the backend error check inside the non-zero exit code guard (Tier 2 heuristics). Also changed the backend_error_count grep from a full-log search to a tail-only search with an anchored 503 pattern (consistent with the health check fix in PR #488, "fix(supervisor): health check 503 false positive on JSON timestamps").

  • Bug 2: create_diagnostic_subtask() embedded the raw log tail (containing newlines) into the task description stored in SQLite. When cmd_next returned this as tab-separated output parsed with while read, the embedded newlines split the record, so lines such as EXIT:0 and DIAGNOSTIC_CONTEXT_END were parsed as separate task rows with malformed IDs. Fixed by replacing newlines and tabs with spaces before storage.

  • Defense: Added a task ID validation guard in both pulse dispatch loops to skip malformed IDs (empty, or containing spaces, colons, or other non-alphanumeric characters).
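
To illustrate the ordering change described for Bug 1, here is a minimal sketch, not the actual evaluate_worker() code. The function name classify_worker, the "complete" and "failed" outcomes, the 20-line tail window, and the exact 503 pattern are all assumptions made for the example; only retry:backend_infrastructure_error comes from this PR.

```shell
# Hypothetical sketch: classify_worker is an illustrative name, not the
# project's evaluate_worker(). Arguments: exit code, path to the worker log.
classify_worker() {
    local exit_code="$1" log_file="$2"

    # Tier 1: a clean exit wins, whatever strings the log happens to contain.
    if [ "$exit_code" -eq 0 ]; then
        echo "complete"   # assumed outcome name
        return 0
    fi

    # Tier 2: heuristics run only for non-zero exits. Search the log tail
    # only, and match 503 only in an HTTP-status context (assumed pattern),
    # so a digit run such as the milliseconds in a timestamp is ignored.
    local backend_error_count
    backend_error_count=$(tail -n 20 "$log_file" 2>/dev/null \
        | grep -cE 'HTTP[^ ]* 503|status[=: ]+503|503 Service Unavailable' || true)

    if [ "${backend_error_count:-0}" -gt 0 ]; then
        echo "retry:backend_infrastructure_error"
    else
        echo "failed"     # assumed outcome name
    fi
}
```

With this ordering, a task whose log merely mentions "503 Service Unavailable" but ends with exit code 0 is treated as a clean exit; the same log with a non-zero exit code still triggers the retry classification.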

Testing

  • All 53 state machine tests pass
  • 55/56 batch quality tests pass (1 expected failure: deployed version differs from worktree)
  • ShellCheck zero violations
  • bash -n syntax check passes

Fixes bugs discovered during batch-20260207 supervisor run (memories: mem_20260207230227_00095302).

…agnostic subtask newline parsing

Bug 1: evaluate_worker() checked backend_error_count before exit_code==0,
causing tasks with clean exits but backend error strings in log content
to be classified as retry:backend_infrastructure_error. Moved backend
error check inside the non-zero exit code guard (Tier 2 heuristics).
Also moved backend_error_count grep to tail-only search with anchored
503 pattern (consistent with health check fix in PR #488).

Bug 2: create_diagnostic_subtask() embedded raw log tail (with newlines)
into the task description stored in SQLite. When cmd_next returned this
as tab-separated output, embedded newlines caused EXIT:0 and
DIAGNOSTIC_CONTEXT_END to be parsed as separate task IDs. Fixed by
replacing newlines/tabs with spaces in failure_context before storage.
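
The row-splitting effect in Bug 2 can be reproduced in isolation. The sketch below is illustrative only: flatten_for_tsv, count_rows, and the task ID t001 are hypothetical names, and the real fix operates on failure_context inside create_diagnostic_subtask().

```shell
# Hypothetical helper: flatten_for_tsv is an illustrative name.
# Replaces newlines, carriage returns, and tabs with spaces so the value
# fits in one field of a tab-separated, line-oriented record.
flatten_for_tsv() {
    printf '%s' "$1" | tr '\n\r\t' '   '
}

# Count how many rows a `while read` consumer sees for one task whose
# description is the given string (t001 is a made-up task ID).
count_rows() {
    local description="$1" rows=0 id desc
    while IFS=$'\t' read -r id desc; do
        rows=$((rows + 1))
    done < <(printf '%s\t%s\n' "t001" "$description")
    echo "$rows"
}

raw_context=$'worker log tail\nEXIT:0\nDIAGNOSTIC_CONTEXT_END'
count_rows "$raw_context"                       # prints 3: two rows are bogus
count_rows "$(flatten_for_tsv "$raw_context")"  # prints 1
```

Flattening before storage keeps the consumer simple: the while read loop can stay line-oriented because no stored field can contain a record or field separator.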

Defense: Added task ID validation guard in both pulse dispatch loops to
skip malformed IDs (empty, containing spaces/colons, non-alphanumeric).
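
A guard of this kind might look like the following sketch. The names is_valid_task_id and dispatch_valid are hypothetical, and the sketch assumes task IDs are purely alphanumeric; the character class would need widening if real IDs contain dots or hyphens.

```shell
# Hypothetical helper: is_valid_task_id is an illustrative name.
# Assumes task IDs are non-empty and purely alphanumeric.
is_valid_task_id() {
    [[ "$1" =~ ^[A-Za-z0-9]+$ ]]
}

# Dispatch loop sketch: reads tab-separated "id<TAB>description" rows on
# stdin and skips any row whose first field is not a plausible task ID.
dispatch_valid() {
    local id desc
    while IFS=$'\t' read -r id desc; do
        if ! is_valid_task_id "$id"; then
            echo "skip: malformed task id '${id}'" >&2
            continue
        fi
        echo "dispatch: ${id}"
    done
}
```

Fed the rows t001, EXIT:0, DIAGNOSTIC_CONTEXT_END, an empty line, and t002, the loop dispatches only t001 and t002 and reports the other three on stderr. The guard is a second line of defense: it limits the damage if some other code path stores unsanitized text.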
@github-actions

github-actions bot commented Feb 7, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 31 code smells

[INFO] Recent monitoring activity:
Sat Feb 7 23:31:57 UTC 2026: Code review monitoring started
Sat Feb 7 23:31:57 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 31
Sat Feb 7 23:31:57 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Feb 7 23:31:59 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 31
  • VULNERABILITIES: 0

Generated on: Sat Feb 7 23:32:00 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@marcusquinn marcusquinn merged commit 9567f01 into main Feb 7, 2026
11 checks passed