t1259: Fix stale evaluating recovery pattern with pre/post-eval heartbeats#1968
t1259: Fix stale evaluating recovery pattern with pre/post-eval heartbeats#1968marcusquinn merged 2 commits intomainfrom
Conversation
…artbeats (t1259)
Root cause: Phase 0.7 was firing on tasks that completed successfully via the
fast-path (evaluate_worker() returns complete: without calling evaluate_with_ai()).
These tasks had no heartbeat protection beyond the cmd_transition('evaluating')
timestamp. If the pulse was killed between evaluate_worker() returning and
cmd_transition('complete'), the task stayed in evaluating with a stale updated_at.
After 240s (heartbeat_window), Phase 0.7 fired and routed to pr_review unnecessarily.
Three fixes:
1. Pre-evaluation heartbeat: refresh updated_at immediately before evaluate_worker()
so the 240s heartbeat window is anchored to evaluation start, not the earlier
cmd_transition('evaluating') call.
2. Post-evaluation heartbeat: refresh updated_at after evaluate_worker() returns
complete:* to extend the window through the quality gate and cmd_transition call.
3. Increase fast-path grace from 10s to 30s: evaluate_worker() can take 10-30s for
PR discovery via GitHub API. The 10s grace caused false positives when the task
was actively being evaluated but updated_at was 10-30s old.
The heartbeat check in _diagnose_stale_root_cause() (240s window) remains the
primary protection; these changes reduce the frequency of unnecessary recovery
cycles and improve dispatch latency.
WalkthroughThis change updates Changes
Sequence Diagram(s)sequenceDiagram
participant Supervisor
participant Database
participant Evaluator as evaluate_worker()
Supervisor->>Database: _update_task_heartbeat() (set updated_at = now)
Supervisor->>Evaluator: start evaluate_worker()
Evaluator->>Supervisor: running / progress updates (optional)
Evaluator-->>Supervisor: outcome = "complete"
Supervisor->>Database: _update_task_heartbeat() (set updated_at = now)
Supervisor->>Supervisor: apply fast-path grace (30s) / recovery checks
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @marcusquinn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves an issue where tasks completing via the fast-path were erroneously flagged for recovery due to an outdated heartbeat mechanism. By introducing explicit pre- and post-evaluation heartbeats, the system now accurately reflects the active evaluation state, preventing unnecessary recovery cycles. Additionally, the fast-path grace period has been extended to accommodate the time required for PR discovery, further reducing false positives and improving overall dispatch latency. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Thu Feb 19 14:51:54 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Code Review
This pull request effectively addresses a race condition where a recovery mechanism was incorrectly triggered for successfully completed tasks. The introduction of pre- and post-evaluation heartbeats, along with increasing the fast-path grace period, are solid fixes. The code is well-commented, explaining the rationale behind the changes. My suggestion to refactor a small piece of duplicated code to improve maintainability, as noted in my comment, remains valid and aligns with best practices for shell scripts.
Extract duplicate heartbeat SQL into _update_task_heartbeat() helper to satisfy DRY principle flagged by Gemini review on PR #1968. Both pre- and post-evaluation heartbeat calls now use the shared helper.
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Thu Feb 19 15:58:52 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.agents/scripts/supervisor/pulse.sh (1)
527-530: Add a defensive empty-id guard to_update_task_heartbeat().If
task_idis empty, the query executesWHERE id = ''— a harmless no-op since no task has an empty primary key, but it still incurs an unnecessary DB round-trip. A one-line guard closes the gap and makes the precondition explicit.🛡️ Proposed defensive guard
_update_task_heartbeat() { local task_id="$1" + [[ -z "$task_id" ]] && return 0 db "$SUPERVISOR_DB" "UPDATE tasks SET updated_at = strftime('%Y-%m-%dT%H:%M:%SZ','now') WHERE id = '$(sql_escape "$task_id")';" 2>/dev/null || true }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/supervisor/pulse.sh around lines 527 - 530, The _update_task_heartbeat function should short-circuit when task_id is empty to avoid a needless DB round-trip; add a one-line guard at the top of _update_task_heartbeat that checks if task_id is empty (e.g. test for zero-length) and returns immediately if so, then proceed with the existing db call and sql_escape usage unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.agents/scripts/supervisor/pulse.sh:
- Around line 527-530: The _update_task_heartbeat function should short-circuit
when task_id is empty to avoid a needless DB round-trip; add a one-line guard at
the top of _update_task_heartbeat that checks if task_id is empty (e.g. test for
zero-length) and returns immediately if so, then proceed with the existing db
call and sql_escape usage unchanged.



Summary
Root cause: Phase 0.7 was firing on tasks that completed successfully via the fast-path (
evaluate_worker()returnscomplete:without callingevaluate_with_ai()). These tasks had no heartbeat protection beyond thecmd_transition('evaluating')timestamp. After 240s, Phase 0.7 fired and routed topr_reviewunnecessarily, adding an extra recovery cycle to every completed task.Three fixes applied:
updated_atimmediately beforeevaluate_worker()so the 240s heartbeat window is anchored to evaluation start, not the earliercmd_transition('evaluating')callupdated_atafterevaluate_worker()returnscomplete:*to extend the window through the quality gate andcmd_transitioncallevaluate_worker()can take 10-30s for PR discovery via GitHub API; the 10s grace caused false positives when the task was actively being evaluatedImpact
pr_reviewcycle_diagnose_stale_root_cause()(240s window) remains the primary protectionFiles Changed
.agents/scripts/supervisor/pulse.sh: Pre/post-eval heartbeats + increased fast-path grace periodRef #1967
Summary by CodeRabbit