-
Notifications
You must be signed in to change notification settings - Fork 5
feat: prevent false task completion cascade (t163) #620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- AGENTS.md + plans.md: add task completion rules (never mark [x] unless you wrote the code, marking [x] triggers issue auto-close cascade) - issue-sync-helper.sh: cmd_close() now requires verified: field to close issues, --force flag to override (prevents false closure cascade) - supervisor-helper.sh: add Phase 6 verify step - checks PR is merged before marking deliverable as verified - code-quality.yml: add TODO.md validation step on PRs - warns when tasks marked [x] without verified: field Addresses root cause of 38 issues being falsely auto-closed (PR #616).
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe PR implements a comprehensive task completion validation framework spanning documentation, shell automation, and CI/CD. Changes establish explicit task completion rules with verification requirements, introduce a Changes
Sequence Diagram(s)sequenceDiagram
participant Worker as Worker/User
participant TODO as TODO.md
participant GitHub as GitHub Actions
participant IssueSync as issue-sync-helper
participant Supervisor as supervisor-helper
participant GH_API as GitHub API
Worker->>TODO: Mark task complete (check ☑)
Worker->>TODO: Add verified: timestamp
Worker->>GitHub: Push commit
GitHub->>GitHub: Trigger code-quality workflow
GitHub->>TODO: Validate TODO.md diff
GitHub->>GitHub: Check verified: field present
alt Missing verified: field
GitHub->>GitHub: Emit warning (non-blocking)
end
GitHub->>IssueSync: PR merged, trigger sync
IssueSync->>TODO: Check task status
alt No verified: date (without --force)
IssueSync->>IssueSync: Skip task, count skipped
else verified: date present or --force
IssueSync->>GitHub: Close/sync issue
end
Supervisor->>Supervisor: Run Phase 6 (Verify Deliverables)
Supervisor->>TODO: Find deployed/merged tasks
Supervisor->>GH_API: Check PR merge status
alt PR is MERGED
Supervisor->>TODO: Append verified timestamp to notes
Supervisor->>Supervisor: Increment verified counter
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @alex-solovyev, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces robust mechanisms to prevent the erroneous auto-closure of GitHub issues by implementing stricter verification steps for task completion. It updates documentation to guide agents on proper task marking, modifies the issue synchronization script to enforce explicit verification, and enhances the supervisor to automatically confirm deliverables based on PR merge status. These changes aim to improve the reliability of the task management system and avoid cascades of false completions. Highlights
Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sun Feb 8 14:32:58 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces important safeguards to prevent tasks from being closed prematurely, addressing a key issue of false completion cascades. The changes are well-structured, with new documentation in AGENTS.md and plans.md to guide users, a verified: field requirement in issue-sync-helper.sh, and a new verification phase in the supervisor that checks PR merge status.
My review includes one suggestion for supervisor-helper.sh to improve the efficiency and clarity of the database queries in the new verification logic. Overall, this is a solid improvement to the workflow's reliability.
.agents/scripts/supervisor-helper.sh
Outdated
| merged_unverified=$(db "$SUPERVISOR_DB" "SELECT id FROM tasks WHERE status = 'deployed' AND id NOT IN (SELECT id FROM tasks WHERE notes LIKE '%verified:%');" 2>/dev/null || echo "") | ||
| if [[ -n "$merged_unverified" ]]; then | ||
| while IFS= read -r verify_tid; do | ||
| [[ -n "$verify_tid" ]] || continue | ||
| local verify_pr_url | ||
| verify_pr_url=$(db "$SUPERVISOR_DB" "SELECT pr_url FROM tasks WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || echo "") | ||
| local verify_worktree | ||
| verify_worktree=$(db "$SUPERVISOR_DB" "SELECT worktree FROM tasks WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || echo "") | ||
|
|
||
| # Check PR is actually merged | ||
| local pr_merged="false" | ||
| if [[ -n "$verify_pr_url" ]]; then | ||
| local pr_number | ||
| pr_number=$(echo "$verify_pr_url" | grep -oE '[0-9]+$' || echo "") | ||
| if [[ -n "$pr_number" ]]; then | ||
| local pr_state | ||
| pr_state=$(gh pr view "$pr_number" --json state --jq '.state' 2>/dev/null || echo "") | ||
| if [[ "$pr_state" == "MERGED" ]]; then | ||
| pr_merged="true" | ||
| fi | ||
| fi | ||
| fi | ||
|
|
||
| if [[ "$pr_merged" == "true" ]]; then | ||
| # PR is merged — deliverable verified | ||
| db "$SUPERVISOR_DB" "UPDATE tasks SET notes = COALESCE(notes, '') || ' verified:$(date +%Y-%m-%d)' WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || true | ||
| log_info " Verified: $verify_tid (PR merged)" | ||
| verified_count=$((verified_count + 1)) | ||
| else | ||
| log_verbose " Unverified: $verify_tid (PR not merged or not found)" | ||
| fi | ||
| done <<< "$merged_unverified" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation for verifying deliverables has a few areas for improvement:
- Inefficient Database Access: It fetches task IDs first and then, inside a loop, fetches the
pr_urlfor each task individually. This results in N+1 queries where N is the number of unverified tasks. - Unused Variable: The
verify_worktreevariable is fetched from the database on line 4867 but is never used. - Query Clarity: The
id NOT IN (SELECT ...)subquery can be less clear and potentially less performant than a directORcondition, especially concerningNULLvalues in thenotescolumn.
I suggest refactoring this block to fetch all necessary data in a single, more efficient query. This also removes the unused variable and improves the query's readability.
merged_unverified=$(db "$SUPERVISOR_DB" "SELECT id, pr_url FROM tasks WHERE status = 'deployed' AND (notes IS NULL OR notes NOT LIKE '%verified:%');" -separator $'\t' 2>/dev/null || echo "")
if [[ -n "$merged_unverified" ]]; then
while IFS=$'\t' read -r verify_tid verify_pr_url; do
[[ -n "$verify_tid" ]] || continue
# Check PR is actually merged
local pr_merged="false"
if [[ -n "$verify_pr_url" ]]; then
local pr_number
pr_number=$(echo "$verify_pr_url" | grep -oE '[0-9]+$' || echo "")
if [[ -n "$pr_number" ]]; then
local pr_state
pr_state=$(gh pr view "$pr_number" --json state --jq '.state' 2>/dev/null || echo "")
if [[ "$pr_state" == "MERGED" ]]; then
pr_merged="true"
fi
fi
fi
if [[ "$pr_merged" == "true" ]]; then
# PR is merged — deliverable verified
db "$SUPERVISOR_DB" "UPDATE tasks SET notes = COALESCE(notes, '') || ' verified:$(date +%Y-%m-%d)' WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || true
log_info " Verified: $verify_tid (PR merged)"
verified_count=$((verified_count + 1))
else
log_verbose " Unverified: $verify_tid (PR not merged or not found)"
fi
done <<< "$merged_unverified"
fiThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In @.agents/AGENTS.md:
- Around line 107-112: Add a single-line pointer immediately after the "Task
completion rules:" bullet block that points readers to the authoritative,
expanded guidance in workflows/plans.md by referencing its "Task Completion
Rules" section; insert the suggested one-line sentence (e.g., "_See
workflows/plans.md \"Task Completion Rules\" for detailed guidance including
citation requirements._") so AGENTS.md keeps the concise safety-critical summary
while explicitly directing readers to the full policy in plans.md.
In @.agents/scripts/issue-sync-helper.sh:
- Line 1032: Update the close-comment to reflect when verification was skipped:
change the code around the gh issue close call that currently uses issue_number,
repo_slug and task_id so it chooses the message based on the
force/skip-verification flag (the variable your script sets when --force is
passed); if force/skip is true omit or replace "(verified)" (e.g. "Completed.
Task $task_id marked done in TODO.md (forced, verification skipped)." or
"Completed. Task $task_id marked done in TODO.md.") and use that computed
message in the gh issue close command instead of the hard-coded "(verified)"
string.
In @.agents/scripts/supervisor-helper.sh:
- Around line 4859-4886: The migration is missing the tasks.notes column so the
UPDATE in the verification flow (see usage of tasks.notes in the loop handling
verify_tid and the UPDATE that appends ' verified:...') silently fails; add a
schema migration that creates a TEXT (or VARCHAR) notes column on tasks (or
alter the table to add notes if it exists), update the initialization/migration
sequence in this script to apply that migration before Phase 6 runs, and ensure
the UPDATE uses the same sql_escape helper and COALESCE logic already present so
verification markers persist and errors are logged if the ALTER/CREATE fails.
- Around line 4888-4890: The failing branch calls an undefined function
log_verbose causing pulse abort; change the call to the existing log_info
function in the same branch so the message reads "Unverified: $verify_tid (PR
not merged or not found)". Locate the block that prints the unverified task
status (the else branch that references $verify_tid) and replace the log_verbose
invocation with log_info to use the script's defined logger.
- Around line 4871-4877: The verification currently calls gh pr view with only
pr_number (see verify_pr_url, pr_number, pr_state) which depends on current repo
context; update the logic to extract the repo slug (owner/repo) from
verify_pr_url (or use the task's repo variable if available) and pass it to gh
via --repo "<owner/repo>" when invoking gh pr view, ensure the extracted repo is
validated (non-empty) before use and adjust error logging for pr_state
resolution to include the repo context so false verifications are avoided.
In @.agents/workflows/plans.md:
- Around line 689-701: The Task Completion Rules section in MANDATORY: Task
Completion Rules (plans.md) omits the required verified:YYYY-MM-DD field,
causing agents to mark tasks [x] correctly per the three rules but fail
CI/issue-sync which enforces verified in issue-sync-helper.sh and the CI
validation job; update this section to add a fourth rule or an explicit note
instructing users to include verified:YYYY-MM-DD on completed [x] tasks (and
mention removing it when reverting), and reference the verified field name and
the enforcement locations (issue-sync-helper.sh and CI validation job) so
readers know to add that timestamp when marking a task done.
In @.github/workflows/code-quality.yml:
- Around line 151-153: The diff-based detection is inverted: NEW_COMPLETIONS
currently uses diff + grep '^>' which finds IDs unique to HEAD, not IDs that
moved from unchecked in BASE to checked in HEAD; replace that pipeline so it
computes the two sorted ID lists (from git show "$BASE":TODO.md with '^\- \[ \]
t[0-9]+' and from git show "$HEAD":TODO.md with '^\- \[x\] t[0-9]+') and take
their intersection (IDs present in both lists) instead of the diff '>' output;
in short, change the logic that sets NEW_COMPLETIONS to use an intersection
(e.g., comm -12 on the two sorted streams) so NEW_COMPLETIONS contains only task
IDs that transitioned from [ ] in BASE to [x] in HEAD.
🧹 Nitpick comments (2)
.agents/scripts/issue-sync-helper.sh (2)
1008-1016: Solid safety guard — the core fix for the 38-issue cascade.The
verified:YYYY-MM-DDregex check and skip logic are exactly right for preventing premature closures. One feedback concern per the coding guidelines:
log_verboseon line 1012 means that in normal (non-verbose) mode, users get zero per-task feedback about why specific tasks were skipped — only the aggregate count at line 1040. For a safety-critical skip,print_warningwould give operators clear, immediate visibility without needing--verbose.Suggested improvement
- log_verbose "Skipping #$issue_number ($task_id): no verified: field (use --force to override)" + print_warning "Skipping #$issue_number ($task_id): no verified: field (use --force to override)"As per coding guidelines,
.agents/scripts/*.shshould have "Clear logging and feedback."
1233-1236:--forceis accepted globally but only meaningful forclose.Currently, passing
--forcewith other commands (e.g.,push --force,enrich --force) silently succeeds and does nothing. This is fine for now, but consider logging a warning if--forceis supplied with a non-close command to prevent user confusion.
| ## MANDATORY: Task Completion Rules | ||
|
|
||
| **NEVER mark a task `[x]` unless ALL of these are true:** | ||
|
|
||
| 1. **YOU wrote code/content** for it in THIS session (not just verified a file exists) | ||
| 2. **Deliverable is substantive** — file exists, content is complete (not a stub/placeholder), passes quality checks | ||
| 3. **You can cite what you did** — specific files created/modified, lines changed | ||
|
|
||
| **If someone else did the work** (worker branch, merged PR, another session): leave it `[ ]`. The repo owner or reviewer marks it done after verification. | ||
|
|
||
| **Why this matters**: Marking `[x]` triggers `issue-sync` GitHub Action which auto-closes the linked GitHub Issue. False completions cascade — 38 issues were incorrectly closed in a single incident (PR #616). | ||
|
|
||
| **The `actual:` field** should only appear on completed `[x]` tasks. Remove it when reverting a task to `[ ]`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation gap: verified: field not mentioned in completion rules.
The safety guard in issue-sync-helper.sh (line 1011) and the CI validation job both require a verified:YYYY-MM-DD field on completed tasks. However, this section — the primary reference agents will read — only mentions the actual: field (line 701) and never tells the user to add verified: when marking a task [x].
Without this, agents will follow the three rules, mark [x], and then be surprised when issue-sync close skips their task or CI emits a warning.
Consider adding a fourth rule or a note, e.g.:
Suggested addition
3. **You can cite what you did** — specific files created/modified, lines changed
+4. **Add `verified:YYYY-MM-DD`** — this field is required by `issue-sync close` to auto-close the linked GitHub Issue
**If someone else did the work** (worker branch, merged PR, another session): leave it `[ ]`. The repo owner or reviewer marks it done after verification.🤖 Prompt for AI Agents
In @.agents/workflows/plans.md around lines 689 - 701, The Task Completion Rules
section in MANDATORY: Task Completion Rules (plans.md) omits the required
verified:YYYY-MM-DD field, causing agents to mark tasks [x] correctly per the
three rules but fail CI/issue-sync which enforces verified in
issue-sync-helper.sh and the CI validation job; update this section to add a
fourth rule or an explicit note instructing users to include verified:YYYY-MM-DD
on completed [x] tasks (and mention removing it when reverting), and reference
the verified field name and the enforcement locations (issue-sync-helper.sh and
CI validation job) so readers know to add that timestamp when marking a task
done.
- Fix inverted CI diff logic: use comm -12 for intersection instead of diff+grep - Replace undefined log_verbose with log_warn in Phase 6 - Use state_log for verification tracking instead of nonexistent notes column - Add --repo flag to gh pr view (extract repo slug from PR URL) - Optimize Phase 6 to single DB query instead of N+1 - Distinguish verified vs force-closed in issue close comments - Add verified: field as 4th completion rule in plans.md - Add pointer to plans.md completion rules in AGENTS.md
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sun Feb 8 14:43:04 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|



Summary
[x]unless they wrote the codeissue-sync-helper.sh cmd_close()now requiresverified:field before closing GitHub Issues (--force to override)[x]withoutverified:fieldAddresses root cause of 38 issues being falsely auto-closed (PR #616). ref:GH#618
Summary by CodeRabbit
New Features
Documentation