Skip to content

Conversation

@alex-solovyev
Copy link
Collaborator

@alex-solovyev alex-solovyev commented Feb 8, 2026

Summary

  • Add task completion rules to AGENTS.md and plans.md — agents must never mark [x] unless they wrote the code
  • issue-sync-helper.sh cmd_close() now requires verified: field before closing GitHub Issues (--force to override)
  • Supervisor Phase 6: verify deliverables by checking PR merge status before marking verified
  • CI: new TODO.md validation step warns on PRs when tasks marked [x] without verified: field

Addresses root cause of 38 issues being falsely auto-closed (PR #616). ref:GH#618

Summary by CodeRabbit

  • New Features

    • Automated verification of deployed deliverables to confirm PR merges before marking task completion.
    • Added CI/CD validation step to check verified dates on all task completions.
    • Introduced --force flag to override verification requirements when closing tasks.
  • Documentation

    • Codified task completion rules, approval criteria, and ownership/reporting model for the workflow.

- AGENTS.md + plans.md: add task completion rules (never mark [x] unless
  you wrote the code, marking [x] triggers issue auto-close cascade)
- issue-sync-helper.sh: cmd_close() now requires verified: field to close
  issues, --force flag to override (prevents false closure cascade)
- supervisor-helper.sh: add Phase 6 verify step - checks PR is merged
  before marking deliverable as verified
- code-quality.yml: add TODO.md validation step on PRs - warns when tasks
  marked [x] without verified: field

Addresses root cause of 38 issues being falsely auto-closed (PR #616).
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 8, 2026

Warning

Rate limit exceeded

@alex-solovyev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 21 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Walkthrough

The PR implements a comprehensive task completion validation framework spanning documentation, shell automation, and CI/CD. Changes establish explicit task completion rules with verification requirements, introduce a --force flag to enforce verified date fields in issue-sync operations, add a Phase 6 supervisor verification step checking merged GitHub PRs, and deploy automated CI validation for TODO.md task transitions.

Changes

Cohort / File(s) Summary
Documentation & Rules
.agents/AGENTS.md, .agents/workflows/plans.md
Defines mandatory task completion rules, ownership model, and status reporting workflow ensuring tasks are marked complete only with verified timestamps and proper authorization gates.
Script Enhancements
.agents/scripts/issue-sync-helper.sh, .agents/scripts/supervisor-helper.sh
Extends issue-sync with --force flag and verified date enforcement; adds Phase 6 to supervisor for post-merge verification by checking GitHub PR status and appending verified timestamps to task notes.
CI/CD Validation
.github/workflows/code-quality.yml
Introduces todo-validation job performing diff-based detection of new task completions and enforcing presence of verified date fields with non-blocking warnings.

Sequence Diagram(s)

sequenceDiagram
    participant Worker as Worker/User
    participant TODO as TODO.md
    participant GitHub as GitHub Actions
    participant IssueSync as issue-sync-helper
    participant Supervisor as supervisor-helper
    participant GH_API as GitHub API

    Worker->>TODO: Mark task complete (check ☑)
    Worker->>TODO: Add verified: timestamp
    
    Worker->>GitHub: Push commit
    GitHub->>GitHub: Trigger code-quality workflow
    
    GitHub->>TODO: Validate TODO.md diff
    GitHub->>GitHub: Check verified: field present
    alt Missing verified: field
        GitHub->>GitHub: Emit warning (non-blocking)
    end
    
    GitHub->>IssueSync: PR merged, trigger sync
    IssueSync->>TODO: Check task status
    alt No verified: date (without --force)
        IssueSync->>IssueSync: Skip task, count skipped
    else verified: date present or --force
        IssueSync->>GitHub: Close/sync issue
    end
    
    Supervisor->>Supervisor: Run Phase 6 (Verify Deliverables)
    Supervisor->>TODO: Find deployed/merged tasks
    Supervisor->>GH_API: Check PR merge status
    alt PR is MERGED
        Supervisor->>TODO: Append verified timestamp to notes
        Supervisor->>Supervisor: Increment verified counter
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Possibly related PRs

Poem

✅ Tasks now verified, with timestamps in place,
GitHub Actions validates the completion race,
Phase 6 checks PRs, ensuring they're merged,
While --force flags guard what once diverged.
Automation ensures zero debt persists—
Completion rules perfected by our DevOps fists! 🚀

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and specifically describes the main objective: preventing false task completion cascades, with a clear reference to the related issue (t163).
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/t163-task-completion-safety

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link

Summary of Changes

Hello @alex-solovyev, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces robust mechanisms to prevent the erroneous auto-closure of GitHub issues by implementing stricter verification steps for task completion. It updates documentation to guide agents on proper task marking, modifies the issue synchronization script to enforce explicit verification, and enhances the supervisor to automatically confirm deliverables based on PR merge status. These changes aim to improve the reliability of the task management system and avoid cascades of false completions.

Highlights

  • Stricter Task Completion Rules: New mandatory guidelines have been added to AGENTS.md and plans.md to prevent false task completions, emphasizing that agents should only mark tasks as complete if they personally wrote the code/content and verified the deliverable's substance.
  • Enhanced Issue Closing Logic: The issue-sync-helper.sh cmd_close() command now requires a verified: field in the task description before closing a GitHub Issue. A new --force option is available to override this check.
  • Automated Deliverable Verification: A new 'Phase 6' has been introduced in supervisor-helper.sh to automatically verify deliverables for 'deployed' tasks by checking if their associated Pull Requests have been merged, subsequently adding a verified: timestamp to the task notes.
Changelog
  • .agents/AGENTS.md
    • Added a new 'Task completion rules' section outlining conditions for marking tasks as complete, including personal code contribution and substantive deliverable verification.
  • .agents/scripts/issue-sync-helper.sh
    • Updated the close command to require a verified: field in the task description.
    • Introduced a --force option to bypass the verified: field check when closing issues.
    • Modified the issue closing comment to explicitly state '(verified)' for closed tasks.
    • Enhanced the summary output of cmd_close to report both closed and skipped issues.
  • .agents/scripts/supervisor-helper.sh
    • Implemented 'Phase 6: Verify deliverables for merged tasks' to check the merge status of associated PRs for 'deployed' tasks.
    • Automatically adds a verified: timestamp to task notes in the database if the corresponding PR is merged.
  • .agents/workflows/plans.md
    • Added a 'MANDATORY: Task Completion Rules' section with detailed criteria for marking tasks [x], emphasizing personal contribution and deliverable quality.
    • Explained the rationale behind these rules, referencing a past incident of false auto-closures.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/code-quality.yml
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions
Copy link

github-actions bot commented Feb 8, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 36 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sun Feb 8 14:32:54 UTC 2026: Code review monitoring started
Sun Feb 8 14:32:55 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 36

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 36
  • VULNERABILITIES: 0

Generated on: Sun Feb 8 14:32:58 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces important safeguards to prevent tasks from being closed prematurely, addressing a key issue of false completion cascades. The changes are well-structured, with new documentation in AGENTS.md and plans.md to guide users, a verified: field requirement in issue-sync-helper.sh, and a new verification phase in the supervisor that checks PR merge status.

My review includes one suggestion for supervisor-helper.sh to improve the efficiency and clarity of the database queries in the new verification logic. Overall, this is a solid improvement to the workflow's reliability.

Comment on lines 4860 to 4891
merged_unverified=$(db "$SUPERVISOR_DB" "SELECT id FROM tasks WHERE status = 'deployed' AND id NOT IN (SELECT id FROM tasks WHERE notes LIKE '%verified:%');" 2>/dev/null || echo "")
if [[ -n "$merged_unverified" ]]; then
while IFS= read -r verify_tid; do
[[ -n "$verify_tid" ]] || continue
local verify_pr_url
verify_pr_url=$(db "$SUPERVISOR_DB" "SELECT pr_url FROM tasks WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || echo "")
local verify_worktree
verify_worktree=$(db "$SUPERVISOR_DB" "SELECT worktree FROM tasks WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || echo "")

# Check PR is actually merged
local pr_merged="false"
if [[ -n "$verify_pr_url" ]]; then
local pr_number
pr_number=$(echo "$verify_pr_url" | grep -oE '[0-9]+$' || echo "")
if [[ -n "$pr_number" ]]; then
local pr_state
pr_state=$(gh pr view "$pr_number" --json state --jq '.state' 2>/dev/null || echo "")
if [[ "$pr_state" == "MERGED" ]]; then
pr_merged="true"
fi
fi
fi

if [[ "$pr_merged" == "true" ]]; then
# PR is merged — deliverable verified
db "$SUPERVISOR_DB" "UPDATE tasks SET notes = COALESCE(notes, '') || ' verified:$(date +%Y-%m-%d)' WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || true
log_info " Verified: $verify_tid (PR merged)"
verified_count=$((verified_count + 1))
else
log_verbose " Unverified: $verify_tid (PR not merged or not found)"
fi
done <<< "$merged_unverified"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation for verifying deliverables has a few areas for improvement:

  1. Inefficient Database Access: It fetches task IDs first and then, inside a loop, fetches the pr_url for each task individually. This results in N+1 queries where N is the number of unverified tasks.
  2. Unused Variable: The verify_worktree variable is fetched from the database on line 4867 but is never used.
  3. Query Clarity: The id NOT IN (SELECT ...) subquery can be less clear and potentially less performant than a direct OR condition, especially concerning NULL values in the notes column.

I suggest refactoring this block to fetch all necessary data in a single, more efficient query. This also removes the unused variable and improves the query's readability.

    merged_unverified=$(db "$SUPERVISOR_DB" "SELECT id, pr_url FROM tasks WHERE status = 'deployed' AND (notes IS NULL OR notes NOT LIKE '%verified:%');" -separator $'\t' 2>/dev/null || echo "")
    if [[ -n "$merged_unverified" ]]; then
        while IFS=$'\t' read -r verify_tid verify_pr_url; do
            [[ -n "$verify_tid" ]] || continue

            # Check PR is actually merged
            local pr_merged="false"
            if [[ -n "$verify_pr_url" ]]; then
                local pr_number
                pr_number=$(echo "$verify_pr_url" | grep -oE '[0-9]+$' || echo "")
                if [[ -n "$pr_number" ]]; then
                    local pr_state
                    pr_state=$(gh pr view "$pr_number" --json state --jq '.state' 2>/dev/null || echo "")
                    if [[ "$pr_state" == "MERGED" ]]; then
                        pr_merged="true"
                    fi
                fi
            fi

            if [[ "$pr_merged" == "true" ]]; then
                # PR is merged — deliverable verified
                db "$SUPERVISOR_DB" "UPDATE tasks SET notes = COALESCE(notes, '') || ' verified:$(date +%Y-%m-%d)' WHERE id = '$(sql_escape "$verify_tid")';" 2>/dev/null || true
                log_info "  Verified:   $verify_tid (PR merged)"
                verified_count=$((verified_count + 1))
            else
                log_verbose "  Unverified: $verify_tid (PR not merged or not found)"
            fi
        done <<< "$merged_unverified"
    fi

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In @.agents/AGENTS.md:
- Around line 107-112: Add a single-line pointer immediately after the "Task
completion rules:" bullet block that points readers to the authoritative,
expanded guidance in workflows/plans.md by referencing its "Task Completion
Rules" section; insert the suggested one-line sentence (e.g., "_See
workflows/plans.md \"Task Completion Rules\" for detailed guidance including
citation requirements._") so AGENTS.md keeps the concise safety-critical summary
while explicitly directing readers to the full policy in plans.md.

In @.agents/scripts/issue-sync-helper.sh:
- Line 1032: Update the close-comment to reflect when verification was skipped:
change the code around the gh issue close call that currently uses issue_number,
repo_slug and task_id so it chooses the message based on the
force/skip-verification flag (the variable your script sets when --force is
passed); if force/skip is true omit or replace "(verified)" (e.g. "Completed.
Task $task_id marked done in TODO.md (forced, verification skipped)." or
"Completed. Task $task_id marked done in TODO.md.") and use that computed
message in the gh issue close command instead of the hard-coded "(verified)"
string.

In @.agents/scripts/supervisor-helper.sh:
- Around line 4859-4886: The migration is missing the tasks.notes column so the
UPDATE in the verification flow (see usage of tasks.notes in the loop handling
verify_tid and the UPDATE that appends ' verified:...') silently fails; add a
schema migration that creates a TEXT (or VARCHAR) notes column on tasks (or
alter the table to add notes if it exists), update the initialization/migration
sequence in this script to apply that migration before Phase 6 runs, and ensure
the UPDATE uses the same sql_escape helper and COALESCE logic already present so
verification markers persist and errors are logged if the ALTER/CREATE fails.
- Around line 4888-4890: The failing branch calls an undefined function
log_verbose causing pulse abort; change the call to the existing log_info
function in the same branch so the message reads "Unverified: $verify_tid (PR
not merged or not found)". Locate the block that prints the unverified task
status (the else branch that references $verify_tid) and replace the log_verbose
invocation with log_info to use the script's defined logger.
- Around line 4871-4877: The verification currently calls gh pr view with only
pr_number (see verify_pr_url, pr_number, pr_state) which depends on current repo
context; update the logic to extract the repo slug (owner/repo) from
verify_pr_url (or use the task's repo variable if available) and pass it to gh
via --repo "<owner/repo>" when invoking gh pr view, ensure the extracted repo is
validated (non-empty) before use and adjust error logging for pr_state
resolution to include the repo context so false verifications are avoided.

In @.agents/workflows/plans.md:
- Around line 689-701: The Task Completion Rules section in MANDATORY: Task
Completion Rules (plans.md) omits the required verified:YYYY-MM-DD field,
causing agents to mark tasks [x] correctly per the three rules but fail
CI/issue-sync which enforces verified in issue-sync-helper.sh and the CI
validation job; update this section to add a fourth rule or an explicit note
instructing users to include verified:YYYY-MM-DD on completed [x] tasks (and
mention removing it when reverting), and reference the verified field name and
the enforcement locations (issue-sync-helper.sh and CI validation job) so
readers know to add that timestamp when marking a task done.

In @.github/workflows/code-quality.yml:
- Around line 151-153: The diff-based detection is inverted: NEW_COMPLETIONS
currently uses diff + grep '^>' which finds IDs unique to HEAD, not IDs that
moved from unchecked in BASE to checked in HEAD; replace that pipeline so it
computes the two sorted ID lists (from git show "$BASE":TODO.md with '^\- \[ \]
t[0-9]+' and from git show "$HEAD":TODO.md with '^\- \[x\] t[0-9]+') and take
their intersection (IDs present in both lists) instead of the diff '>' output;
in short, change the logic that sets NEW_COMPLETIONS to use an intersection
(e.g., comm -12 on the two sorted streams) so NEW_COMPLETIONS contains only task
IDs that transitioned from [ ] in BASE to [x] in HEAD.
🧹 Nitpick comments (2)
.agents/scripts/issue-sync-helper.sh (2)

1008-1016: Solid safety guard — the core fix for the 38-issue cascade.

The verified:YYYY-MM-DD regex check and skip logic are exactly right for preventing premature closures. One feedback concern per the coding guidelines:

log_verbose on line 1012 means that in normal (non-verbose) mode, users get zero per-task feedback about why specific tasks were skipped — only the aggregate count at line 1040. For a safety-critical skip, print_warning would give operators clear, immediate visibility without needing --verbose.

Suggested improvement
-                log_verbose "Skipping #$issue_number ($task_id): no verified: field (use --force to override)"
+                print_warning "Skipping #$issue_number ($task_id): no verified: field (use --force to override)"

As per coding guidelines, .agents/scripts/*.sh should have "Clear logging and feedback."


1233-1236: --force is accepted globally but only meaningful for close.

Currently, passing --force with other commands (e.g., push --force, enrich --force) silently succeeds and does nothing. This is fine for now, but consider logging a warning if --force is supplied with a non-close command to prevent user confusion.

Comment on lines 689 to 701
## MANDATORY: Task Completion Rules

**NEVER mark a task `[x]` unless ALL of these are true:**

1. **YOU wrote code/content** for it in THIS session (not just verified a file exists)
2. **Deliverable is substantive** — file exists, content is complete (not a stub/placeholder), passes quality checks
3. **You can cite what you did** — specific files created/modified, lines changed

**If someone else did the work** (worker branch, merged PR, another session): leave it `[ ]`. The repo owner or reviewer marks it done after verification.

**Why this matters**: Marking `[x]` triggers `issue-sync` GitHub Action which auto-closes the linked GitHub Issue. False completions cascade — 38 issues were incorrectly closed in a single incident (PR #616).

**The `actual:` field** should only appear on completed `[x]` tasks. Remove it when reverting a task to `[ ]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Documentation gap: verified: field not mentioned in completion rules.

The safety guard in issue-sync-helper.sh (line 1011) and the CI validation job both require a verified:YYYY-MM-DD field on completed tasks. However, this section — the primary reference agents will read — only mentions the actual: field (line 701) and never tells the user to add verified: when marking a task [x].

Without this, agents will follow the three rules, mark [x], and then be surprised when issue-sync close skips their task or CI emits a warning.

Consider adding a fourth rule or a note, e.g.:

Suggested addition
 3. **You can cite what you did** — specific files created/modified, lines changed
+4. **Add `verified:YYYY-MM-DD`** — this field is required by `issue-sync close` to auto-close the linked GitHub Issue

 **If someone else did the work** (worker branch, merged PR, another session): leave it `[ ]`. The repo owner or reviewer marks it done after verification.
🤖 Prompt for AI Agents
In @.agents/workflows/plans.md around lines 689 - 701, The Task Completion Rules
section in MANDATORY: Task Completion Rules (plans.md) omits the required
verified:YYYY-MM-DD field, causing agents to mark tasks [x] correctly per the
three rules but fail CI/issue-sync which enforces verified in
issue-sync-helper.sh and the CI validation job; update this section to add a
fourth rule or an explicit note instructing users to include verified:YYYY-MM-DD
on completed [x] tasks (and mention removing it when reverting), and reference
the verified field name and the enforcement locations (issue-sync-helper.sh and
CI validation job) so readers know to add that timestamp when marking a task
done.

- Fix inverted CI diff logic: use comm -12 for intersection instead of diff+grep
- Replace undefined log_verbose with log_warn in Phase 6
- Use state_log for verification tracking instead of nonexistent notes column
- Add --repo flag to gh pr view (extract repo slug from PR URL)
- Optimize Phase 6 to single DB query instead of N+1
- Distinguish verified vs force-closed in issue close comments
- Add verified: field as 4th completion rule in plans.md
- Add pointer to plans.md completion rules in AGENTS.md
@github-actions
Copy link

github-actions bot commented Feb 8, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 36 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sun Feb 8 14:43:01 UTC 2026: Code review monitoring started
Sun Feb 8 14:43:01 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 36

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 36
  • VULNERABILITIES: 0

Generated on: Sun Feb 8 14:43:04 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud
Copy link

sonarqubecloud bot commented Feb 8, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant