feat(dispatcher): invoke checkbox-revert reconciler on agent push#156
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (6)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughThis PR adds a checkbox-revert poller pass: typed workflow meta_json accessors for checkbox reconcile state, GitHub PullRequest.headSha support, a runCheckboxRevertPass orchestrator (rate-limit gated, per-workflow error-isolated), cron wiring to start the pass, comprehensive tests, and planning/decision docs. ChangesCheckbox-revert poller pass integration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
thejustinwalsh
left a comment
There was a problem hiding this comment.
Design rationale distilled from planning/issues/101/decisions.md — left as inline anchors so the reasoning sits next to the code it explains.
| * transition and reverts a box whose gates fail. The per-workflow diff base | ||
| * (`{ headSha, checkboxState }`) is persisted in the workflow row's `meta_json`. | ||
| * | ||
| * Why a poller pass and not a Stop hook: the reconciler is GitHub-state-driven |
There was a problem hiding this comment.
Why a poller pass, not a Stop hook (decision). The issue offered either; the codebase decides it. The reconciler is GitHub-state-driven (reads/writes the PR body) — the poller's domain, with reconcileMergedParks as the precedent for "a second reconcile pass over a different workflow set on the same cron." The package CLAUDE.md states the hook server runs no business logic; running gates (tens of seconds of lint/test) inside a hook HTTP handler would block the response and break that invariant. "Keyed on head SHA advancing" is a polling comparison across ticks, not an event. So no architectural fork — conventions pick the poller.
| for (const wf of running.slice(0, maxPerPass)) { | ||
| try { | ||
| const config = loadConfig(wf.worktreePath); | ||
| if (!config) continue; // no gates to enforce → nothing to revert |
There was a problem hiding this comment.
No verify.toml → skip the workflow (decision). With no gates declared, a ticked box can never fail one, so there's nothing to revert and tracking state would be dead work. Mirrors verify-on-stop's "missing/malformed verify.toml → skip (ok)" in build-deps.ts.
| // Head-SHA gate: skip the parse/diff/gate-run when the PR hasn't advanced | ||
| // since the last tick. A gateway that can't supply a SHA (undefined) falls | ||
| // through — the reconciler's own checkbox-state diff still gates the work. | ||
| if (pr.headSha !== undefined && pr.headSha === previous.headSha) continue; |
There was a problem hiding this comment.
Head-SHA gate skips the work, not the fetch (decision). The SHA must be fetched to be compared, so the gate saves the expensive work (lint/typecheck/test, parse, body writes), not the one cheap read. When the gateway can't supply a SHA (undefined), we fall through and let reconcileCheckboxes's own checkbox-state diff provide idempotence — only [ ] → [x] transitions run gates regardless.
| * checkbox-revert persist reset a running agent's idle-timeout clock — masking a | ||
| * genuinely wedged agent (e.g. on first observation after a daemon restart). | ||
| */ | ||
| export function patchWorkflowMeta(db: Database, id: string, patch: Partial<WorkflowMeta>): void { |
There was a problem hiding this comment.
Persist in meta_json, and never bump updated_at (decisions). Storing { headSha, checkboxState } in the existing meta_json scratch column (precedent: source) avoids a migration; the merge-on-write preserves source. Critically, this write must not touch updated_at: the watchdog folds updated_at into its idle-freshness baseline (watchdog.ts:206), and both it and this pass run over the same state = 'running' set — bumping it here would reset a running agent's idle-timeout clock and mask a wedged agent (caught in internal review).
Reviewer's brief — PR #156 (Closes #101)What this delivers: the production trigger that fires the checkbox-revert How to run itbun install
bun run typecheck # clean
bun test # 733 pass, 0 fail (81 files)
bun run lint # cleanTargeted suites: What to verify (and what "correct" looks like)
How to reviewRead Fragile / needs extra eyes
The branch is |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/dispatcher/src/gates/checkbox-revert-pass.ts`:
- Line 49: Add a TSDoc comment for the exported type CheckboxRevertPassDeps
describing its responsibility and the contract of its properties; locate the
exported type declaration named CheckboxRevertPassDeps in the file and insert a
brief /** ... */ TSDoc block above it that explains what the dependency object
contains and how consumers should use it (e.g., list and describe each
property/interface the type expects).
In `@packages/dispatcher/src/main.ts`:
- Around line 604-627: The poller call to startPoller is currently inheriting
the 120_000ms default; explicitly pass the documented cadence constants so the
daemon runs every 60s and watchdog at 30s: import/define WATCHDOG_INTERVAL_MS =
30_000 and POLLER_INTERVAL_MS = 60_000 and supply POLLER_INTERVAL_MS to
startPoller (e.g., as the explicit interval argument or option) when invoking
startPoller (the call that returns stopPoller and sets up checkboxRevert) so the
checkboxRevert trigger runs at the intended 60s cadence.
In `@packages/dispatcher/src/workflow-record.ts`:
- Around line 134-136: getCheckboxReconcileState currently trusts
readWorkflowMeta().checkboxReconcile but that nested value can be malformed;
update getCheckboxReconcileState to defensively validate the nested shape and
return the safe default when invalid. Specifically, after calling
readWorkflowMeta(db, id) check that checkboxReconcile is an object, that its
headSha is either string or null, and that state is a plain object (not
null/array/primitive); if any check fails return { headSha: null, state: {} }
(preserving CheckboxReconcileState contract). Use the function names
getCheckboxReconcileState and readWorkflowMeta and the type
CheckboxReconcileState to locate where to add these runtime guards.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 0e4e3c4e-8c22-4fc8-a49e-96acbfb164a3
📒 Files selected for processing (9)
packages/dispatcher/src/gates/checkbox-revert-pass.tspackages/dispatcher/src/github.tspackages/dispatcher/src/main.tspackages/dispatcher/src/poller-cron.tspackages/dispatcher/src/workflow-record.tspackages/dispatcher/test/gates/checkbox-revert-pass.test.tspackages/dispatcher/test/workflow-record.test.tsplanning/issues/101/decisions.mdplanning/issues/101/plan.md
getCheckboxReconcileState trusted readWorkflowMeta().checkboxReconcile, but readWorkflowMeta only guards the top-level JSON shape — a malformed nested value (version skew, a hand-edited row) could violate the CheckboxReconcileState return contract at runtime. Sanitize it back to the contract (non-object/array -> default, non-string headSha -> null, state rebuilt keeping only boolean entries), matching the validate-don't-trust posture getWorkflowSource already uses.
The poller default was tuned 60s->120s in 8db12e8 for rate-limit safety, but the package CLAUDE.md still claimed POLLER_INTERVAL_MS = 60_000 and main.ts's comment said 'every 60s'. Correct both to the actual 120s default (injectable; the daemon inherits it), with the rationale and commit reference. No behavior change.
|
@coderabbitai resolve All three findings from the prior review are addressed:
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
# List all open review threads on PR `#156` to find ones created by coderabbitai
gh api graphql -f query='
{
repository(owner: "thejustinwalsh", name: "middle") {
pullRequest(number: 156) {
reviewThreads(first: 50) {
nodes {
id
isResolved
comments(first: 1) {
nodes {
author { login }
body
}
}
}
}
}
}
}' | jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false and .comments.nodes[0].author.login == "coderabbitai") | {id, body: .comments.nodes[0].body[:100]}'Length of output: 831 🏁 Script executed: #!/bin/bash
# Resolve the open review thread
gh api graphql -f query='
mutation {
resolveReviewThread(input: { threadId: "PRRT_kwDOSdBe186E4Uqa" }) {
thread {
id
isResolved
}
}
}'Length of output: 340
|
Summary
Closes #101
The checkbox-revert reconciler and the Phase 6 gate machinery were both fully
built and unit-tested, but nothing in production invoked them — the
"after a push, run the declared gates and revert a failing Status checkbox" loop
was wired in code and proven by
verify.test.ts, yet no event fired it on a realpush. This PR closes that gap with a new pass on the GitHub poller cron.
What changed
packages/dispatcher/src/gates/checkbox-revert-pass.ts(new) —runCheckboxRevertPass: over each running implementation workflow, read the Epic PR; when its head SHA advanced since the last tick (the agent pushed), composereconcileCheckboxes+makeRunPhaseGatesto run sub-issue N's gates for each[ ] → [x]Status transition and revert + comment on failures. This ismakeRunPhaseGates's first production consumer.packages/dispatcher/src/workflow-record.ts— typedmeta_jsonaccessors (readWorkflowMeta/patchWorkflowMeta, refactoringgetWorkflowSourceonto them), the per-pass diff-base accessors (get/setCheckboxReconcileState), andlistRunningImplementationWorkflows.packages/dispatcher/src/github.ts—PullRequest.headSha, mapped fromgh'sheadRefOidinfindEpicPr/getPullRequest.packages/dispatcher/src/poller-cron.ts— run the new pass as a third guarded pass each tick (StartPollerOptions);main.tswires it with the write-capableghGitHubgateway + the free rate-limit read.Why these changes
The reconciler is GitHub-state-driven (it reads and writes the PR body), so its
home is the GitHub poller — the same place as
reconcileMergedParks, not a Stophook (the hook server deliberately runs no business logic, and running gates
inside a hook handler would block it). "Keyed on the Epic PR's head SHA advancing"
is a polling comparison across ticks. The pass lives in its own module so
poller.ts's "never writes to GitHub" invariant stays true. The per-workflowdiff base
{ headSha, checkboxState }persists in the existingmeta_jsonscratchcolumn (precedent:
source) — no migration. Full reasoning inplanning/issues/101/decisions.md, distilled into inline review comments on this PR.Verification
bun run typecheck— clean.bun test— 733 pass, 0 fail (81 files), including 16 new tests acrosstest/gates/checkbox-revert-pass.test.ts(revert on fail, passing box stays, head-SHA gate skips/advances, undefined-SHA fall-through, no-config skip, rate-limit ceiling, per-workflow isolation, parked-workflow exclusion) andtest/workflow-record.test.ts(meta merge preservessource, state round-trip,updated_atnot bumped, running-workflow query filtering).bun run lint/bun run format— clean.updated_at/watchdog coupling below; second round returned a clean bill.Acceptance criteria
reconcileCheckboxesafter an agent push — done viarunCheckboxRevertPasson the poller cron (checkbox-revert-pass.ts, wired in main.ts).PullRequest.headShain github.ts.runGatesfilled withmakeRunPhaseGates+loadVerifyConfig(verifyConfigPath(...))— see the deps composition.getPreviousState/setPreviousStatein the workflow record —meta_jsonaccessors in workflow-record.ts#L118.Status
Stumbling points
updated_at/watchdog coupling. The first cut hadpatchWorkflowMetabumpupdated_at(mirroring other writers). Internal review caught that the watchdog foldsupdated_atinto its idle-freshness baseline (watchdog.ts:206), and both run over the samestate='running'set — so persisting checkbox scratch would have reset a running agent's idle clock and masked a wedged agent. Fixed: meta writes never touchupdated_at(it's scratch, not activity). Would have been avoided by a note onupdated_at's consumers in the package CLAUDE.md.build-deps.ts) runs gates withcwd= the agent's worktree, so an auto-fixing gate (oxlint --fix) could touch files the agent is mid-edit on. It fires when the PR head SHA advanced (just-pushed, typically quiescent), and it's the design the issue specifies — but it's a pre-existing class of concern shared with verify-on-stop, worth eyes.Suggested CLAUDE.md updates
packages/dispatcher/CLAUDE.md, under the watchdog section: note thatupdated_atis part of the idle-freshness baseline, so non-activity writes (scratch/meta_json) must not bump it.Architectural forks
None — project conventions decided the trigger mechanism (no genuine ambiguity).
Follow-up issues
None. The scope was fully delivered; nothing genuinely parallelizable surfaced. The burst-cap-before-SHA-gate ordering matches the existing
runPollerpattern (you can't know which workflow pushed without thefindEpicPrcall the cap bounds), andfindEpicPrlisting all open PRs per call is pre-existing.Out of scope
The reconciler's checkbox detection/diff/revert mechanics (Phase 4) and the gate machinery (
verify.ts/gate-runner/gate-evidence, #37) — consumed as-is. Durable persistence of parked executions across daemon restarts (#116).Summary by CodeRabbit
New Features
Tests
Documentation