fix(dispatcher): idempotent createWorkflowRecord so a retried step can't mask errors with UNIQUE#159
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThis PR fixes a latent bug in the ChangesIdempotent workflow record creation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
…n't mask errors with UNIQUE prepare-worktree (implementation workflow) runs createWorkflowRecord (INSERT on the workflows PK) then createWorktree, under bunqueue's default retry: 3. A transient createWorktree failure retries the step from the top, re-runs the INSERT for the same execution id, and surfaces `UNIQUE constraint failed`, masking the real error. The recommender/documentation check-rate-limit steps dodged this with retry: 1; prepare-worktree legitimately wants to retry. Fix at the source: INSERT OR IGNORE. The only PK collision is a same-execution retry — the no-op case — so a retried record-creating step surfaces the real downstream error. Hardens all three workflows at one point. Tests: unit test that a second create with the same id is a no-op (reproduces the masking UNIQUE before the fix); workflow test that a transient createWorktree failure now recovers to completed instead of failing on a masked UNIQUE.
…d PK conflict ON CONFLICT(id) DO NOTHING rather than a blanket INSERT OR IGNORE, so a genuine CHECK/NOT-NULL violation still throws instead of being silently swallowed; only the same-execution retry no-ops. Add a test that a bad kind still throws.
| const now = Date.now(); | ||
| const metaJson = input.source === undefined ? null : JSON.stringify({ source: input.source }); | ||
| db.run( | ||
| `INSERT INTO workflows |
There was a problem hiding this comment.
Decision: ON CONFLICT(id) DO NOTHING, not per-step retry: 1 or blanket INSERT OR IGNORE.
Two candidate fixes for the retry-masking bug:
retry: 1onprepare-worktree(mirror thecheck-rate-limitworkaround) — butcreateWorktreeis legitimately retriable (a transient git failure can succeed on retry);retry: 1throws that recovery away just to dodge the INSERT.- Idempotent INSERT — the INSERT is the only thing in the step that isn't safe to re-run. The PK is
ctx.executionId, unique per bunqueue execution, so the only way it collides is the same execution retrying — precisely the no-op case. Keeps the worktree retry intact and hardens all three record-creating steps (implementation/recommender/documentation) at one point.
Chose (2). Scoped to the id PK conflict rather than INSERT OR IGNORE on purpose: a blanket ignore would also swallow a genuine CHECK/NOT-NULL violation (a real bug), whereas ON CONFLICT(id) only no-ops the same-execution retry. Guarded by a test asserting a bad kind still throws.
The existing retry: 1 on the recommender/documentation check-rate-limit steps is left in place — independent rationale (a deterministic db-state check gains nothing from retrying), now belt-and-suspenders.
Reviewer's brief — PR #159 (Closes #108)What this fixes: How to run it: bun install
bun test packages/dispatcher/test/workflow-record.test.ts \
packages/dispatcher/test/implementation-workflow.test.ts
bun run typecheck && bun run lintWhat to verify (and what "correct" looks like):
How to review it: the change is one SQL clause + doc + three tests. Both tests were confirmed red→green (revert the clause and they fail with Fragile / worth extra eyes: the existing Mergeability: branch is even with |
Summary
Closes #108
createWorkflowRecordwas a plainINSERTintoworkflows(PK =id, the bunqueue execution id). The implementation workflow'sprepare-worktreestep registers with bunqueue's defaultretry: 3and callscreateWorkflowRecordbeforecreateWorktree. A transientcreateWorktreefailure retried the whole step, re-ran the INSERT for the sameid, and surfacedUNIQUE constraint failed: workflows.id— masking the real error. The recommender/documentationcheck-rate-limitsteps had dodged the same class with a per-stepretry: 1workaround.Fixed at the source:
INSERT ... ON CONFLICT(id) DO NOTHINGmakes the record-creating step idempotent. The only way the PK collides is a same-execution retry — exactly the case to no-op — so a retried step now surfaces the real downstream error. Scoped to theidPK (not a blanketINSERT OR IGNORE) so a genuine CHECK/NOT-NULL violation still throws. Hardens all three workflows (implementation, recommender, documentation) at one point.What changed
packages/dispatcher/src/workflow-record.ts—createWorkflowRecordINSERT →ON CONFLICT(id) DO NOTHING, with a doc note on the idempotency contract and why it's PK-scoped.packages/dispatcher/test/workflow-record.test.ts— idempotent-retry no-op test (a second create with the same id leaves the existing row untouched); bad-kindtest (a real CHECK violation still throws — guards the PK-scoping).packages/dispatcher/test/implementation-workflow.test.ts— a transientcreateWorktreefailure retries and recovers tocompletedinstead of failing on a masked UNIQUE.Why these changes
prepare-worktree'screateWorktreeis legitimately retriable (a transient git failure can succeed on retry), soretry: 1(the existing workaround) would throw that recovery away. The INSERT is the only non-idempotent thing in the step; making it idempotent on the PK preserves the retry semantics and fixes the whole class at the source rather than per-step.ON CONFLICT(id)overINSERT OR IGNOREkeeps genuine schema-constraint violations throwing.Status
createWorkflowRecord+ testsVerification
bun test packages/dispatcher/→ 405 pass, 0 failbun test(full repo) → 722 pass, 0 failbun run typecheck→ cleanbun run lint→ clean (--deny-warnings)UNIQUE constraint failed: workflows.idand the workflow test fails to reachcompleted; with the fix both pass.Acceptance criteria
createWorkflowRecordidempotent so a retried record-creating step is a no-op, not aUNIQUEerror —ON CONFLICT(id) DO NOTHINGat https://github.com/thejustinwalsh/middle/blob/middle-issue-108/packages/dispatcher/src/workflow-record.ts#L67 ; hardens both the implementationprepare-worktreeand the recommender/documentation workflows at the source.Stumbling points
bun installfirst). bunqueue installs undernode_modules/.bun/, so confirming its defaultretry: 3meant resolving the hoisted path manually.Suggested CLAUDE.md updates
None. The existing
## state-issue contract/ dispatcher conventions already cover the relevant invariants.Follow-up issues
None. The reviewer's pass confirmed
recordEvent(append-only, autoincrement PK) is not the same class, and the other workflow INSERTs already useINSERT OR IGNOREor aren't on the retry path.Out of scope
retry: 1on the recommender/documentationcheck-rate-limitsteps (independent rationale; left as defense-in-depth).Decisions
Posted as an inline review comment on
workflow-record.ts; full log inplanning/issues/108/decisions.md.Strategy
Branch is even with
origin/main(no rebase/merge needed);mergeable: MERGEABLE.Summary by CodeRabbit
Bug Fixes
Tests