feat(bg): B-0501 slice 5a — assignment history dedup cooldown#3597
feat(bg): B-0501 slice 5a — assignment history dedup cooldown#3597AceHack wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an assignment-history dedup cooldown to the backlog-ready notifier (B-0501, slice 5a of B-0441). On each poll the notifier now first gates publishing on the target agent's queue being empty, then filters ready rows against a persisted history file, and after publishing writes back a pruned history via temp-file + rename.
Changes:
- Extend
NotifierConfig/PollResultwithtargetAgent,historyFile,cooldownMin,queueBusy, andskippedDueToCooldown, plus CLI flags--target-agent,--history-file,--cooldown-min. - Add
readHistoryFile/writeHistoryFileadapters (real impl usesmkdirSync+ temp-file +renameSync) and wire pollOnce to check cooldown, skip duplicates, and prune+persist on publish. - Add tests covering empty history, in-cooldown skip, post-cooldown re-assign, mixed expired/active entries, pruning on write, and the new parseArgs flags.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tools/bg/backlog-ready-notifier.ts | New config/result fields, history-file adapters, cooldown filter + prune-and-write, expanded CLI parsing, queue-busy short-circuit. |
| tools/bg/backlog-ready-notifier.test.ts | Extended fake adapters with history hooks; new test groups for queue-busy interaction, cooldown dedup behaviour, and new CLI flags. |
Comments suppressed due to low confidence (1)
tools/bg/backlog-ready-notifier.ts:225
- The temp filename uses
${path}.tmp.${Date.now()}for uniqueness, butDate.now()has millisecond resolution and concurrent writes within the same millisecond will collide and corrupt each other's temp file before rename. Consider appendingprocess.pidand/or a random suffix (e.g.crypto.randomUUID()) to make collisions astronomically unlikely.
const tmp = `${path}.tmp.${Date.now()}`;
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f0ba90a599
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Vera read-only review-state triage from the main-backed control clone, 2026-05-15T21:59Z. I did not patch this branch because it is Current state I see:
Safe owner-side next step: update from No contested-root writes and no overlapping branch edits from Vera. |
…ldown row tracking + target-agent validation
Four substantive issues flagged by Copilot + Codex reviewers:
1. **Codex P1 (line 319)** — readHistoryFile returns AssignmentHistory|null,
but pollOnce assumed `.entries` was always an array. A file containing
`{}` or `{entries: "garbage"}` would throw at `.filter(...)` and abort
the notifier loop. Added `normalizeHistory()` that coerces malformed
shapes to `{entries: []}` and per-entry validates rowId+publishedAt
are strings. Two new regression tests pin the contract.
2. **Codex P2 (line 355)** — cooldown history was reconstructed by
indexing `toAssign[i]` against `publishedEnvelopeIds`, but the publish
loop `continue`s on invalid priority, so the indices drift. A skipped
row's id was being written as the cooldown entry while the actually-
published row was missing from cooldown and could be re-assigned.
Refactored to track `{envelopeId, rowId}` pairs in publishedPairs; both
the return shape and cooldown entries derive from the pair list.
New test exercises the bad-priority-first scenario.
3. **Copilot (line 36)** — `targetAgent: string` accepted any value;
a typo like `--target-agent ott` silently no-opped the queue-busy
check via `isAgentQueueEmpty`'s "unknown agent = empty" branch.
Added `parseTargetAgent()` that validates against `Object.keys(AGENT_MAP)`
(otto/alexa/lior/vera/riven). Tests bypass parseArgs to inject
custom patterns via adapters, preserving testability. Two new
parseArgs tests pin the validation.
4. **Copilot (line 230)** — narrowed the atomic-rename guarantee
docstring on `writeHistoryFile`. The tmp-then-rename pattern
prevents torn reads but does NOT serialize concurrent writers;
the dedup/cooldown contract holds for single-process notifier only.
CAS-on-content or file-locking is out of scope for slice 5a.
Test suite: 46 pass (up from 39) in backlog-ready-notifier.test.ts;
125 pass across all tools/bg tests. tsc clean.
Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b2033b47c1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ), | ||
| ...publishedPairs.map(p => ({ rowId: p.rowId, publishedAt: pollAt.toISOString() })), | ||
| ]; | ||
| adapters.writeHistoryFile(config.historyFile, { entries: newEntries }); |
There was a problem hiding this comment.
Catch history write failures before returning poll result
A filesystem error while persisting cooldown history will currently throw out of pollOnce, because adapters.writeHistoryFile(...) is not guarded by a try/catch. In daemon mode this can terminate the notifier loop on transient conditions (permission issues, full disk, path problems), even though publish failures are explicitly handled as non-fatal. This turns optional dedup bookkeeping into a single-point process crash.
Useful? React with 👍 / 👎.
| fromAgent: "otto", | ||
| toAgent: "*", | ||
| maxAssignments: 3, | ||
| targetAgent: "otto", |
There was a problem hiding this comment.
Avoid hard-coding otto as the queue target default
The new queue gate defaults targetAgent to "otto", so runs that change assignee/sender (for example via --to/--agent) but omit --target-agent will still evaluate Otto’s queue state. That can cause assignments to be skipped or sent based on the wrong person’s activity, violating the queue-empty gating for non-Otto workflows.
Useful? React with 👍 / 👎.
* shard(tick): 2026-05-16T00:44Z — bg-worker triage, 4 BLOCKED PRs all peer-owned Otto-CLI background-worker fire (worktree `jiggly-riding-corbato`). Triaged 40 open PRs; 4 of 6 actionable are claimed by active peer workers (Lior decompose branches + sibling Claude worker on PR #3597). No PR matches the strict `BLOCKED + resolve-threads` criterion in this worker's task packet — threads are gated by failed CI / dirty merge state. Substrate-honest move: leave the branches to their owners + write visibility-shard so other Otto-CLI fires don't redundantly survey. Composes with: - `.claude/rules/holding-without-named-dependency-is-standing-by-failure.md` (named dependencies present; not Standing-by) - `.claude/rules/claim-acquire-before-worktree-work.md` (PR-thread- resolution is DOES-NOT-APPLY but worktree contention still applies) - B-0519 (multi-Otto-branch-state-contamination RCA) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(tick 0044Z): correct thread-carrying PR list + link rule path - Reconcile bullet with table: 4 thread-carrying PRs are 3597 / 3633 / 3621 / 3520 (not 3643 — that has 0 threads). 3621 is the parent of 3633 on `feat/b0449-b0460-subscribe`, not itself a Lior decompose branch. - Link `.claude/rules/claim-acquire-before-worktree-work.md` and the B-0519 RCA row to their full repo-relative paths per the rule-link convention from tick 2026-05-15T0503Z. Caught by Copilot review threads on PR #3649. Co-Authored-By: Claude <noreply@anthropic.com> * fix(tick 0044Z): correct relative link depth for .claude/rules and backlog/P3 paths Per Codex review threads on PR #3649: - Line 34: claim-acquire rule link needed 6 ../ to reach repo root from docs/hygiene-history/ticks/2026/05/16/ (was 5; resolved to docs/.claude/...) - Line 36: B-0519 backlog link needed 5 ../ (was 4; resolved to docs/hygiene-history/backlog/... which does not exist) Both links now resolve correctly from the nested tick path. * shard(0044Z): address Copilot review on PR #3649 Two findings from copilot-pull-request-reviewer: 1. Action-classification inconsistency: the shard said PRs #3633 and #3643 "both fail an identical set of 5 lint checks" while the table classified them with different nextActions (#3633→fix-failed-checks, #3643→wait-ci). Clarify the 5 lints are NON-required and explain the action divergence comes from independent required-check state, not from the shared lint signature. 2. Bare filename `refresh-world-model-poll-pr-gate.md` → full path link `[.claude/rules/refresh-world-model-poll-pr-gate.md](../../../../../../.claude/rules/refresh-world-model-poll-pr-gate.md)` matching the link convention used elsewhere in this shard. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(markdownlint): prefix bare #3643 with "PR" to avoid MD018 Line 51 starts with `#3643` after wrap, which markdownlint reads as an atx heading without space (MD018/no-missing-space-atx). Prefixing with "PR " makes it a normal sentence continuation and matches the "PR #NNNN" style used elsewhere in the shard. Unblocks the required `lint (markdownlint)` check on PR #3677. Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…ependabot PRs merged (#3798, #3810) (#3909) * shard(tick): 2026-05-16T16:43Z otto-bg-worker — catch-43 re-arm + 2 dependabot PRs merged Cron sentinel was missing at session start; re-armed via CronCreate with <<autonomous-loop>> sentinel (job 3933804e). Two dependabot PRs went green-to-merge once auto-merge was armed: - #3798 codeql-action 4.35.4→4.35.5 (merged fe6c4e9) - #3810 FSharp.Core + 3 others NuGet bump (merged 16:42:27Z) Both had required checks green; only non-required lint was non-success (cancelled on #3798, archive-header §33 mis-fire on #3810). GraphQL traversed Normal→Cost-aware→Extreme cost-aware (1073→645 in single tick). poll-pr-gate-batch.ts on 10 PRs costs ~N×35 GraphQL — documented in the shard. Other polled PRs (#3545, #3597, #3599, #3633, #3643, #3670, #3714, #3813) deferred to claim-owner per honor-those-that-came-before + claim-acquire-before-worktree-work disciplines. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(shard-1643z): correct filename reference 06:42Z.md → 0642Z.md (Copilot finding) --------- Co-authored-by: Claude <noreply@anthropic.com>
Implements backlog row B-0501 (Slice 5a of the B-0441 notifier). We now properly check assignment history to avoid re-assigning the same row multiple times within a configurable cooldown period (default 30 mins). It uses atomic renames via
/tmp/zeta-busfor concurrent-safe history tracking.