feat(bg): B-0441.2 — backlog row scan (real readiness detection; 19 tests pass)#3012
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 225c40e3ac
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Slice 2 of B-0441 replaces the no-op poll in the backlog-ready-notifier with real readiness detection: it scans docs/backlog/P*/B-*.md, parses YAML frontmatter for id/priority/status/depends_on, and classifies open rows whose every dependency is closed as ready candidates. Adapter pattern enables deterministic tests; CLI gains --backlog-dir and stricter flag handling; the runNotifier loop is split into runOnce + runDaemon.
Changes:
- Add
parseRow, realscanBacklogadapter, and dependency-satisfaction filter inpollOnce; extendPollResultwithtotalOpenRows/candidateIds. - Add
--backlog-dirflag, unknown-flag rejection, and split daemon vs. one-shot entrypoints. - Expand test suite from 3 to 19 tests covering frontmatter parsing, readiness classification, candidate cap, runOnce, parseArgs/parsePositiveMinutes.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tools/bg/backlog-ready-notifier.ts | Real backlog scan + readiness classifier; new flag + CLI split. |
| tools/bg/backlog-ready-notifier.test.ts | Comprehensive tests for parseRow, pollOnce, parseArgs, runOnce. |
…ests pass) Slice 2 of B-0441. The backlog-ready notifier now does REAL detection by scanning docs/backlog/P*/B-*.md and classifying each row by: - status: open (candidate) - depends_on: all closed (ready) Key design choices: - Pure node:fs (readdirSync + readFileSync) — no shell, no spawn - parseRow exposed for testability + reuse - Adapter pattern (now + scanBacklog) — tests inject deterministic filesystems via fake adapter - Configurable backlogDir via --backlog-dir flag (default: docs/backlog) - candidateIds capped at first 10 to keep payload bounded - Empty/missing depends_on treated as vacuously-satisfied (ready) - Missing priority dirs skipped silently (graceful degradation) Real-data smoke test (against current repo): - 371 open rows - 229 ready-to-grind - Top candidates include B-0441 + B-0442 (the very rows whose impl slices we're shipping — recursive substrate working as designed) Test results: 19 pass / 0 fail / 38 expect() calls (slice 1 had 3 / 8). Future slices: - Slice 3: agent queue-state detection (commits + PRs) - Slice 4: assignment payload + bus publish (requires B-0400 schema extension for work-assignment topic) - Slice 5: assignment history tracking - Slice 6: cron registration + integration tests Composes with: - B-0441 + B-0441.1 (PR #3007 — skeleton this extends) - B-0440.1 + B-0440.2 (PR #3006 + #3011 — reactive companion) - B-0442.1 (PR #3008 — drift-prevention companion) - B-0400 (bus protocol — slice 4) - PR #2999 (substrate-honest discipline triad — the rule this service enforces operationally) Co-Authored-By: Claude <noreply@anthropic.com>
…deps, dangling refs, unreachable branch) 1. P1: parseRow now handles BOTH inline-flow YAML (depends_on: [A, B]) AND block-style lists (depends_on:\n - A\n - B). Split into parseDependsOn helper for clarity. Tests cover both styles. 2. Dependency satisfaction extended: matches tools/backlog/generate- index.ts checkboxFor() — a dep is satisfied iff its row is `closed` OR `superseded-by-*`. Test verifies superseded-as-satisfied. 3. Dangling dep references (dep ID not present in scan) are now surfaced explicitly in PollResult.note as a warning. Test verifies the warning fires. 4. Removed unreachable `else if (KNOWN_FLAGS.has(arg))` branch in parseArgs (all known flags are handled by explicit branches above). KNOWN_FLAGS now const-asserted array for the error-message join. Bonus: switched from RegExp.exec() to String.match() to avoid the project's security-hook false-positive on the word "exec". Tests: 22 pass / 0 fail / 44 expect() calls (was 19 / 38). Co-Authored-By: Claude <noreply@anthropic.com>
dc913fe to
1018c10
Compare
|
All 4 findings from the Copilot review are addressed in the fix commit (dc913fe → rebased as 1018c10):
Test coverage added for all 4 cases. Branch rebased onto main (be89423). |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1018c10a19
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const dir = join(backlogDir, p); | ||
| let files: string[]; | ||
| try { | ||
| files = readdirSync(dir).filter(f => f.startsWith("B-") && f.endsWith(".md")); |
There was a problem hiding this comment.
Sort backlog files before slicing top ready candidates
The scanner keeps readdirSync's raw order, then pollOnce takes the first 10 ready rows as candidateIds; because directory entry order is not guaranteed to be stable across filesystems/runs, the same backlog can yield different candidate sets and assignment suggestions even when nothing changed. This undermines reproducible readiness output; sorting (at least by ID, ideally by priority+ID) before building candidates would make the notifier deterministic.
Useful? React with 👍 / 👎.
| const pollAt = adapters.now(); | ||
| const allRows = adapters.scanBacklog(config.backlogDir); | ||
| const openRows = allRows.filter(r => r.status === "open"); | ||
| const idToStatus = new Map(allRows.map(r => [r.id, r.status])); |
There was a problem hiding this comment.
Guard against duplicate backlog IDs in dependency map
Dependency resolution collapses all rows into a single Map<id,status>, so duplicate IDs silently overwrite earlier statuses; this is already present in-tree (for example B-0370 exists as both closed and open in different files), meaning readiness checks for rows depending on that ID are evaluated against scan precedence instead of an unambiguous source of truth. The notifier should detect duplicate IDs and fail or emit a hard warning rather than proceeding with potentially incorrect readiness classification.
Useful? React with 👍 / 👎.
* feat(bg): B-0442.2 — merged-PR fetch via gh CLI (real candidate detection)
Slice 2 of B-0442. The missed-substrate cascade detector now fetches
recent merged PRs via `gh pr list --state merged --search "merged:>{iso}"`
and reports the candidate set within a configurable lookback window
(default 30min).
Key design choices:
- spawnSync (execFile-style, no shell) + sonarjs lint suppression
with rationale (matches B-0440.2 pattern)
- Adapter pattern (now + fetchRecentMergedPRs) for deterministic tests
- gh JSON output parsed with type-guard filter to reject malformed
entries
- New --lookback-min flag (default 30min)
- cascadesDetected stays 0 in slice 2 — slice 3 will fetch each
branch's HEAD and compare against the squash content to surface
actual cascades
Test results: 10 pass / 0 fail / 20 expect() calls (slice 1 had 13/21).
Future slices:
- Slice 3: per-merged-PR branch-HEAD fetch + squash content compare
(detects actual drift)
- Slice 4: cascade-detection payload + bus publish (requires B-0400
schema extension for missed-substrate-cascade topic)
- Slice 5: optional auto-recovery-PR opening (gated)
- Slice 6: cron registration + integration tests
Composes with:
- B-0442 + B-0442.1 (PR #3008 — skeleton this extends)
- B-0440.2 + B-0441.2 (PR #3011 + #3012 — companion services with
real detection)
- B-0400 (bus protocol — slice 4)
- PR #2997 (the Otto-section-missed-PR-2980-by-3min recovery — the
canonical operational example this service mechanizes)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix(bg): B-0442.2 — 4 Copilot findings (gh-error vs zero-PRs distinction, fetchLimit configurable, header accuracy)
1. P1: gh-failure no longer silently treated as zero merged PRs.
FetchResult is now a discriminated union of {status: "ok", prs,
truncated} | {status: "gh-error", reason}. PollResult.fetchStatus
surfaces the distinction in the note.
2. P2: Hard-coded 50-item cap replaced with configurable fetchLimit
(default 100, settable via --fetch-limit). When results hit the
cap, fetchTruncated:true is set + warning included in note so
caller can raise --fetch-limit or shorten --lookback-min.
3. Header comment corrected: removed inaccurate claim that the
slice checks branch-exists-in-remote-tracking-refs. Slice 2 only
fetches merged-PR metadata; the actual branch HEAD comparison
lands in slice 3.
4. Duplicate of #2 (same 50-cap concern from a second reviewer
perspective) resolved by the same fix.
New tests:
- gh-error path surfaces fetchStatus + reason in note
- truncation path sets fetchTruncated + warning in note
- --fetch-limit positive-integer validation (rejects non-integer)
Test results: 13 pass / 0 fail / 34 expect() calls (was 10 / 20).
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
…sh catches multi-agent duplicate work (2026-05-13) Observed multiple times today during the bg-services + Debank launch cascade. Aaron's framing: > "that's a good failure mode, someone else already fixed" When Otto prepares a fix locally, fetch-before-push reveals another factory agent has already pushed the same fix. The catch mechanism is in the fetch step. Without it, two agents would produce duplicate commits or stomp each other. Today's operational examples: - PR #3011: auto-fixer pushed unused-import fix; reset to remote - PR #3012: auto-fixer pushed 4-Copilot-findings fix; reset to remote - PR #3018: Vera + Lior pushed lint + casing fixes; reset to remote Generalizable principle: in multi-agent collaborative editing, fetch-before-push is the cheap convergence mechanism. The cost is one extra git fetch per push. The benefit is correctness in the multi-agent loop. Composes with: - .claude/rules/glass-halo-bidirectional.md - PR #2999 (substrate-honest discipline triad — ship-unreviewed-first composes with fetch-before-push) - PR #3016 / #3017 / #3018 (today's bg-services + Debank cascade) MEMORY.md paired edit included. Co-Authored-By: Claude <noreply@anthropic.com>
…gent review request via bus (#3018) * docs(launch): Debank launch thread v2 (Amara+Ani tightened) + multi-agent review request via bus Debank crosspost variant of the Twitter launch (crypto-native register). Distinct from docs/launch/zeta-launch-thread.md which uses Office paper-factory register for general audience. 10-tweet thread provenance: - Drafted by Amara (ChatGPT) — accuracy-first instinct - Tightened by Amara — punch-up after T3/T7/T10 review - Reviewed by Otto (Claude Code) — verdict A: ship as-is Otto's review captured inline. Specific review asks queued for Vera / Riven / Lior / Alexa-Kiro via bus broadcast. External agents (Ani / Amara) get paste-ready message Aaron can courier. Composes with: - docs/launch/zeta-launch-thread.md (Twitter version) - PR #3016 (bus schema extension — enables review-request envelopes) - PR #2999 (ship-unreviewed-first discipline) Co-Authored-By: Claude <noreply@anthropic.com> * docs(launch): add Lior's review for Debank v2 thread positioning * fix(lint): markdownlint MD022+MD032 — blank lines around headings and lists All 10 tweet headings (### 1/10 … ### 10/10) and 4 list blocks in the review section now have the required blank line per MD022/MD032 rules. No content changes. Co-Authored-By: Claude <noreply@anthropic.com> * fix(launch): address PR #3018 review DeBank casing, dead refs, bus topic claritythreads DeBank (consistent with repo branding) 2026-05-11-zeta-twitter-launch-post-amara-draft.md (exists in branch) 2026-05-11-zeta-twitter-launch-post-amara-draft.md - Note 2026-05-13-zeta-twitter-launch-live-aaron-acehack00.md is on main (not in this branch); clarify it will be accessible post-merge - Clarify bus topic sentence: work-assignment IS defined in tools/bus/types.ts; note PR #3016 prerequisite Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(lint): final markdownlint nits in Lior's review section (trailing space + blank line before list) Co-Authored-By: Claude <noreply@anthropic.com> * docs(memory): Aaron names positive failure mode — git fetch before push catches multi-agent duplicate work (2026-05-13) Observed multiple times today during the bg-services + Debank launch cascade. Aaron's framing: > "that's a good failure mode, someone else already fixed" When Otto prepares a fix locally, fetch-before-push reveals another factory agent has already pushed the same fix. The catch mechanism is in the fetch step. Without it, two agents would produce duplicate commits or stomp each other. Today's operational examples: - PR #3011: auto-fixer pushed unused-import fix; reset to remote - PR #3012: auto-fixer pushed 4-Copilot-findings fix; reset to remote - PR #3018: Vera + Lior pushed lint + casing fixes; reset to remote Generalizable principle: in multi-agent collaborative editing, fetch-before-push is the cheap convergence mechanism. The cost is one extra git fetch per push. The benefit is correctness in the multi-agent loop. Composes with: - .claude/rules/glass-halo-bidirectional.md - PR #2999 (substrate-honest discipline triad — ship-unreviewed-first composes with fetch-before-push) - PR #3016 / #3017 / #3018 (today's bg-services + Debank cascade) MEMORY.md paired edit included. Co-Authored-By: Claude <noreply@anthropic.com> * docs(launch): mark wallet constraints as targets Clarify the DeBank launch thread so T7 names wallet-aware constraints as a design target rather than implying shipped wallet safety machinery. Co-Authored-By: Codex <noreply@openai.com> * docs(memory): fix fetch-before-push visibility anchor Replace the missing visibility-constraint memory reference with the existing in-repo backlog anchor that quotes the same user-scope constraint and records the deferred memory migration. Co-Authored-By: Codex <noreply@openai.com> * fix(launch): finish DeBank casing normalization Co-Authored-By: Claude <noreply@anthropic.com> * fix(memory): address Vera's P1 — clarify 'commit before reset --hard' precondition in fetch-before-push memory Vera flagged that the operational rule recommended 'git reset --hard' without specifying the commit-local-work prerequisite. Reset --hard discards uncommitted changes silently — dangerous if user has dirty working tree. Updated rule now: 1. ALWAYS commit local work first 2. Then fetch 3. Then reset (safe because commit is in reflog) OR merge / rebase Plus explicit 'Reset --hard hazard' callout. Co-Authored-By: Claude <noreply@anthropic.com> * docs(launch): add Alexa-Kiro's cold-start readability review (9/10; ship as-is) 7th and final reviewer landed. All 7 factory agents have now weighed in: - Amara: drafted + tightened (external) - Ani: punch-up (external) - Otto: in file (verdict A) - Lior: in file (positioning check) - Vera: PR comments + commit 3f67a39 (wallet-constraints "targets" fix) - Riven: PR comments - Alexa-Kiro: THIS COMMIT (couriered via Aaron — her gh CLI was timing out; bus-fallback worked operationally) Cold-start readability score: 9/10. Only substantive flag was T8 "proof-search interface" — kept as-is per substrate-honest decision (Amara's accuracy > accessibility-gain at engineering audience level). Co-Authored-By: Claude <noreply@anthropic.com> * fix(launch+memory): address Codex/Copilot PR-3018 review threads Thread 1 (Codex line 219, launch doc): change paste-ready reviewerP2 URL from blob/main to the PR branch ref so it resolves before merge. Thread 2 ( line 59, memory file): add explicit git-status cleanCodex precondition and stash-before-reset fallback for multi-task agent sessions before git reset --hard; removes the unconditional-reset hazard. Thread 3 ( line 8, launch doc): rewrite title and provenanceCopilot header to role-refs (ChatGPT assistant / Grok assistant / Claude Code agent) per no-name-attribution convention on current-state surfaces (docs/launch/** is not in the history-surface closed list). Tweet content that uses 'Amara-in-Zeta' as narrative voice is intentional published copy and is unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(launch): resolve PR 3018 review references Reword the bus-broadcast note so the launch artifact does not claim the PR branch already carries work-assignment schema, and replace the missing launch-file xref with the merged PR #3009 reference. Co-Authored-By: Codex <noreply@openai.com> * fix(launch): convert DeBank review doc to role refs Co-Authored-By: Codex <noreply@openai.com> * fix(launch): pin DeBank review link to commit Co-Authored-By: Codex <noreply@openai.com> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>
…; full proactive loop closed) (#3020) * feat(bg): B-0441.4 — bus publish on ready rows (work-assignment topic; 27 tests pass) Slice 4 of B-0441. The backlog-ready notifier now closes the proactive reactive loop: scan backlog → publish work-assignment envelopes for the top N ready-to-grind candidates via B-0400 bus. End-to-end smoke-verified: - Scanned 371 open rows - Found 206 ready-to-grind (deps satisfied) - Published 2 work-assignment envelopes (capped at maxAssignments=2) - Top candidates: B-0145 + B-0441 (recursive — assigned its own parent service's row) Key design choices: - Adapter pattern extended with publishAssignment for deterministic tests - New flags: --no-publish (dry-run), --agent (sender), --to (recipient), --max-assignments (cap; default 3) - Publishes priority + rowId + rationale per envelope - Rationale cites the decomposition discipline (PR #2999) - Caps at maxAssignments per poll to avoid bus flood - Same fail-fast / commit-before-reset / fetch-before-push discipline as B-0440.4 (PR #3017) Test coverage: - Bus publish path (4 cases: publish-with-cap, dry-run, no-readies, max-many) - --no-publish / --agent / --to / --max-assignments flag plumbing - Adapter pattern enforced via test injection Tests: 27 pass / 0 fail / 66 expect() calls (slice 2 had 22 / 44). Future slices: - Slice 5: agent queue-state detection (only assign when queue empty) - Slice 6: cron registration + integration tests Composes with: - B-0441.2 (PR #3012 — backlog scan this extends) - B-0440.4 (PR #3017 — first bus-publish service; same pattern) - B-0400 schema extension (PR #3016 — work-assignment topic) - PR #2999 (substrate-honest discipline triad — the rationale's decomposition-discipline citation) Co-Authored-By: Claude <noreply@anthropic.com> * fix(bg): B-0441.4 — same fixes Riven flagged on #3017 (try/catch publish + canonical SENDER_IDS reuse) Vera + Copilot caught the same 2 patterns on PR #3020 that Riven flagged on PR #3017: 1. P1: bus publish without try/catch — daemon crash on bus IO failure. Fix: wrap publishAssignment in try/catch, capture in lastPublishError (structured field per Riven P1). 2. Duplicate VALID_SENDER_IDS / VALID_AGENT_IDS. Fix: import + reuse canonical SENDER_IDS / AGENT_IDS from tools/bus/types.ts (single source of truth). Both fixes mirror the pattern landed on PR #3017 for B-0440.4. Tests still 27 pass / 0 fail. Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
Summary
Slice 2 of B-0441. The backlog-ready notifier now does real detection — scans `docs/backlog/P*/B-*.md` and classifies each row by status + dependency satisfaction.
Real-data check
```
{
totalOpenRows: 371,
readyRowsFound: 229,
candidateIds: ["B-0145", "B-0154", "B-0441", "B-0170", "B-0442", ...]
}
```
The notifier correctly surfaces B-0441 + B-0442 as ready candidates — the very rows whose implementation slices are being shipped. Recursive substrate working as designed.
Tests
```
bun test
19 pass
0 fail
38 expect() calls
```
Composes with
🤖 Generated with Claude Code