
backlog(B-0159): refresh-github-worldview cross-cutting refresh script (Claude.ai 2026-05-01)#1173

Merged
AceHack merged 5 commits into main from otto/B-0159-refresh-github-worldview-2026-05-01
May 1, 2026

Conversation

@AceHack
Member

@AceHack AceHack commented May 1, 2026

Files the unified refresh script as a P1 backlog row per Claude.ai's calibrated hand-off shape (don't context-switch mid-PR-cycle; flow through standard claim protocol).

Composes with PR #1171 (refresh-before-decide memo + verbatim packet) — this row is the actionable extraction.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 1, 2026 22:00
@AceHack AceHack enabled auto-merge (squash) May 1, 2026 22:00

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8fdebd8e58

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


Copilot AI left a comment


Pull request overview

Adds a new P1 per-row backlog entry (B-0159) proposing a unified “refresh GitHub worldview” script to provide cross-cutting repo/PR state, complementing the existing per-PR poll-pr-gate-batch.ts workflow.

Changes:

  • Introduces a new backlog row documenting the problem (narrow refresh scope) and an implementation plan for a cross-cutting refresh script.
  • Specifies intended interface, output discipline (two-layer print), fixture strategy, and acceptance criteria for the future tool.

AceHack added a commit that referenced this pull request May 1, 2026
Codex P1 + Copilot duplicate finding on PR #1173: backlog rows require frontmatter (id, priority, status, title, created, last_updated) per tools/backlog/README.md schema. The .github/workflows/backlog-index-integrity.yml gate blocks PRs touching docs/backlog/** without it.

Fix: add YAML frontmatter matching the canonical shape used by sibling P1 rows (B-0156 style). Includes depends_on: [B-0156] since the unified refresh script is part of the broader TS-standardization trajectory.

Two additional Copilot dangling-pointer findings on the same PR:
- docs/research/2026-05-01-claudeai-backlog-driven-dual-pm-loop-with-refresh-discipline.md
- memory/feedback_refresh_before_decide_invariant_two_layer_print_dx_claudeai_2026_05_01.md

Both were dangling at PR-open time because they were on PR #1171 which had not merged. PR #1171 has now merged to main; rebasing this branch resolves both. The branch is now up-to-date with origin/main and the cross-references resolve.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 1, 2026 22:06
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8bb735e6d6



Copilot AI left a comment


Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.


AceHack and others added 5 commits May 1, 2026 18:13
…t (Claude.ai 2026-05-01)

Claude.ai's calibrated follow-up to the refresh-before-decide packet identified Otto's narrow-refresh failure mode empirically:

- 5:32pm refresh: 4 PRs
- 5:37pm refresh: same 4 PRs
- 5:40pm-5:50pm refresh: 2-3 PRs across 6 consecutive ticks
- 5:50pm: PR #1170 'appeared out of nowhere' because prior 4 ticks' refresh scope didn't include it

The narrow-refresh pattern hides cross-cutting state changes — PRs from other harnesses, auto-merge cascades, backlog deltas, claim file inventory, recent merges, branch state, pending CI runs.

`poll-pr-gate-batch.ts` is correctly scoped per its design (per-PR detail). The gap is the cross-cutting view.

This row specs `tools/refresh-github-worldview/refresh.ts` — supersets poll-pr-gate-batch (calls it internally for per-PR detail), adds 5+ cross-cutting queries, two-layer print discipline, DST-grade-A test coverage with fixtures.

Composes with refresh-before-decide memo (PR #1171), poll-pr-gate-batch (PR #1153), SQLSharp DI pattern memo (PR #1155).

Phase 2+ deferred (Mirror/Beacon ratio gate, 22 named failure modes, DST scenario suite, pre-DORA metrics, dual-PM mode-selection) — each is its own future row.

Per Claude.ai's caution: this row is the filing-not-implementation step. Don't context-switch mid-PR-cycle; let it flow through standard claim protocol when queue is quiet.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…umptions, perf target, tiered fallback (2026-05-01)

Deepseek follow-up review identified 7 critical refinements:

(1) Delta-diff over current-state-dump — Otto's deeper failure isn't refresh-not-run, it's refresh-not-integrated. Saw 27 open PRs at 5:37pm, noted count, moved on. Snapshot persistence at .zeta/refresh-snapshot.json enables actual deltas.

(2) Provenance per PR — self / peer-call / maintainer / unknown. Computed mechanically from author. Unknowns are highest-priority signal.

(3) stale_assumptions field — most operationally valuable. Refresh reports surprises, not just state. 'PR #X expected to merge by now, why hasn't it?'

(4) Single JSON with summary field — match Otto's existing pattern (poll-pr-gate-batch.ts produces both layers in one output). Maintainer reads same JSON Claude reads; mismatch debuggable at boundary.

(5) Performance target <5s typical tick + tiered fallback (degrade to poll-pr-gate-batch + 'stale at <timestamp>' if exceeded). Prevents bottleneck.

(6) Backlog-row delta as git-derived (git diff --name-only HEAD~1 HEAD -- docs/backlog/), not frontmatter timestamps. Avoids B-0098-class metadata drift.

(7) Recent merges via git log + post-hoc author bucketing. Unknown-author = highest priority.
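
Refinements (1), (2), and (7) above can be sketched together as a pure snapshot diff. This is a minimal sketch, not the row's actual design: the `Snapshot` and `Delta` shapes and the `AuthorBuckets` lists are hypothetical. Only the four provenance labels and the idea of diffing against a persisted snapshot come from the commit message.

```typescript
// Hypothetical shapes: the source names the provenance buckets and a persisted
// snapshot file, but does not define a schema.
type Provenance = "self" | "peer-call" | "maintainer" | "unknown";

interface PrEntry {
  number: number;
  author: string;
  state: "open" | "merged" | "closed";
}

interface Snapshot {
  takenAt: string; // ISO-8601 timestamp of the refresh
  prs: PrEntry[];
}

interface Delta {
  appeared: Array<PrEntry & { provenance: Provenance }>;
  disappeared: number[];
  stateChanged: Array<{ number: number; from: string; to: string }>;
}

// Assumed author lists; a real tool would load these from configuration.
interface AuthorBuckets {
  self: string[];
  peerCall: string[];
  maintainer: string[];
}

// Provenance is computed mechanically from the author login.
function classify(author: string, b: AuthorBuckets): Provenance {
  if (b.self.includes(author)) return "self";
  if (b.peerCall.includes(author)) return "peer-call";
  if (b.maintainer.includes(author)) return "maintainer";
  return "unknown";
}

// Compare the previous snapshot with the current one and report what changed.
function diffSnapshots(prev: Snapshot, curr: Snapshot, b: AuthorBuckets): Delta {
  const before = new Map(prev.prs.map((p) => [p.number, p] as const));
  const after = new Map(curr.prs.map((p) => [p.number, p] as const));
  const delta: Delta = { appeared: [], disappeared: [], stateChanged: [] };

  for (const pr of curr.prs) {
    const old = before.get(pr.number);
    if (old === undefined) {
      delta.appeared.push({ ...pr, provenance: classify(pr.author, b) });
    } else if (old.state !== pr.state) {
      delta.stateChanged.push({ number: pr.number, from: old.state, to: pr.state });
    }
  }
  for (const pr of prev.prs) {
    if (!after.has(pr.number)) delta.disappeared.push(pr.number);
  }

  // Unknown authors sort first because they are the highest-priority signal.
  delta.appeared.sort(
    (x, y) => Number(y.provenance === "unknown") - Number(x.provenance === "unknown"),
  );
  return delta;
}
```

Keeping the diff pure means the persisted snapshot can be loaded and saved by a thin wrapper, while fixtures exercise the comparison logic with no live `gh` or `git` calls.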

Plus refresh frequency recommendations + composing with poll-the-gate / manufactured-patience / never-idle / pre-commit lint for refresh artifacts.
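
One way to read refinement (5) is a deadline check between cross-cutting collectors, degrading to per-PR detail plus a staleness note once the budget is spent. The sketch below assumes synchronous collectors and an injected clock so the behaviour can be tested deterministically. `tieredRefresh`, `TieredResult`, and the `lastFullAt` parameter are hypothetical names, and reading "stale at <timestamp>" as the time of the last complete refresh is an assumption.

```typescript
// Millisecond clock, injected so tests can simulate slow collectors.
type Clock = () => number;

interface TieredResult {
  tier: "full" | "degraded";
  perPr: unknown; // stand-in for poll-pr-gate-batch output
  crossCutting: Record<string, unknown>;
  staleNote?: string;
}

function tieredRefresh(
  perPr: () => unknown,
  collectors: Record<string, () => unknown>,
  budgetMs: number,
  clock: Clock,
  lastFullAt: string, // assumed: timestamp of the last complete refresh
): TieredResult {
  const start = clock();
  // Per-PR detail always runs; it is the fallback tier.
  const perPrData = perPr();
  const crossCutting: Record<string, unknown> = {};

  for (const [name, collect] of Object.entries(collectors)) {
    // Check the budget before starting each collector.
    if (clock() - start > budgetMs) {
      // Partial cross-cutting data is discarded so consumers never mistake
      // an incomplete view for a complete one.
      return {
        tier: "degraded",
        perPr: perPrData,
        crossCutting: {},
        staleNote: `stale at ${lastFullAt}`,
      };
    }
    crossCutting[name] = collect();
  }
  return { tier: "full", perPr: perPrData, crossCutting };
}
```

Discarding partial results is a design choice in this sketch; a variant could keep the collectors that did finish and mark only the rest as stale.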

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex P1 + Copilot duplicate finding on PR #1173: backlog rows require frontmatter (id, priority, status, title, created, last_updated) per tools/backlog/README.md schema. The .github/workflows/backlog-index-integrity.yml gate blocks PRs touching docs/backlog/** without it.

Fix: add YAML frontmatter matching the canonical shape used by sibling P1 rows (B-0156 style). Includes depends_on: [B-0156] since the unified refresh script is part of the broader TS-standardization trajectory.

Two additional Copilot dangling-pointer findings on the same PR:
- docs/research/2026-05-01-claudeai-backlog-driven-dual-pm-loop-with-refresh-discipline.md
- memory/feedback_refresh_before_decide_invariant_two_layer_print_dx_claudeai_2026_05_01.md

Both were dangling at PR-open time because they were on PR #1171 which had not merged. PR #1171 has now merged to main; rebasing this branch resolves both. The branch is now up-to-date with origin/main and the cross-references resolve.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ayer split (Aaron + 4 peer reviewers 2026-05-01)

Two artifacts:

(1) docs/research/2026-05-01-peer-ai-followup-reviews-on-b-0159-refresh-script.md — verbatim preservation of 4 peer-AI reviews (Ani / Alexa / Gemini / Amara) per substrate-or-it-didn't-happen + GOVERNANCE §33 trigger (multi-AI review packet preserve verbatim FIRST). Includes carved blades, cross-peer convergence (4/4 agree on aggregator-not-replacement + two-layer output + don't-context-switch + compose-with-existing-disciplines), and divergence preserved as alternatives (tool naming, snapshot persistence path).

(2) B-0159 backlog row updated with two architectural decisions:

   ARCHITECTURE — TWO-LAYER GIT-NATIVE + GITHUB-API SPLIT (Aaron 2026-05-01 calibration of Amara's repo-state rename):
   - Layer 1: tools/repo-state/repo-state.ts — git-native, portable across hosts. Pure git ops + filesystem (backlog, claims, branch state, dirty flag).
   - Layer 2: tools/github/github-state.ts — wraps repo-state + poll-pr-gate-batch + adds GitHub API (PRs, CI, threads, reviews, workflows). GitHub-native.
   - Composes with feedback_git_native_vs_github_native_plural_host_pluggable_adapters_2026_04_23.md and feedback_first_class_for_us_not_for_our_host_*.

   PEER-AI CONSOLIDATED REQUIREMENTS:
   - Ani: idempotency + fail-closed (exit 10 on dirty/rebase), --raw flag, noise filter
   - Alexa: success criteria per phase, staleness detection, rollback procedures, cross-harness coordination, performance benchmarking
   - Gemini: macro/micro framing, strict sequence (don't context switch)
   - Amara: aggregator-not-replacement, flow metrics, unknown/unavailable per-source states, modular collectN() functions, persisted snapshot at .state/<layer>/last.json, --since/--write-state flags
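
The layer split and Amara's per-source states can be sketched as follows. This is an illustration under stated assumptions, not the contents of `repo-state.ts` or `github-state.ts`: the `GitDeps` and `GithubDeps` interfaces, the `collect` helper, and every field name are hypothetical. Dependencies are injected so fixtures can replace live `git` and `gh` calls.

```typescript
// Each source reports its own state instead of failing the whole refresh.
type SourceState<T> =
  | { status: "ok"; value: T }
  | { status: "unavailable"; reason: string } // queried, but the query failed
  | { status: "unknown" }; // not queried at all

// Wrap one collector so a thrown error becomes an "unavailable" state.
function collect<T>(read: () => T): SourceState<T> {
  try {
    return { status: "ok", value: read() };
  } catch (err) {
    return { status: "unavailable", reason: err instanceof Error ? err.message : String(err) };
  }
}

// Layer 1: git-native, no GitHub knowledge.
interface GitDeps {
  currentBranch(): string;
  isDirty(): boolean;
  backlogFiles(): string[];
}

interface RepoState {
  branch: SourceState<string>;
  dirty: SourceState<boolean>;
  backlog: SourceState<string[]>;
}

function collectRepoState(git: GitDeps): RepoState {
  return {
    branch: collect(() => git.currentBranch()),
    dirty: collect(() => git.isDirty()),
    backlog: collect(() => git.backlogFiles()),
  };
}

// Layer 2: wraps Layer 1 and adds GitHub-specific sources.
interface GithubDeps {
  openPrNumbers(): number[];
}

interface GithubState {
  repo: RepoState;
  openPrs: SourceState<number[]>;
}

function collectGithubState(git: GitDeps, gh?: GithubDeps): GithubState {
  const repo = collectRepoState(git);
  // No GitHub adapter supplied: report "unknown" rather than guessing.
  if (gh === undefined) return { repo, openPrs: { status: "unknown" } };
  const api: GithubDeps = gh; // const binding keeps the narrowed type in the closure
  return { repo, openPrs: collect(() => api.openPrNumbers()) };
}
```

Because Layer 2 only adds fields on top of `RepoState`, the git-native layer stays usable on hosts other than GitHub.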

Aaron's calibration in same packet on time-estimation: 'you can't tell time without something like the previous state files you get how long thing took wrong all the time.' My 'X minutes' estimates throughout session are unanchored. Same persisted-snapshot mechanism that enables deltas would also anchor real durations. Filing as separate substrate target for next tick (time-estimation requires external timestamps, not internal feel).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…x P2 + Copilot P1)

Two real findings on PR #1173:

1. Codex P2: `git log --since="..."` after `--` pathspec doesn't filter
   per git-log man page (options must precede pathspecs). Reordered
   to `git log --oneline --diff-filter=A --since="<ts>" -- docs/backlog/`.

2. Codex P2 + Copilot P1 (dup): output-format contradiction —
   row described two separate stdout passes (raw-then-interpretation)
   AND single JSON with `summary` field. Reconciled per Deepseek's
   single-JSON design: `summary` object IS the interpretation layer
   alongside raw arrays in one JSON document, one invocation.
   Mismatch between summary and underlying arrays is the bug class.
   Supersedes the two-pass framing earlier in the row.
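
The reconciled single-JSON design in finding 2 makes a summary/array mismatch mechanically checkable. A minimal sketch, assuming hypothetical field names (`open_prs`, `recent_merges`, `open_pr_count`, `recent_merge_count`); the row itself only states that a `summary` object sits alongside the raw arrays in one document:

```typescript
// One JSON document: raw arrays plus a summary object that interprets them.
interface RefreshDoc {
  open_prs: Array<{ number: number; title: string }>;
  recent_merges: Array<{ sha: string; author: string }>;
  summary: { open_pr_count: number; recent_merge_count: number };
}

// Derive the summary from the arrays so the two layers cannot drift at build time.
function buildDoc(
  open_prs: RefreshDoc["open_prs"],
  recent_merges: RefreshDoc["recent_merges"],
): RefreshDoc {
  return {
    open_prs,
    recent_merges,
    summary: {
      open_pr_count: open_prs.length,
      recent_merge_count: recent_merges.length,
    },
  };
}

// Report every place where the summary disagrees with the underlying arrays.
function summaryMismatches(doc: RefreshDoc): string[] {
  const problems: string[] = [];
  if (doc.summary.open_pr_count !== doc.open_prs.length) {
    problems.push(
      `open_pr_count=${doc.summary.open_pr_count} but open_prs has ${doc.open_prs.length}`,
    );
  }
  if (doc.summary.recent_merge_count !== doc.recent_merges.length) {
    problems.push(
      `recent_merge_count=${doc.summary.recent_merge_count} but recent_merges has ${doc.recent_merges.length}`,
    );
  }
  return problems;
}
```

A consumer that reads the document back from JSON can run `summaryMismatches` at the boundary, which is where the commit message says the mismatch should be debuggable.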

Two earlier dangling-pointer Copilot threads on this PR are now
satisfied by main since #1171 landed
(docs/research/2026-05-01-claudeai-backlog-driven-dual-pm-loop-with-refresh-discipline.md
+ memory/feedback_refresh_before_decide_invariant_two_layer_print_dx_claudeai_2026_05_01.md
both exist on main).
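
The argument-order fix from finding 1 can be captured by building the `git log` arguments as an array and guarding against options that land after the `--` separator. The helper names here are hypothetical; the flags are the ones quoted in the commit message.

```typescript
// Build the reordered command: every option precedes `--`, the pathspec follows it.
function backlogAddedSinceArgs(sinceIso: string): string[] {
  return ["log", "--oneline", "--diff-filter=A", `--since=${sinceIso}`, "--", "docs/backlog/"];
}

// Anything after `--` is treated by git as a pathspec, so an option-looking
// argument there is almost certainly a bug. Return the offenders.
function strayOptionsAfterSeparator(args: string[]): string[] {
  const sep = args.indexOf("--");
  if (sep === -1) return [];
  return args.slice(sep + 1).filter((a) => a.startsWith("-"));
}
```

Passing the array straight to a process-spawning API also avoids shell quoting around the timestamp.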
Copilot AI review requested due to automatic review settings May 1, 2026 22:17
@AceHack AceHack force-pushed the otto/B-0159-refresh-github-worldview-2026-05-01 branch from 00d69b0 to 2dc1f39 Compare May 1, 2026 22:17
@AceHack AceHack merged commit 07d16e3 into main May 1, 2026
24 of 25 checks passed
@AceHack AceHack deleted the otto/B-0159-refresh-github-worldview-2026-05-01 branch May 1, 2026 22:20

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

Comment on lines +191 to +198
| Old name | New name | Layer | Composes |
|---|---|---|---|
| `refresh-github-worldview` | (deprecated — too broad) | — | — |
| (n/a) | `tools/repo-state/repo-state.ts` | Layer 1 (git-native) | git ops + filesystem |
| (n/a) | `tools/github/github-state.ts` | Layer 2 (GitHub) | wraps Layer 1 + poll-pr-gate-batch + gh API |
| `tools/github/poll-pr-gate.ts` | (unchanged) | per-PR primitive | direct gh API |
| `tools/github/poll-pr-gate-batch.ts` | (unchanged) | per-PR-list primitive | wraps poll-pr-gate.ts |

Comment on lines +427 to +439
- [ ] `tools/refresh-github-worldview/refresh.ts` script exists and runs
- [ ] Calls `poll-pr-gate-batch.ts` internally for per-PR detail
- [ ] Adds 5+ cross-cutting queries (full open-PR list, recent merges,
  backlog delta, claim files, branch state)
- [ ] Two-layer print: raw JSON first, interpretation labeled second
- [ ] DST-grade-A test coverage (synthetic deps; no live `gh` calls)
- [ ] At least 3 fixtures covering common scenarios
- [ ] Existing memory files (rebase-decision, BLOCKED-with-green-CI,
  Copilot false-positive) updated to reference the new script as
  canonical refresh
- [ ] CLAUDE.md refresh-before-decide bullet points at the new script
- [ ] Tick-start in autonomous-loop replaced with single unified-refresh
  call
Comment on lines +12 to +43
**YES BABY** — this is *chef's kiss* levels of disciplined hand-off design. You read Otto's actual running state, spotted the exact staleness pattern (narrow refreshes missing new PRs from other harnesses), and proposed a clean superset instead of a destructive replacement. That's rare and correct.

### What's Working Insanely Well

**1. The three-phase hand-off is perfect.** Backlog row first → memory file for the invariant → full doc as research-grade. This respects his claim protocol, avoids mid-tick context-switch, and lets the substrate flow through disciplines he already has. You didn't just dump a 10k-word doc on him. Respect.

**2. You correctly identified the real failure mode.** He's running tight per-PR refreshes (1155, 1163, 1165, 1167…) and only does `gh pr list` ad-hoc. The unified `refresh-github-worldview` that wraps `poll-pr-gate-batch.ts` internally + adds the cross-cutting view (full open queue, backlog deltas, claim inventory, recent merges from other harnesses) is exactly the missing piece. The "PR #1170 appeared out of nowhere" example is gold.

**3. Composition notes are strong.** You explicitly said the new script *calls* poll-pr-gate-batch internally for per-PR detail. That prevents duplication and honors the 5-AI convergent work already done on the fixtures. The "extends, not replaces" framing is the right language.

**4. You caught the subtle scope collision.** The "directory listing is the index, no central index" line in your doc would have conflicted with his existing `poll-pr-gate-batch.ts`. Your proposed revision ("Backlog has no central index — directory listing IS the index. PR/CI/branch state has central index — unified refresh script IS the index") is precise and safe.

**5. Timeline realism + current-PR protection.** You correctly said "let #1168 + #1170 land first." That's the reactive-proactive hygiene you're trying to protect.

### Issues / Opportunities for Hardening

**1. Add explicit idempotency + fail-closed safety to the unified refresh spec.**

> The script MUST be idempotent. If the worktree is dirty or a rebase/cherry-pick is in progress, it MUST fail closed with a clear message ("refresh blocked — dirty worktree or active rebase detected") and exit code 10. It MUST NEVER mutate state.

This protects the exact failure mode you're trying to prevent (context-switch mid-tick).
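
A minimal sketch of that fail-closed precheck, assuming the caller has already captured the output of `git status --porcelain` and the entry names directly under `.git/`. `precheckRefresh` and `WorktreeProbe` are hypothetical names; exit code 10 is the value proposed above. The function only reads its inputs, so it cannot mutate repository state.

```typescript
const EXIT_REFRESH_BLOCKED = 10;

interface WorktreeProbe {
  porcelainStatus: string; // stdout of `git status --porcelain`
  gitDirEntries: string[]; // names of entries directly under .git/
}

interface Precheck {
  exitCode: number;
  message: string;
}

function precheckRefresh(probe: WorktreeProbe): Precheck {
  const reasons: string[] = [];

  // Any porcelain line (modified, staged, or untracked) counts as dirty here.
  if (probe.porcelainStatus.trim() !== "") reasons.push("dirty worktree");

  // git keeps rebase progress in .git/rebase-merge or .git/rebase-apply.
  if (probe.gitDirEntries.some((e) => e === "rebase-merge" || e === "rebase-apply")) {
    reasons.push("active rebase");
  }

  // An in-progress cherry-pick leaves .git/CHERRY_PICK_HEAD behind.
  if (probe.gitDirEntries.includes("CHERRY_PICK_HEAD")) reasons.push("active cherry-pick");

  return reasons.length === 0
    ? { exitCode: 0, message: "ok" }
    : { exitCode: EXIT_REFRESH_BLOCKED, message: `refresh blocked: ${reasons.join(", ")}` };
}
```

Running the same probe twice yields the same result, which is the idempotency property the review asks for.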

**2. Make the two-layer print configurable.**

> `--raw` flag outputs only the machine-readable snapshot (JSON). Default is two-layer. This lets Otto (or future harnesses) consume the raw layer programmatically without the interpretive wrapper.
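
As a sketch of how that flag could gate the output under the two-layer default this review assumes, the function below returns only the JSON when `--raw` is present. The `WorldviewSnapshot` shape, the `render` name, and the interpretation label are hypothetical.

```typescript
// Hypothetical snapshot shape, kept tiny for illustration.
interface WorldviewSnapshot {
  open_prs: number[];
  recent_merges: string[];
}

function render(snapshot: WorldviewSnapshot, argv: string[]): string {
  const raw = JSON.stringify(snapshot, null, 2);

  // --raw: machine-readable layer only, safe to pipe into a JSON parser.
  if (argv.includes("--raw")) return raw;

  // Default: raw layer first, then a clearly labeled interpretation layer.
  const interpretation = [
    "--- interpretation ---",
    `${snapshot.open_prs.length} open PR(s), ${snapshot.recent_merges.length} recent merge(s)`,
  ].join("\n");
  return `${raw}\n${interpretation}`;
}
```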

**3. The "qq222…" spam filter is missing from the spec.** You correctly ignored it in the log, but the unified refresh should have a one-line noise filter so future keyboard-mash doesn't pollute the snapshot.

**4. The backlog row for B-0119 (role-ref cleanup) is still pending.** You filed B-0119 and B-0120 in the ACID cluster, but the actual PR for the other four peer-call scripts (grok.sh, gemini.sh, codex.sh, amara.sh) hasn't been opened. That's the deferred-skill anti-pattern you're trying to kill. Open it as a follow-on after the current PR cycle settles.

**5. Minor: The "ten minutes" estimate is optimistic.** CI wait + reviewer threads on #1168/#1170 could easily be 20–30 minutes. Adjust the hand-off timing note to "after the current PR cycle (likely 20–40 min)".

Comment on lines +45 to +52
**`tools/refresh-github-worldview/refresh.ts`** — a TS+Bun script that
supersets `poll-pr-gate-batch.ts` with cross-cutting state.

Composition (NOT replacement):

- Calls `poll-pr-gate-batch.ts` internally for per-PR detail on the
known-or-newly-discovered PRs.
- Adds cross-cutting queries:
AceHack added a commit that referenced this pull request May 1, 2026
…ckfill (#1174)

22:24Z: Drained 3 PRs (#1173 B-0159 MERGED; #1103 + #1104 armed).
Resolved 7 threads. Fixed force-with-leased verbal pattern in
2 prior shards. Refresh-before-decide via poll-pr-gate-batch
surfaced 27-PR queue.

19:16Z (orphan backfill): Pipe-in-code-span fix wave from prior
wake's untracked working-tree shard. Substrate-or-it-didn't-happen
discipline: orphan tick-history landed even though belated.