absorb: multi-AI feedback on threading + PR-liveness micro-class (Deepseek + Amara, 2026-04-29 packet 2) #815
Changes from all commits
1eebc2c
86ac80b
fa8b381
5853d54
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,160 @@ | ||
| --- | ||
| id: B-0103 | ||
| priority: P2 | ||
| status: open | ||
| title: Computed-metadata-discipline — unified lint consolidating B-0098 + B-0099 + filename-timestamp drift | ||
| tier: factory-hygiene | ||
| effort: M | ||
| ask: Multi-AI synthesis packet 2026-04-29 (Amara filter — promote individual P3 metadata-drift items to single P2) | ||
| created: 2026-04-29 | ||
| last_updated: 2026-04-29 | ||
| composes_with: [B-0098, B-0099, B-0102] | ||
| tags: [ci-lint, factory-hygiene, derived-metadata, manual-drift-class, mechanical-guard, p2-promotion] | ||
| --- | ||
|
|
||
| # Computed-metadata-discipline — unified lint | ||
|
|
||
| The 2026-04-29 session arc surfaced **three** instances of the | ||
| same failure class — agent-authored metadata that drifted from | ||
| derived truth: | ||
|
|
||
| 1. **Tick-ordinal drift** — shard prose claims "twenty-second tick" | ||
| but file order says twenty-first (B-0098). | ||
| 2. **PR-count drift** — shard prose claims "30 PRs total this | ||
| session arc" but git log says 28 (B-0099). | ||
| 3. **Shard-filename-vs-row-timestamp drift** — a shard | ||
| filename timestamp and its row timestamp diverged | ||
| (caught by Codex P1 on PR #809; the shard has since been | ||
| corrected, so quoting its current state here would | ||
| mislead). | ||
|
|
||
| Three instances in one session is enough signal to consolidate | ||
| the family into a single P2 mechanical guard rather than three | ||
| parallel P3 lints. | ||
|
|
||
| ## Canonical rule | ||
|
|
||
| ```text | ||
| Agent-authored metadata must match derived truth. | ||
| If the truth can be computed, compute it or lint it. | ||
| ``` | ||
|
|
||
| ## Examples (the drift-prone metadata claims this lint covers) | ||
|
|
||
| | Claim | Derived from | | ||
| |---|---| | ||
| | filename timestamp (`HHMMZ.md`) | row timestamp's `HH:MM` | | ||
| | tick ordinal ("twenty-second tick") | sorted shard position in directory | | ||
| | session PR total ("30 PRs") | `gh pr list` query or `git log` count | | ||
| | branch base ("based on main") | explicit ref SHA | | ||
| | "this is the Nth fix" | git log count of similar commits | | ||
| | PR head/base SHA claims | `gh pr view --json headRefOid,baseRefOid` | | ||
|
|
||
| ## Boundary — what this lint does NOT apply to (Claude.ai's catch) | ||
|
|
||
| The rule fires only on agent-authored prose claiming | ||
| **exact equivalence with a derivable substrate truth**: | ||
| ordinals, counts, timestamps, SHAs, branch bases, PR states. | ||
|
|
||
| The rule does **not** fire on: | ||
|
|
||
| - Human summaries ("this round produced strong substrate") | ||
| - Interpretations or labels ("the loop has converged on | ||
| steady-state") | ||
| - Subjective qualifiers ("approximate", "roughly") | ||
| - Prose that intentionally summarizes an automatically-derived | ||
| fact rather than mirroring it | ||
|
|
||
| Without this boundary, the lint becomes Goodhart bait: every | ||
| human-readable summary would be compared against literal field | ||
| values and flagged as drift. The boundary preserves prose value | ||
| while catching only **claims of correspondence**. | ||
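As a rough illustration of this boundary, the sketch below extracts a checkable PR count from a prose line and declines to fire on hedged prose. The function name `claimed_pr_count`, the hedge-word list, and the `N PRs total` pattern are assumptions made for illustration (the pattern mirrors the "30 PRs total" example above); none of them come from an existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the exact PR count a prose line claims.
# Returns 1 (prints nothing) when the line is hedged or makes no claim.
claimed_pr_count() {
  local line=$1
  # Assumed hedge words; hedged prose is outside the lint's boundary.
  local hedge_re='(approximate|roughly|about|around)'
  # Assumed claim shape, modeled on "30 PRs total".
  local claim_re='([0-9]+) PRs total'
  if [[ "$line" =~ $hedge_re ]]; then
    return 1
  fi
  if [[ "$line" =~ $claim_re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
    return 0
  fi
  return 1
}
```

A caller would compare the printed number against the derived count and warn only on a mismatch; lines for which the function returns non-zero are left alone.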
|
|
||
| ## Distilled keepers | ||
|
|
||
| ```text | ||
| Events are written. | ||
| Metadata is computed. | ||
| Claims are checked against derived truth. | ||
| ``` | ||
|
|
||
| ## Implementation sketch (single lint, multiple checks) | ||
|
|
||
| The sketch below handles (a) filenames with spaces or special | ||
| characters (NUL-delimited iteration) and (b) the multiple | ||
| legitimate shard-name shapes documented in | ||
| `docs/hygiene-history/ticks/README.md` (`HHMMZ.md`, | ||
| `HHMMZ-NN.md`, `HHMMSSZ-<short-hash>.md`). | ||
|
|
||
| ```bash | ||
| #!/usr/bin/env bash | ||
| # tools/lint/metadata-drift-check.sh | ||
| # Run on PR diffs touching tick-history shards or backlog rows. | ||
| # | ||
| # REQUIRES BASH (not strict POSIX): uses `[[ ... =~ ... ]]`, | ||
| # `BASH_REMATCH`, `read -d ''`, and process substitution | ||
| # `< <(...)`. All four shells in the factory's portability | ||
| # target (macOS bash 3.2 / Ubuntu bash / git-bash / WSL) | ||
| # support these. If a strict POSIX rewrite becomes | ||
| # necessary later (e.g., busybox `ash` runners), use | ||
| # `awk` + `case` instead. | ||
|
|
||
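| # Check 1: filename HHMM matches row timestamp HH:MM. | ||
| # | ||
| # Assumptions not in the original sketch: `warn` prints to stderr, | ||
| # and BASE / HEAD hold the refs being compared. | ||
| warn() { printf 'metadata-drift: %s\n' "$*" >&2; } | ||
| : "${BASE:?set BASE to the base ref}" "${HEAD:?set HEAD to the head ref}" | ||
| # | ||
| # NUL-delimited iteration survives whitespace/newlines in paths. | ||
| # The pathspec is the literal directory because `**` magic is not | ||
| # reliably enabled. `--diff-filter=d` skips deleted shards, which | ||
| # have no first line to read. | ||
| while IFS= read -r -d '' shard; do | ||
|   shard_base=$(basename "$shard" .md) | ||
|   # The README documenting the name shapes lives in this directory. | ||
|   [[ "$shard_base" == README ]] && continue | ||
|   # Accept HHMMZ, HHMMZ-NN, HHMMSSZ-<suffix>, HHMMZ-<short-hash>. | ||
|   if [[ "$shard_base" =~ ^([0-9]{4})([0-9]{2})?Z(-[A-Za-z0-9._-]+)?$ ]]; then | ||
|     filename_hhmm="${BASH_REMATCH[1]}" | ||
|   else | ||
|     warn "$shard: unsupported shard-name shape; cannot extract HHMM" | ||
|     continue | ||
|   fi | ||
|   # Use only the first timestamp on the first line. | ||
|   row_hhmm=$(head -n 1 "$shard" | grep -oE 'T[0-9]{2}:[0-9]{2}' | head -n 1 | tr -d 'T:') | ||
|   [[ "$filename_hhmm" == "$row_hhmm" ]] || warn "$shard: filename $filename_hhmm vs row $row_hhmm" | ||
| done < <(git diff --name-only -z --diff-filter=d "$BASE..$HEAD" -- docs/hygiene-history/ticks/) | ||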
|
|
||
|
|
||
| # Check 2 — claimed ordinal matches file position (only when prose contains ordinal words) | ||
| # Check 3 — claimed PR count matches gh / git query (only when prose contains "N PRs total") | ||
| # Check 4 — branch-base claims cite explicit SHA | ||
| ``` | ||
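Check 2 appears in the sketch above only as a comment. One possible shape for its derived side is sketched below: a hypothetical `shard_ordinal` helper that prints a shard's 1-based position among the `.md` files of one directory. It assumes every shard in that directory uses a single name shape (for example `HHMMZ.md`), so that glob order equals chronological order; mixed shapes such as `HHMMZ-NN.md` next to `HHMMZ.md` would need an explicit sort key, which this sketch does not attempt.

```shell
#!/usr/bin/env bash
# Hypothetical helper for Check 2: derive a shard's ordinal from its
# sorted position instead of trusting the ordinal written in prose.
# Usage: shard_ordinal <directory> <shard-filename>
shard_ordinal() {
  local dir=$1 target=$2 path name n=0
  # Bash expands globs in sorted order; with one fixed-width HHMM name
  # shape (an assumption) that order is chronological.
  for path in "$dir"/*.md; do
    [[ -e "$path" ]] || continue      # unmatched glob stays literal
    name=${path##*/}
    [[ "$name" == README.md ]] && continue
    n=$((n + 1))
    if [[ "$name" == "$target" ]]; then
      printf '%d\n' "$n"
      return 0
    fi
  done
  return 1                            # target not found in directory
}
```

The printed number is the derived truth; the lint would compare it with the ordinal the prose claims (after mapping a word such as "twenty-second" to 22) and warn on a mismatch.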
|
|
||
| ## Why P2 (vs three separate P3s) | ||
|
|
||
| The pattern occurred three times in 24 hours, a strong signal | ||
| that it will recur. A single P2 lint: | ||
|
|
||
| - Reduces total surface area (one CI check, one set of regex | ||
| rules, one file to maintain). | ||
| - Catches all four checked drift sub-classes (filename timestamp, | ||
| tick ordinal, PR count, branch base) uniformly. | ||
| - Aligns with Amara's framing: "If metadata can be derived, | ||
| do not trust agent-authored prose." | ||
|
|
||
| P2 (factory hygiene, can-be-deferred but desirable) rather than | ||
| P0/P1 (blocking) because the drift is caught manually within | ||
| 1-2 ticks via review pipeline; the lint accelerates detection | ||
| but doesn't unblock anything currently broken. | ||
|
|
||
| ## Composes with | ||
|
|
||
| - B-0098 (tick-ordinal-continuity lint) — subsumed. | ||
| - B-0099 (PR-count-projection-not-narrated) — subsumed. | ||
| - B-0102 (PR-liveness race) — sibling agent-asserted-state | ||
| discipline. | ||
| - `memory/feedback_bare_main_ambiguity_automation_discipline_explicit_refs_required_amara_2026_04_29.md` | ||
| — same computed-vs-narrated rule at the git-ref layer. | ||
|
|
||
| ## Migration path | ||
|
|
||
| When this P2 lands as active work: | ||
|
|
||
| 1. Implement the single `tools/lint/metadata-drift-check.sh` | ||
| covering all four checks from the implementation sketch. | ||
| 2. Wire into `.github/workflows/gate.yml` (or sibling). | ||
| 3. Mark B-0098 + B-0099 as superseded-by-B-0103 in their | ||
| frontmatter. | ||
| 4. Remove ordinal-word + PR-count-prose from existing tick | ||
| shards if the lint catches them as drift candidates. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,192 @@ | ||
| --- | ||
| id: B-0102 | ||
| priority: P3 | ||
| status: open | ||
| title: PR-liveness race during merge cascade — micro-class rename + mechanical guard + recovery-note format | ||
| tier: research-grade | ||
| effort: S | ||
| ask: Multi-AI synthesis packet 2026-04-29 (Deepseek + Amara filter on Otto's micro-class) | ||
| created: 2026-04-29 | ||
| last_updated: 2026-04-29 | ||
| composes_with: [B-0103] | ||
| tags: [github-platform, force-push, pr-aliveness, merge-cascade, micro-class-refinement] | ||
| --- | ||
|
|
||
| # PR-liveness race during merge cascade — refinement | ||
|
|
||
| The 2026-04-29 autonomous-loop session arc surfaced a real GitHub | ||
| operational trap: PR #806 was unexpectedly auto-closed by GitHub | ||
| 1 second after PR #808 merged, while #806's branch had been | ||
| freshly rebased + force-pushed. The branch still had 476 lines of | ||
| unique unmerged substrate. The original micro-class name | ||
| `force-push-triggers-pr-auto-close` overclaimed GitHub internals. | ||
|
|
||
| ## Better naming (per Amara's filter) | ||
|
|
||
| ```text | ||
| pr-liveness-race-during-merge-cascade | ||
| ``` | ||
|
|
||
| or empirically: | ||
|
|
||
| ```text | ||
| force-push-during-merge-cascade can collapse PR uniqueness | ||
| ``` | ||
|
|
||
| Reason: the dangerous condition is not force-push alone. It is the | ||
| combination of history rewrite, active base movement, and GitHub's | ||
| PR reachability/diff computation. GitHub's "indirect merge" | ||
| detection (head reachable from base, so the PR is marked merged | ||
| automatically) can race with mid-cascade rebases. | ||
|
|
||
| ```text | ||
| This is an observed probabilistic race, NOT a deterministic | ||
| GitHub rule. The guard remains in force even if a future | ||
| force-push happens not to close the PR — one survival is not | ||
| evidence the race retired. | ||
| ``` | ||
|
|
||
| ## Operational rule | ||
|
|
||
| ```text | ||
| Do not rebase or force-push open tick-history PR branches | ||
| while adjacent auto-merge PRs are landing. | ||
|
|
||
| Branch protection "up-to-date" is a merge-readiness gate. | ||
| PR-aliveness is a separate head/base reachability and diff | ||
| invariant. Do not confuse them. | ||
| ``` | ||
|
|
||
| ## Pre-flight: cascade detection (run BEFORE rebase/force-push) | ||
|
|
||
| ```bash | ||
| # Query active auto-merge PRs on the same base branch. | ||
| gh pr list --state open \ | ||
| --json number,baseRefName,headRefName,autoMergeRequest,mergeStateStatus,title \ | ||
| --jq '.[] | select(.baseRefName == "main" and .autoMergeRequest != null)' | ||
|
|
||
| ``` | ||
|
|
||
| If any PRs are returned (active cascade), defer the | ||
| rebase/force-push until the cascade drains. The detection is | ||
| mechanical: relying on Otto to remember the cascade state is | ||
| exactly the failure mode that produced this incident. | ||
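To make the "defer if anything is returned" rule mechanical, the query can be wrapped so that its exit status carries the decision. The following is a minimal sketch, not part of the original file: the function name `no_cascade` is hypothetical, and it uses the `--base` filter of `gh pr list` in place of the `baseRefName` comparison in the jq expression. It fails closed, meaning a failed query counts as "not safe".

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the pre-flight query.
# Exit status: 0 = query succeeded and no auto-merge PRs are open,
#              1 = at least one auto-merge PR is open (cascade active),
#              2 = the query itself failed (treat as unsafe).
no_cascade() {
  local base=${1:-main} hits
  hits=$(gh pr list --state open --base "$base" \
    --json number,autoMergeRequest \
    --jq '.[] | select(.autoMergeRequest != null) | .number') || return 2
  [ -z "$hits" ]
}
```

A caller would gate the rewrite on it, for example `no_cascade main || { echo "cascade active or unknown; deferring" >&2; exit 1; }`, so that both an active cascade and an unreachable API block the rebase.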
|
|
||
| ## Mechanical guard (before any force-push/rebase of an open PR) | ||
|
|
||
| The guard uses a per-run identifier to avoid two concurrent | ||
| ticks overwriting each other's evidence (parallel-agent | ||
| future-proofing) — Claude.ai's catch. | ||
|
|
||
| ```bash | ||
| PR="${PR:?set PR to the pull request number}" | ||
| RUN_ID="$(date -u +%Y%m%dT%H%M%SZ)-$$" | ||
|
|
||
| gh pr view "$PR" \ | ||
| --json number,state,headRefName,headRefOid,baseRefName,baseRefOid,mergeStateStatus,autoMergeRequest,isDraft,title \ | ||
| > "/tmp/pr-$PR-$RUN_ID-before.json" | ||
| # Refresh base ref before computing uniqueness — during a merge | ||
| # cascade `origin/main` can lag the actual base by several | ||
| # seconds (Codex's catch). The captured `baseRefOid` from | ||
| # `gh pr view` is the canonical base for this PR; use it | ||
| # directly rather than `origin/main`. | ||
| git fetch --no-tags origin | ||
| BASE_BEFORE=$(jq -r '.baseRefOid' "/tmp/pr-$PR-$RUN_ID-before.json") | ||
| git log --oneline "${BASE_BEFORE}..HEAD" > "/tmp/pr-$PR-$RUN_ID-unique-commits-before.txt" | ||
| git diff --stat "${BASE_BEFORE}...HEAD" > "/tmp/pr-$PR-$RUN_ID-diff-before.txt" | ||
|
|
||
| # ... do the rebase / force-push ... | ||
|
|
||
| # Wait for GitHub's API to converge to the local HEAD before | ||
| # classifying. GitHub's PR state computation is async; querying | ||
| # immediately after a push can return stale headRefOid (Gemini's | ||
| # catch). Poll up to 30s. | ||
| LOCAL_HEAD="$(git rev-parse HEAD)" | ||
| for i in 1 2 3 4 5 6; do | ||
| GH_HEAD="$(gh pr view "$PR" --json headRefOid --jq .headRefOid 2>/dev/null || true)" | ||
| [ "$GH_HEAD" = "$LOCAL_HEAD" ] && break | ||
| sleep 5 | ||
| done | ||
| if [ "$GH_HEAD" != "$LOCAL_HEAD" ]; then | ||
| echo "GitHub PR headRefOid did not converge to local HEAD after 30s; stop classification" | ||
| exit 1 | ||
| fi | ||
|
|
||
| gh pr view "$PR" \ | ||
| --json number,state,headRefName,headRefOid,baseRefName,baseRefOid,mergeStateStatus,autoMergeRequest,isDraft,title \ | ||
| > "/tmp/pr-$PR-$RUN_ID-after.json" | ||
| # Refresh + use the captured base again (may have advanced | ||
| # during the operation). | ||
| git fetch --no-tags origin | ||
| BASE_AFTER=$(jq -r '.baseRefOid' "/tmp/pr-$PR-$RUN_ID-after.json") | ||
| git log --oneline "${BASE_AFTER}..HEAD" > "/tmp/pr-$PR-$RUN_ID-unique-commits-after.txt" | ||
| git diff --stat "${BASE_AFTER}...HEAD" > "/tmp/pr-$PR-$RUN_ID-diff-after.txt" | ||
| ``` | ||
|
|
||
| ## Enforcement after the action | ||
|
|
||
| ```text | ||
| If PR state != OPEN: | ||
| stop and recover with successor PR. | ||
|
|
||
| If unique commits == 0 and diff == empty: | ||
| do not force-push again; classify as merged/covered/collapsed. | ||
|
|
||
| If unique commits or diff still exist but PR is closed: | ||
| open successor PR and record old→new mapping. | ||
| ``` | ||
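The three rules above overlap when a PR is closed and has no unique content left, so any implementation has to pick a precedence. The sketch below is one reading, offered as an assumption: it checks for collapsed content first, then for a non-open state. The function name `classify_pr_outcome` and its short output tokens are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical classifier for the post-action rules.
# Arguments: <pr-state> <unique-commit-count> <diff-stat-line-count>
# Assumed precedence: "nothing unique left" wins over "PR not open".
classify_pr_outcome() {
  local state=$1 unique_commits=$2 diff_lines=$3
  if [ "$unique_commits" -eq 0 ] && [ "$diff_lines" -eq 0 ]; then
    # Do not force-push again; treat as merged/covered/collapsed.
    echo "collapsed"
  elif [ "$state" != "OPEN" ]; then
    # Unique content survives but the PR is gone: open a successor
    # PR and record the old-to-new mapping.
    echo "successor"
  else
    echo "ok"
  fi
}
```

The inputs can be taken from the files the mechanical guard already writes: the `state` field of the `-after.json` capture, and the line counts of the `-unique-commits-after.txt` and `-diff-after.txt` files.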
|
|
||
| ## Recovery-note format (when opening a successor PR) | ||
|
|
||
| ```text | ||
| old PR: #<num> | ||
| new PR: #<num> | ||
| branch: <name> | ||
| before head SHA: <sha> | ||
| after head SHA: <sha> | ||
| base SHA: <sha> | ||
| diff-stat proving remaining content: <output> | ||
| seconds_between_force_push_and_pr_close: <int> | ||
| whether original later became merged/covered: <yes/no/n-a> | ||
| reason reopen failed (if applicable): <message> | ||
| ``` | ||
|
|
||
| The `seconds_between_force_push_and_pr_close` field | ||
| (Claude.ai's catch) lets future incidents cluster against this | ||
| one. A close within five seconds almost certainly indicates a | ||
| platform race; a gap of minutes points to a different mechanism. | ||
|
|
||
| ## Successor-PR dedup (Deepseek's catch) | ||
|
|
||
| GitHub's eventual consistency means an auto-closed PR may | ||
| later be marked as merged once the comparison/diff state | ||
| settles. After opening the successor: | ||
|
|
||
| ```text | ||
| After opening successor PR: | ||
| - re-check original PR state after GitHub settles (~60s+) | ||
| - if original later became merged/covered AND successor | ||
| content is identical → close successor as duplicate | ||
| (preserves attribution lineage) | ||
| - if content has diverged → keep successor and record | ||
| why in the recovery note | ||
| - always record old→new PR mapping in a recovery-log file | ||
| for future incident clustering | ||
| ``` | ||
|
|
||
| Otherwise the queue accumulates phantom successor PRs. | ||
|
|
||
| ## Why P3 (research-grade, not blocking) | ||
|
|
||
| The trap was caught and recovered (PR #811 successor opened | ||
| within minutes). The mechanical guard would prevent future | ||
| recurrence but isn't blocking. Promote when active drain is | ||
| clear AND the same trap recurs (composition signal). | ||
|
|
||
| ## Composes with | ||
|
|
||
| - B-0103 (computed-metadata-discipline) — same family of | ||
| agent-asserted-state vs derived-truth checks. | ||
| - The auto-merge fix in PR #811 + #814 — the safer alternative | ||
| to manual rebase + force-push during cascade. | ||
| - `memory/feedback_outdated_review_threads_block_merge_resolve_explicitly_after_force_push_2026_04_27.md` | ||
| — sibling force-push-affects-PR rule. | ||