Merged
110 changes: 107 additions & 3 deletions .claude/rules/claim-acquire-before-worktree-work.md
@@ -153,7 +153,7 @@ Composes with [B-0530](../../docs/backlog/P3/B-0530-cron-sentinel-mutex-prevent-
when it ships); until that ships, the borrow pattern is the
operational workaround.

## Saturation-ceiling — 4 failure sub-cases of borrow-on-existing
## Saturation-ceiling — 5 failure sub-cases of borrow-on-existing

Empirical anchor [PR #3808](https://github.com/Lucent-Financial-Group/Zeta/pull/3808)
(closed-without-merge; shard for `0715Z` was the PR's payload, hence never
@@ -163,8 +163,10 @@ fresh-cold-boot Otto-CLI, and peer-agent global-lock-cleanup loop), with
peer Otto cycling worktree HEAD every ~3-5 min for 9 transitions in
35 min, a fresh-cold-boot session attempting to ship a shard hit FOUR
distinct failure sub-cases of the borrow-on-existing pattern across 4
commit attempts. All 4 sub-cases empirically validated; only 2 have
working mitigations today.
commit attempts. All 4 of those sub-cases were empirically validated; only 2
have working mitigations today. **A fifth sub-case (3b: pack-dir
contention at push time) was added on 2026-05-21 (PR [#4536](https://github.com/Lucent-Financial-Group/Zeta/pull/4536)) with a working mitigation
(REST git-data API bypass).**

### Sub-case 1 — existing-branch-name collision → peer-WIP commit inheritance via recovery path

@@ -230,6 +232,108 @@ activity. No `--lock` flag prevents this; see [B-0530](../../docs/backlog/P3/B-0
mutex (not yet shipped). Until then, fall through to existing-sidetick
borrow — which hits sub-case 4.

### Sub-case 3b — pack-dir contention causes `git push` to fail at push time

Same B-0530 root cause class as sub-case 3, but manifesting at `git push`
time on an already-created worktree that previously passed the canary.
Distinguished from B-0615 (silent-push-failure) by being non-silent.

**Symptom**: `git push` returns non-zero exit with errors like:

```
error: unable to open loose object <sha>: Interrupted system call
error: unable to open object pack directory: .../.git/objects/pack: Interrupted system call
fatal: bad object <sha>
fatal: the remote end hung up unexpectedly
error: failed to push some refs to '...'
```

Network and auth are fine; the bottleneck is local pack-dir reads under
peer-agent contention. Distinguish from **B-0615**, where the push exits ZERO
but the remote ref never updates (silent; mitigation: REST git-data API bypass per
[PR #4145](https://github.com/Lucent-Financial-Group/Zeta/pull/4145)).
Both belong to the same FS-contention root-cause class and the same bypass
works for both, but they need different detection because the exit codes differ.
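
A minimal sketch of that triage, assuming the caller has already captured the push exit code, its stderr text, and the local and remote SHAs for the branch. The function name and the labels it prints are hypothetical, not existing tooling; the `124` branch is GNU `timeout`'s kill status and only applies when the push was wrapped in `timeout`.

```bash
# classify_push EXIT_CODE STDERR_TEXT LOCAL_SHA REMOTE_SHA
# Prints one hypothetical label: ok | b0615-silent | subcase-3b |
# timeout-hang | other-failure
classify_push() {
  local exit_code="$1" stderr_text="$2" local_sha="$3" remote_sha="$4"

  if [ "$exit_code" -eq 0 ]; then
    # Exit ZERO is only trustworthy if the remote ref actually moved.
    if [ "$local_sha" = "$remote_sha" ]; then
      echo ok
    else
      echo b0615-silent
    fi
    return 0
  fi

  # GNU timeout reports 124 when it had to kill the wrapped command.
  if [ "$exit_code" -eq 124 ]; then
    echo timeout-hang
    return 0
  fi

  # Non-zero exit plus the EINTR message matches the 3b symptom above.
  case "$stderr_text" in
    *"Interrupted system call"*) echo subcase-3b ;;
    *) echo other-failure ;;
  esac
}
```

One way to obtain `REMOTE_SHA` is `git ls-remote origin "refs/heads/<branch>"`; any label other than `ok` or `other-failure` is a candidate for the REST bypass.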

**Mitigation (working today)**: the **B-0615 REST git-data API bypass**
(`POST .../git/blobs` → `POST .../git/trees` → `POST .../git/commits` →
`POST/PATCH .../git/refs`) works for sub-case 3b as well as B-0615.
Empirical anchor: [PR #4535](https://github.com/Lucent-Financial-Group/Zeta/pull/4535)
(2026-05-21) — the memo about this very failure mode was blocked from
landing by repeated `timeout`-wrapped `git push` runs surfacing exit 124
(GNU `timeout`'s "command killed by timeout" status — NOT a native
`git push` exit code; the contention was hanging the push indefinitely
until the wrapper killed it). The same commits then shipped successfully
via the REST bypass.
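
The call sequence above can be sketched with `gh api` as follows. This is an illustration under stated assumptions, not the project's actual script: `gh` is authenticated, the branch already exists on the remote, and the payload is one small regular file. The function and argument names are hypothetical, and the `tree[][path]` nested-field syntax assumes a reasonably recent `gh`.

```bash
# Sketch only: lands ONE regular file on an EXISTING branch through the
# REST git-data API. Function and argument names are hypothetical.
# usage: rest_push_file OWNER/REPO BRANCH LOCAL_FILE PATH_IN_REPO MESSAGE
rest_push_file() {
  local repo="$1" branch="$2" local_file="$3" repo_path="$4" message="$5"
  local parent_sha base_tree blob_sha tree_sha commit_sha

  # 1. Current branch tip: becomes the parent of the new commit.
  parent_sha=$(gh api "repos/$repo/git/ref/heads/$branch" \
    --jq '.object.sha') || return 1

  # 2. Tree of that parent, reused as base_tree so other files carry over.
  base_tree=$(gh api "repos/$repo/git/commits/$parent_sha" \
    --jq '.tree.sha') || return 1

  # 3. Upload the file content as a blob (base64, newlines stripped).
  blob_sha=$(gh api -X POST "repos/$repo/git/blobs" \
    -f "content=$(base64 < "$local_file" | tr -d '\n')" \
    -f encoding=base64 --jq '.sha') || return 1

  # 4. New tree = base tree plus the one changed path.
  tree_sha=$(gh api -X POST "repos/$repo/git/trees" \
    -f "base_tree=$base_tree" \
    -f "tree[][path]=$repo_path" -f "tree[][mode]=100644" \
    -f "tree[][type]=blob" -f "tree[][sha]=$blob_sha" \
    --jq '.sha') || return 1

  # 5. Commit object pointing at the new tree.
  commit_sha=$(gh api -X POST "repos/$repo/git/commits" \
    -f "message=$message" -f "tree=$tree_sha" \
    -f "parents[]=$parent_sha" --jq '.sha') || return 1

  # 6. Fast-forward the branch ref; prints the SHA the ref now points at.
  gh api -X PATCH "repos/$repo/git/refs/heads/$branch" \
    -f "sha=$commit_sha" --jq '.object.sha'
}
```

The sketch makes six calls for a one-file commit. For a branch that does not exist yet, step 6 would instead be a `POST` to `repos/$repo/git/refs` with `ref=refs/heads/<branch>`, and steps 1 and 2 would read the intended base branch.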

**Cost**: ~5-6 REST calls total per commit, consuming the **REST/core
budget** (5000/hr per token; check via `gh api rate_limit --jq
'.resources.core'`). REST/core is independent of the GraphQL budget
discussed in [`refresh-world-model-poll-pr-gate.md`](refresh-world-model-poll-pr-gate.md);
the tier classification in that rule (Normal / Cost-aware / Extreme /
Pure-git) is GraphQL-budget-scoped and does NOT translate directly to
REST/core. Empirically: even at GraphQL Extreme cost-aware tier (200–1000
remaining), REST/core typically has thousands remaining and the bypass
is affordable. No `.git/objects/pack` reads happen locally because
GitHub does the object packing server-side from the blob you uploaded.
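
A small sketch of a pre-flight budget check, assuming the upper figure of 6 calls per commit from the estimate above. The function name and the default are hypothetical; the `gh api rate_limit` query is the one already quoted in this section.

```bash
# rest_budget_ok REMAINING COMMITS [CALLS_PER_COMMIT]
# Succeeds (exit 0) when REMAINING REST/core calls cover the planned commits.
# CALLS_PER_COMMIT defaults to 6, an assumption taken from the estimate above.
rest_budget_ok() {
  local remaining="$1" commits="$2" calls_per_commit="${3:-6}"
  [ "$remaining" -ge $(( commits * calls_per_commit )) ]
}
```

Typical use would be `rest_budget_ok "$(gh api rate_limit --jq '.resources.core.remaining')" 1` before starting the bypass.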

**Composes with the rate-limit operational tiers** documented in
[`refresh-world-model-poll-pr-gate.md`](refresh-world-model-poll-pr-gate.md):
when the saturation makes `git push` exit non-zero or hang, the REST
bypass IS the tier-skipping move that lets substantive substrate land
without waiting for contention to clear.

### In-place index recovery — `git read-tree HEAD`

Refinement to sub-case 5 (peer-side destructive git operation), where the
specific symptom is a **truncated index file** after stale-lock removal:

```
fatal: .git/worktrees/<name>/index: index file smaller than expected
```

A preceding `git status` may show massive D (deleted) entries against
files you have not touched — a misleading symptom of the corrupted index,
NOT actual working-tree deletion. Do NOT abandon the worktree on this
symptom alone; first verify the working tree itself via `ls` (files
should still be on disk).

**Recovery**:

```bash
git -C <worktree> read-tree HEAD
```

This rebuilds the worktree's index from the HEAD commit, replacing the
truncated index in-place. Working-tree files are NOT modified (they were
not part of the corruption — only the index was). After rebuild:

1. `git status` now reflects the genuine working-tree-vs-HEAD diff —
not "empty," because `read-tree` only rewrote the index, not the
working tree. Any intended local edits / untracked files you had
before the corruption STILL show as modified / untracked. The
`index file smaller than expected` error is gone; that is the
indicator the recovery worked. (Misreading `read-tree` as "should
produce a clean status" is the most common way the recovery gets
misdiagnosed as failed when it actually succeeded.)
2. Stage your intended file via `git add <path>` — the file is still
on disk; `read-tree` wiped any stale staged state but did not touch
the working tree
3. `git commit` normally
4. Verify commit canary (parent tree size = commit tree size) before
pushing per [`codeql-no-source-on-docs-only-pr-is-broken-commit-canary.md`](codeql-no-source-on-docs-only-pr-is-broken-commit-canary.md)

**When NOT to use**: if the working tree itself is corrupted (files
missing on disk), `read-tree HEAD` still rebuilds the index from HEAD without
restoring those files, so the damage stays in place and surfaces only as
unstaged deletions. Pre-check disk state via `ls` before invoking. This
recovery applies ONLY to truncated-INDEX states, NOT truncated-working-tree states.
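
The pre-check and the recovery can be combined in one guarded step. A minimal sketch, assuming file paths without newlines or other characters that git would quote; the function name is hypothetical. It lists HEAD's files with `git ls-tree`, which reads the object store rather than the damaged index, and refuses to run `read-tree` if any of them is missing on disk. A file you deleted on purpose will also trip the guard, so treat a refusal as a prompt to inspect, not as proof of corruption.

```bash
# recover_truncated_index WORKTREE
# Rebuilds the index from HEAD only if every file in HEAD is still on disk.
recover_truncated_index() {
  local worktree="$1" path missing=0

  # Read HEAD's file list from the object store; the index is not consulted.
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    # -L keeps dangling symlinks from being reported as missing.
    if [ ! -e "$worktree/$path" ] && [ ! -L "$worktree/$path" ]; then
      echo "missing on disk: $path" >&2
      missing=1
    fi
  done <<EOF
$(git -C "$worktree" ls-tree -r --name-only HEAD)
EOF

  # Refuse to proceed if the working tree itself looks damaged.
  [ "$missing" -eq 0 ] || return 1

  # Index-only rebuild; working-tree files are left untouched.
  git -C "$worktree" read-tree HEAD
}
```

After a successful run, continue with the numbered steps above, starting with `git status`.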

**Empirical anchor**: [PR #4532](https://github.com/Lucent-Financial-Group/Zeta/pull/4532)
(2026-05-21) — the 1212Z tick shard was successfully shipped after
`read-tree HEAD` recovered an index truncated by stale-lock-removal
race; previously the saturation-ceiling rule's only recovery option
was worktree abandonment.

### Sub-case 4 — pruned-sidetick race

The empirically-validated sidetick `/private/tmp/zeta-otto-cli-0027z-sidetick`