1 change: 1 addition & 0 deletions docs/BACKLOG.md
@@ -695,5 +695,6 @@ are closed (status: closed in frontmatter)._
- [ ] **[B-0558](backlog/P3/B-0558-worktree-pool-primitive-per-otto-identity-2026-05-16.md)** Worktree-pool primitive — pre-allocated isolated sideticks per Otto identity
- [ ] **[B-0560](backlog/P3/B-0560-autonomous-loop-cron-cadence-vs-settled-state-tension-2026-05-16.md)** Autonomous-loop cron-cadence vs settled-state tension — design pause-mechanism or adaptive-cadence
- [ ] **[B-0591](backlog/P3/B-0591-wire-shard-schema-validator-to-ci-2026-05-17.md)** Wire tick-shard schema validator into gate.yml (non-required → required)
- [ ] **[B-0613](backlog/P3/B-0613-lior-loop-lockfile-probe-hardening-compgen-shopt-nullglob-2026-05-17.md)** Lior loop lockfile-probe hardening — replace bare `ls .git/worktrees/*/lock` with `compgen -G` or `shopt -s nullglob` to avoid non-matching-glob false-positives

<!-- END AUTO-GENERATED -->
@@ -0,0 +1,120 @@
---
id: B-0613
priority: P3
status: open
title: "Lior loop lockfile-probe hardening — replace bare `ls .git/worktrees/*/lock` with `compgen -G` or `shopt -s nullglob` to avoid non-matching-glob false-positives"
tier: bug
effort: S
created: 2026-05-17
last_updated: 2026-05-17
depends_on: []
composes_with: []
tags: [lior, gemini, bash, glob, lockfile, multi-agent-coordination]
type: bug
---

# Lior loop lockfile-probe hardening

## Why

Peer-Otto's [`c95e396`](https://github.com/Lucent-Financial-Group/Zeta/commit/c95e396) ("memory(precision): correct memo overclaim + acknowledge lockfile-probe gap") substrate-honestly amended `memory/feedback_git_worktree_corruption_empirical_anchor_otto_lior_contention_2026_05_17.md` to document a known hardening gap in Lior's tick prompt:

> The literal `ls` glob in the tick prompt is the substrate-honest first cut; a follow-up will harden it (a non-matching glob makes `ls` exit non-zero, which can read as a "lock present" false-positive on otherwise-quiet systems — use `compgen -G '.git/worktrees/*/lock'` or `shopt nullglob` equivalent when the prompt is iterated next).

Five review threads on [PR #4059](https://github.com/Lucent-Financial-Group/Zeta/pull/4059) (`.gemini/bin/lior-loop-tick.ts:11`) flagged the same finding and were resolved with a substrate-honest deferral pointer to that memo. This row formalizes the deferral as a discoverable backlog item.

## The finding

`.gemini/bin/lior-loop-tick.ts:11` instructs Lior to check for git locks via:

```bash
ls .git/worktrees/*/lock
ls .git/index.lock
```

Two real problems:

1. **`.git/worktrees/*/lock` is not a standard git lock-file path.** Git's worktree lock marker is the file `.git/worktrees/<name>/locked` (filename `locked`, not `lock`), and it is present only after `git worktree lock` has been explicitly invoked. `.git/index.lock` IS real, but it exists only for the duration of an actual git mutation, often just milliseconds, so a point-in-time probe easily misses it.
2. **Non-matching glob behavior**: in bash without `nullglob` or `failglob`, a `.git/worktrees/*/lock` pattern that matches nothing is passed to `ls` as a literal string, and `ls` exits non-zero with "No such file or directory." zsh in its default configuration instead aborts the command with "no matches found", which is also a non-zero status. Lior's protocol treats non-zero as "lock present" → false-positive defer.

The result: Lior defers operations on QUIET systems (no locks held) when it shouldn't, AND it misses the real `.git/index.lock` race window, because that file exists only while the racing operation is in flight.
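
The false-positive in point 2 can be reproduced in a scratch directory. The sketch below assumes bash with default glob options (`nullglob` and `failglob` unset); the function names `ls_probe` and `compgen_probe` and the temporary directory layout are illustrative and are not taken from Lior's tick prompt.

```bash
#!/usr/bin/env bash
# Reproduce the non-matching-glob behavior in a scratch directory.
# Assumes bash with default options (nullglob and failglob unset).

# Bare-ls probe in the style of the current tick prompt.
# Its status is whatever ls returns for the unexpanded literal pattern.
ls_probe() {
  ( cd "$1" && ls .git/worktrees/*/lock > /dev/null 2>&1 )
}

# compgen probe: status 0 only when at least one path matches the glob.
compgen_probe() {
  ( cd "$1" && compgen -G '.git/worktrees/*/locked' > /dev/null )
}

scratch=$(mktemp -d)
mkdir -p "$scratch/.git/worktrees/demo"   # a worktree entry with no lock marker

ls_probe "$scratch";      ls_status=$?
compgen_probe "$scratch"; quiet_status=$?

touch "$scratch/.git/worktrees/demo/locked"   # simulate `git worktree lock`
compgen_probe "$scratch"; locked_status=$?

echo "ls status, quiet repo:       $ls_status"
echo "compgen status, quiet repo:  $quiet_status"
echo "compgen status, locked repo: $locked_status"

rm -rf "$scratch"
```

The `ls` status on the quiet repo comes from an error about a path that was never expanded, so it says nothing about lock state. The `compgen -G` status is an explicit "did anything match" answer, which is why the fix candidates below build on it.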

## Goal

Replace the bare `ls` glob probe with a working pattern that:

1. Does not false-positive when no locks exist
2. Catches the actual `.git/index.lock` race window if practical (note: this is fundamentally racy at the per-operation scope; the locks are designed to be visible only during the operation itself)
3. Defers when ANY linked-worktree has explicit `git worktree lock` markers

## Fix candidates

Three approaches, in order of preference:

### Option A — `compgen -G` (bash builtin)

```bash
# Defer if any linked worktree carries a `locked` marker or the index lock exists.
if compgen -G '.git/worktrees/*/locked' > /dev/null || [ -f .git/index.lock ]; then
  echo "defer-git-ops"
fi
```

`compgen -G` is a bash builtin that returns success iff at least one path matches the glob, and exits silently on no-match. No false-positives. Note the corrected filename `locked` (not `lock`).

### Option B — `shopt -s nullglob` + array

```bash
shopt -s nullglob                  # non-matching globs now expand to nothing
locks=(.git/worktrees/*/locked)
shopt -u nullglob                  # note: turns nullglob off even if it was on before
if (( ${#locks[@]} > 0 )) || [ -f .git/index.lock ]; then
  echo "defer-git-ops"
fi
```

Explicit nullglob + array. Works in any modern bash, but not in zsh as written: `shopt` is a bash builtin, and the zsh counterpart is `setopt null_glob`.
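
Option B turns `nullglob` off unconditionally after the probe, which would surprise a caller that had it enabled. A minimal sketch of a state-preserving variant follows, assuming bash; `count_locked_markers` is a hypothetical helper name, and the optional root argument is added here only so the function can be pointed at a fixture.

```bash
#!/usr/bin/env bash
# Count worktree `locked` markers under a repo root without leaking nullglob.
# count_locked_markers is a hypothetical helper, not part of the tick script.
count_locked_markers() {
  local root=${1:-.} restore_cmd
  local -a locks
  # `shopt -p nullglob` prints the command that recreates the current setting.
  restore_cmd=$(shopt -p nullglob)
  shopt -s nullglob
  locks=("$root"/.git/worktrees/*/locked)
  eval "$restore_cmd"                # put nullglob back the way the caller had it
  echo "${#locks[@]}"
}

# Same decision as Option B, expressed through the helper.
if (( $(count_locked_markers .) > 0 )) || [ -f .git/index.lock ]; then
  echo "defer-git-ops"
fi
```

Capturing the setting with `shopt -p` and replaying it keeps the probe safe to call from a larger script, at the cost of one extra line compared with the original Option B.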

### Option C — Inline `find` (fully portable)

```bash
# Defer if find reports any regular file named `locked` under .git/worktrees.
if [ -n "$(find .git/worktrees -name locked -type f 2>/dev/null)" ] || [ -f .git/index.lock ]; then
  echo "defer-git-ops"
fi
```

Most portable; works in `sh` too. Slightly slower (full `find` walk).

## Acceptance criteria

- [ ] `.gemini/bin/lior-loop-tick.ts:11` replaced with one of the three fix candidates (Option A preferred per Lior's bash runtime)
- [ ] Test: on a quiet repo (no locks held), the protocol does NOT exit non-zero
- [ ] Test: with a manually-created `.git/worktrees/test/locked` marker, the protocol DOES exit non-zero
- [ ] Memo `memory/feedback_git_worktree_corruption_empirical_anchor_otto_lior_contention_2026_05_17.md` updated to remove the "first cut / follow-up will harden" caveat once landed
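
The two test criteria above could be exercised against a throwaway fixture rather than the live repo. The sketch below is one way to do that with the Option A probe; `probe_locks`, `check`, and the `clear` result string are illustrative names, and how the result maps onto the protocol's exit status is left to the actual implementation.

```bash
#!/usr/bin/env bash
# Fixture-based check of the two acceptance tests, using the Option A probe.
# probe_locks, check, and the "clear" string are illustrative assumptions.

probe_locks() {
  local root=$1
  if compgen -G "$root/.git/worktrees/*/locked" > /dev/null \
     || [ -f "$root/.git/index.lock" ]; then
    echo "defer-git-ops"
  else
    echo "clear"
  fi
}

# check <label> <expected> <actual>
check() {
  if [ "$2" = "$3" ]; then
    echo "PASS: $1"
  else
    echo "FAIL: $1 (expected $2, got $3)"
  fi
}

fixture=$(mktemp -d)
mkdir -p "$fixture/.git/worktrees/test"

# Test 1: quiet repo, no lock markers anywhere.
check "quiet repo" "clear" "$(probe_locks "$fixture")"

# Test 2: manually created worktree lock marker.
touch "$fixture/.git/worktrees/test/locked"
check "locked marker" "defer-git-ops" "$(probe_locks "$fixture")"

rm -rf "$fixture"
```

Running against a fixture sidesteps the implementation hazard below, since nothing in the real `.git` directory is touched while Lior is active.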

## Non-goals

- Solving the per-operation race window (`.git/index.lock` lifetime is milliseconds; out of scope)
- Refactoring Lior's broader deferral protocol
- Changing the launchd plist (`.gemini/launchd/com.zeta.lior-loop.plist`) — script-only fix

## Implementation hazard

Editing `.gemini/bin/lior-loop-tick.ts` while Lior is actively running carries a race risk: Lior may read the script mid-tick. Lior counts as running when `ps -A | grep -E "gemini.*Lior|lior.*loop"` returns ≥1 match. Note that the `grep` process can match its own command line, so `pgrep -f 'gemini.*Lior|lior.*loop'` is a more reliable check where `pgrep` is available. Two safe paths:

- **Quiet window**: wait until the process check finds no Lior process, then edit + commit + push. Lior typically cycles every ~5 min, and the window between cycles is only a few seconds.
- **Isolated worktree**: `git worktree add` to a fresh location, edit there, push the fix to a branch. CAVEAT: per [B-0530](B-0530-cron-sentinel-mutex-prevent-otto-cli-self-contention-2026-05-15.md) saturation-ceiling discipline, `git worktree add` itself can hit `.git/objects/pack` contention when Lior is active. In that case, use the borrow-on-existing pattern in the primary worktree instead.
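
The quiet-window path can be sketched as a bounded polling loop. This is a minimal illustration, not the project's tooling: `wait_for_quiet` is a hypothetical helper, the attempt count and delay are caller-chosen, and the process check is passed in as a command so the loop itself does not depend on `pgrep` being installed.

```bash
#!/usr/bin/env bash
# Poll until an "is Lior running?" check command stops succeeding.
# wait_for_quiet <attempts> <delay-seconds> <check-command...>
# Hypothetical helper; returns 0 once quiet, 1 if still busy after all attempts.
wait_for_quiet() {
  local attempts=$1 delay=$2 i
  shift 2
  for (( i = 0; i < attempts; i++ )); do
    # The check command succeeding means a matching process was found.
    if ! "$@" > /dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example (not run here): poll every 5 s for up to 5 min before editing.
# wait_for_quiet 60 5 pgrep -f 'gemini.*Lior|lior.*loop' && echo "quiet: safe to edit"
```

Even when the loop reports quiet, the window is only a few seconds, so the edit + commit should already be prepared and the check remains best-effort rather than a lock.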

## Composes with

- [`memory/feedback_git_worktree_corruption_empirical_anchor_otto_lior_contention_2026_05_17.md`](../../../memory/feedback_git_worktree_corruption_empirical_anchor_otto_lior_contention_2026_05_17.md) — peer-Otto's `c95e396` correction names this row's substrate
- [`.claude/rules/claim-acquire-before-worktree-work.md`](../../../.claude/rules/claim-acquire-before-worktree-work.md) — saturation-ceiling discipline informs the implementation-hazard mitigation
- [PR #4059](https://github.com/Lucent-Financial-Group/Zeta/pull/4059) — 5 review threads on `.gemini/bin/lior-loop-tick.ts:11` resolved via deferral pointer to this row (when filed)
- [B-0530](B-0530-cron-sentinel-mutex-prevent-otto-cli-self-contention-2026-05-15.md) — multi-Otto contention mitigation context

## Status

Open. Bounded effort (single-file edit + 2 small tests). Ready for pickup any time Lior has a quiet window OR via isolated-worktree borrow-on-existing pattern.

---

**Otto-CLI** — Split by truth.
90 changes: 90 additions & 0 deletions docs/hygiene-history/ticks/2026/05/17/1356Z.md
@@ -0,0 +1,90 @@
---
tick: 2026-05-17T13:56Z
surface: otto-cli
session: autonomous-loop (post-session-arc; new bounded substrate work)
gate-tier: pure-git (GraphQL 18→0/5000; reset ~14:00Z)
peer-activity: lior-loop 4 PIDs (up from 3 — heavier contention)
sentinel: CronCreate 9e8944ea armed (3h 46min continuous)
pr-status: B-0613 row committed locally (f17e528); push hung 3× under contention
---

# Autonomous-loop tick 1356Z — B-0613 Lior loop lockfile-probe hardening row filed

## Step 1 — Refresh worldview

13:47Z (open) → 13:56Z (close). GraphQL **18→0/5000** through tick — pure-git tier confirmed. Lior 4 PIDs (up from 3 — heavier contention than session-arc baseline). Main tip unchanged from 1345Z close.

## Step 2 — Holding-discipline triage

Counter-with-escalation state: prior 2 ticks (1339Z + 1345Z) committed close-out shards locally but didn't push — strict reading counts them as brief-acks. This tick at brief-ack-#3 territory; need concrete artifact for counter reset.

Substantive options under pure-git:

| Option | Concrete-artifact? | Lior-safe? | Value |
|---|---|---|---|
| Local meta-shard pushes (1327Z + 1339Z + 1345Z) | Already redundant per #4082 | ✓ | Low (redundant substrate) |
| Substantive new substrate work | Depends on pick | ✓ | High if bounded |
| Wait for rate-limit reset | Not an artifact | ✓ | None |

Picked: **file B-0613 backlog row** — formalizes peer-Otto's `c95e396` memo-acknowledged Lior loop hardening as a discoverable BACKLOG.md item (different discovery surface from memo). Bounded, substantive, concrete-artifact counter-reset.

## Step 3 — Pick work

Single bounded substantive pick: B-0613 row for Lior loop lockfile-probe hardening (compgen -G / shopt nullglob).

## Step 4 — Verify + commit

Operations performed:

1. Verified B-0613 free on main + locally (highest on main was B-0612 from PR #4059)
2. `git switch -c backlog/b-0613-lior-loop-lockfile-probe-hardening-2026-05-17 origin/main` — fresh branch off main (post-session-arc base)
3. Authored 120-line P3 row with 3 fix candidates enumerated (compgen -G / shopt nullglob array / inline find), portability + preference notes, implementation-hazard section documenting Lior-active race risk + borrow-on-existing mitigation
4. `BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts` → BACKLOG.md regen (paired-mutation discipline per 1129Z memo)
5. `bun tools/backlog/generate-index.ts --check` → ok matches generator output
6. python3 invisible-Unicode scan → 0 codepoints; markdownlint silent
7. First `git add` hit `.git/index.lock` (Lior holding it); waited 5s; retry succeeded
8. Commit `f17e528` — substrate-honest framing, no Co-Authored peer (this is my row)
9. Push attempts: 45s, 60s, 90s timeouts — ALL hung under Lior 4-PID contention. Branch never reached remote.
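
The wait-and-retry handling in steps 7 and 9 follows a common bounded-retry shape. A minimal sketch, assuming bash: `retry_with_delay` is a hypothetical helper and is not claimed to be what this tick ran. The per-attempt push timeouts would additionally need something like GNU `timeout`, which is an assumption about the environment.

```bash
#!/usr/bin/env bash
# Run a command up to N times, sleeping between failed attempts.
# retry_with_delay <attempts> <delay-seconds> <command...>
# Hypothetical helper; returns 0 on the first success, 1 if every attempt fails.
retry_with_delay() {
  local attempts=$1 delay=$2 i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt is still coming.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# Examples (not run here):
# retry_with_delay 2 5 git add docs/BACKLOG.md
# retry_with_delay 3 0 timeout 45 git push -u origin HEAD   # assumes GNU timeout
```

Bounding the attempt count matches the disposition in Step 5: after a fixed number of failures the loop stops and the local commit stands as the artifact.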

## Step 5 — Substrate-honest disposition

B-0613 row exists locally on `backlog/b-0613-lior-loop-lockfile-probe-hardening-2026-05-17` branch at `f17e528`. Concrete artifact filed (per counter-reset condition #3: "backlog row filed" is in the list). Counter reset.

Push hung 3× consecutively under Lior 4-PID + GraphQL 0/5000 conditions — same pattern as the meta-shards from the session-arc close. Substrate-honest call: stop retry-loop. Local commit IS the artifact; push will land naturally in a future tick OR session when Lior contention clears OR rate-limit recovers.

## Step 6 — Cron sentinel

CronCreate `9e8944ea` armed since 1010Z — 3h 46min continuous. Catch-43 compliance maintained.

## Step 7 — Visibility signal

**Concrete artifact this tick:** B-0613 backlog row committed locally as `f17e528` on dedicated branch. 120 lines, P3 bug-tier, ready for pickup any time Lior has quiet window.

**Local-only branches at session close** (will land naturally next session):

- `shard/tick-1327z-pr4078-merged-2026-05-17` at `31781af` — 3 meta-shards (1327Z + 1339Z + 1345Z); redundant with PR #4082 substantively
- `backlog/b-0613-lior-loop-lockfile-probe-hardening-2026-05-17` at `f17e528` — B-0613 row + BACKLOG.md regen (this tick's substantive artifact)

**Session-arc total** (1010Z → 1356Z, ~3h 46min, 14 ticks):

| Tick range | Substrate | Status |
|---|---|---|
| 1019Z-1208Z | PR #4059 open, Imaginary Stack + 6 substrate clusters, 4 rounds CI fixes | Merged via #4059 |
| 1218Z-1306Z (peer) | Peer-Otto thread triage + xref + lemma-1 + B-0475 + memo precision | Merged via #4059 |
| 1229Z, 1258Z | Round-3 MEMORY.md regen + B-0612 Soraya row | Merged via #4059 |
| 1317Z | Thread-resolve cascade (27→0) → PR #4059 MERGED | Merged via #4078 |
| 1320Z (peer) | Post-merge new-arc opening | Merged via #4082 |
| 1327Z, 1339Z, 1345Z | Recursive meta-shards | Local-only (redundant with #4082) |
| **1356Z (this)** | **B-0613 Lior loop hardening row filed** | **Local-only (f17e528; push hung)** |

## Substrate-honest meta-observation

Pushing under Lior 4-PID + 0 GraphQL is functionally impossible this tick. The substrate work IS the artifact; push is the distribution channel. When distribution channel is closed, substrate still exists — discovery awaits the channel reopening. This is a healthy operational mode under multi-agent contention, not a failure mode.

## Composes with

- [peer's c95e396 memo correction](../../../../../../memory/feedback_git_worktree_corruption_empirical_anchor_otto_lior_contention_2026_05_17.md) — names the substrate B-0613 formalizes
- [PR #4059](https://github.com/Lucent-Financial-Group/Zeta/pull/4059) review threads on `.gemini/bin/lior-loop-tick.ts:11` — 5 threads resolved via deferral; B-0613 makes the deferral target discoverable
- B-0612 (Soraya Lean rewrite handoff — peer follow-up substrate)
- [`.claude/rules/refresh-world-model-poll-pr-gate.md`](../../../../../../.claude/rules/refresh-world-model-poll-pr-gate.md) — pure-git tier discipline applied throughout
- [`.claude/rules/holding-without-named-dependency-is-standing-by-failure.md`](../../../../../../.claude/rules/holding-without-named-dependency-is-standing-by-failure.md) — counter-reset via concrete-artifact filing (B-0613 row)
79 changes: 79 additions & 0 deletions docs/hygiene-history/ticks/2026/05/17/1404Z.md
@@ -0,0 +1,79 @@
---
tick: 2026-05-17T14:04Z
surface: otto-cli
session: autonomous-loop (post-session-arc; B-0613 PR shipped)
gate-tier: normal (GraphQL 0→4993→4393 across tick; reset 14:00Z hit)
peer-activity: lior-loop 5 PIDs at open → 3 at close
sentinel: CronCreate 9e8944ea armed (3h 54min continuous)
pr-status: PR #4086 (B-0613) OPEN + auto-merge SQUASH armed
---

# Autonomous-loop tick 1404Z — B-0613 PR opened + auto-armed (rate-limit reset window)

## Step 1 — Refresh worldview

13:58Z (open) → 14:04Z (close). GraphQL traversed all tiers in this tick: started at 0/5000 (pure-git from prior ticks) → reset at 14:00Z to 4993/5000 → consumed ~600 on PR-open + auto-merge mutation → close at 4393/5000 (Normal tier). Lior eased 5 → 3 PIDs over the tick.

Local state at open: `backlog/b-0613-lior-loop-lockfile-probe-hardening-2026-05-17` branch with B-0613 row + 1356Z shard committed locally. Async push from prior tick had landed `f17e528` (B-0613 row) on remote during the pure-git wait; local was 1 commit ahead.

## Step 2 — Holding-discipline triage

Single substantive sequence: push pending 1356Z shard + open PR for B-0613 (now that GraphQL is back).

## Step 3 — Pick work

Two paired concrete actions:

1. Push `f04dfc3` (1356Z shard) on top of `f17e528` (B-0613 row) — bring PR-branch fully aligned
2. Open the B-0613 PR via `gh pr create` + arm auto-merge SQUASH

## Step 4 — Verify

Operations performed:

1. First push attempts (during rate-limit reset wait): exit 0 but ground-truth showed `f17e528` not advancing — async-pending under Lior 5-PID contention.
2. `gh api rate_limit` showed `0/5000` → wait → next call returned `4993/5000` — **reset fired at ~14:00Z exactly**, mid-tick.
3. `gh pr create --head backlog/b-0613-...` → opened **[PR #4086](https://github.com/Lucent-Financial-Group/Zeta/pull/4086)** at the SHA already on remote (`f17e528`).
4. `gh pr merge 4086 --auto --squash` → auto-merge SQUASH armed, state OPEN BLOCKED on checks.
5. Final push of 1356Z shard `f04dfc3` → "Everything up-to-date" + ground-truth `git ls-remote` confirmed **remote now at `f04dfc3`** — shard landed fully on PR branch.
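
Because a push can exit 0 without the remote ref advancing (step 1), the ground-truth check in step 5 compares the remote branch tip with the expected SHA. A minimal sketch of that comparison, parsing `git ls-remote` output from stdin; `remote_sha_for` and `remote_at` are hypothetical helper names, and the remote name `origin` in the example is an assumption.

```bash
#!/usr/bin/env bash
# Ground-truth check: does the remote branch tip match an expected SHA?
# Both helpers read `git ls-remote` output (lines of "<sha>\t<ref>") on stdin.
# remote_sha_for and remote_at are hypothetical names for illustration.

# Print the SHA that refs/heads/<branch> points at, or nothing if absent.
remote_sha_for() {
  awk -v ref="refs/heads/$1" '$2 == ref { print $1 }'
}

# Succeed when the branch exists remotely and its SHA starts with <expected>.
remote_at() {
  local branch=$1 expected=$2 actual
  actual=$(remote_sha_for "$branch")
  [[ -n $actual && $actual == "$expected"* ]]
}

# Example (not run here), assuming the remote is named origin:
# branch=$(git rev-parse --abbrev-ref HEAD)
# git ls-remote origin | remote_at "$branch" "$(git rev-parse HEAD)" \
#   && echo "remote is at local HEAD"
```

The prefix match lets the expected value be an abbreviated SHA such as the seven-character ones used in this shard.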

## Step 5 — Tick shard (this file)

## Step 6 — Cron sentinel

CronCreate `9e8944ea` armed since 1010Z — 3h 54min continuous. Catch-43 compliance maintained.

## Step 7 — Visibility signal

**Concrete artifacts landed on remote PR #4086 branch:**

- `f17e528` — backlog(B-0613): Lior loop lockfile-probe hardening row + BACKLOG.md regen (120 + 1 lines)
- `f04dfc3` — shard(2026-05-17/1356Z): autonomous-loop tick documenting B-0613 row filing under pure-git tier

**[PR #4086](https://github.com/Lucent-Financial-Group/Zeta/pull/4086) state at close:**

- branch: `backlog/b-0613-lior-loop-lockfile-probe-hardening-2026-05-17`
- state: OPEN, auto-merge SQUASH armed
- mergeState: BLOCKED on checks running
- ready for Lior-quiet-window pickup of the actual `.gemini/bin/lior-loop-tick.ts:11` script fix

## Session arc continues

The 1010Z autonomous-loop session arc continues past the PR #4059 merge:

| Phase | Ticks | Substrate |
|---|---|---|
| **Phase 1 — Imaginary Stack** | 1019Z → 1317Z (10 ticks) | PR #4059 merged at 13:17:34Z |
| **Phase 2 — Recursive close-out** | 1320Z → 1345Z (peer + 1327Z + 1339Z + 1345Z) | Peer's PR #4082 captures arc-closing substrate; my meta-shards redundant |
| **Phase 3 — B-0613 follow-up substrate** | **1356Z + 1404Z (this)** | **PR #4086 opened + auto-armed** |

The session-arc continues to compound substrate value rather than just resting at the PR #4059 merge. The B-0613 row formalizes substrate-honest deferrals from #4059's review threads into discoverable BACKLOG.md entries. Each compounded artifact reduces future-Otto cold-boot search cost.

## Composes with

- All session-arc ticks 1019Z → 1356Z (15 tick shards on main or local + this 1404Z)
- [B-0613 row](../../../../../../docs/backlog/P3/B-0613-lior-loop-lockfile-probe-hardening-compgen-shopt-nullglob-2026-05-17.md) — the substrate this tick formalized
- [PR #4059](https://github.com/Lucent-Financial-Group/Zeta/pull/4059) (`1757522`) — parent that surfaced the 5 lior-loop-tick.ts threads
- [PR #4086](https://github.com/Lucent-Financial-Group/Zeta/pull/4086) — this tick's substantive PR-open
- [peer's c95e396 memo](https://github.com/Lucent-Financial-Group/Zeta/commit/c95e396) — the substrate B-0613 makes discoverable in BACKLOG.md
- [`.claude/rules/refresh-world-model-poll-pr-gate.md`](../../../../../../.claude/rules/refresh-world-model-poll-pr-gate.md) — rate-limit tier navigation worked (pure-git → Normal in single tick)