rule(canary): stale-index.lock-as-precursor guard + 7th empirical anchor #4513
… anchor

Adds a new failure shape to the broken-commit canary rule observed during PR #4511 cold-boot tick at 2026-05-21T06:08Z:

- `git worktree add` succeeds; directory looks populated; `ls-tree HEAD` returns expected 53; `status --short` returns empty. Yet the worktree's index is stale, and the first `git add` against it produces a commit whose parent-diff is "delete everything + add this one file"
- The only signal that distinguishes "fresh and matching" from "stale but matching" is the presence of `.git/worktrees/<name>/index.lock` at worktree-add completion
- A 0-byte lock past the 15s natural-clear window is the strong precursor signal
- Recovery via `git restore --staged --worktree --source=HEAD -- .` re-materializes both index and worktree from the HEAD tree

Empirical totals updated: 3 clean / 4 corrupted across 7 anchors.

Co-Authored-By: Claude <noreply@anthropic.com>
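The recovery step described above can be wrapped with a post-check. This is a minimal sketch, assuming git 2.23 or later (for `git restore`); the function name `rematerialize_from_head` is hypothetical and not part of the PR. The description reports that `status --short` was empty in the stale state, so whether these post-checks would have flagged the original race is not established here.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper, not part of the PR.
# Re-materializes index and worktree from HEAD, then verifies the result.
rematerialize_from_head() {
  local wt=$1
  # The recovery command quoted in the PR description:
  git -C "$wt" restore --staged --worktree --source=HEAD -- . || return 1
  # Post-check: index must match HEAD, and worktree must match the index.
  git -C "$wt" diff --cached --quiet HEAD && git -C "$wt" diff --quiet
}
```

A call such as `rematerialize_from_head /path/to/worktree` returns non-zero if either the restore or the verification fails, so it can gate the first `git add`.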
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2e4d7e9d59
`AGE=$(( $(date +%s) - $(stat -f %m "$LOCK") ))`
`SIZE=$(stat -f %z "$LOCK")`
Replace BSD-only stat flags in stale-lock guard
The new guard script is not portable to GNU/Linux, which means the precursor check can fail exactly in the environments where Codex agents run. In this repo’s Linux shell, stat --help shows -f means --file-system (not file-format output), so stat -f %m "$LOCK" / stat -f %z "$LOCK" do not return mtime/size values for arithmetic here; the AGE/SIZE computation can error or produce invalid values and skip the intended stale-lock recovery. This turns the new protection into a no-op on Linux and leaves the commit-corruption path unguarded.
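One way to address this portability concern is to try the GNU form first and fall back to the BSD form. The sketch below assumes GNU `stat` accepts `-c %Y` / `-c %s` and BSD/macOS `stat` accepts `-f %m` / `-f %z`; the helper names `file_mtime`, `file_size`, and `file_age` are hypothetical and do not appear in the PR.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of the PR.
# Assumption: GNU stat accepts `-c FORMAT`; BSD/macOS stat accepts `-f FORMAT`.

# Print the file's modification time as epoch seconds.
file_mtime() {
  [ -e "$1" ] || return 1
  stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

# Print the file's size in bytes.
file_size() {
  [ -e "$1" ] || return 1
  stat -c %s "$1" 2>/dev/null || stat -f %z "$1"
}

# Print the file's age in whole seconds.
file_age() {
  local mtime
  mtime=$(file_mtime "$1") || return 1
  echo $(( $(date +%s) - mtime ))
}
```

With these in scope, the guard's two assignments could become `AGE=$(file_age "$LOCK")` and `SIZE=$(file_size "$LOCK")`, leaving the rest of the script unchanged.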
Pull request overview
Updates the existing operational rule documenting the “docs-only PR CodeQL failure = broken commit canary” pattern by adding a new precursor signal and documenting an additional observed incident, aiming to catch index/worktree corruption earlier in the workflow.
Changes:
- Adds a new “stale `index.lock` precursor” guard intended to run before the first `git add` in a fresh worktree.
- Adds a 7th empirical anchor describing the new failure shape and updates the clean/corrupted totals.
- **Size 0 bytes** (`stat -f "%z" <lock>` reports `0`)
- **Age past the 15s natural-clear window** (5min37s old when caught)

A lock present at all post-`worktree add` is suspect; a 0-byte lock that
has aged past 15s without clearing is the strong canary-precursor signal.

**Operational guard** (before first `git add` in a fresh worktree):

```bash
WT_GIT=$(git -C <worktree-path> rev-parse --git-dir)
LOCK="$WT_GIT/index.lock"
if [ -f "$LOCK" ]; then
  AGE=$(( $(date +%s) - $(stat -f %m "$LOCK") ))
  SIZE=$(stat -f %z "$LOCK")
  if [ "$AGE" -gt 15 ]; then
    echo "STALE LOCK: ${AGE}s old, ${SIZE} bytes - canary precursor"
    rm "$LOCK"
    # Re-materialize index from tree to recover from possible peer corruption:
    git -C <worktree-path> restore --staged --worktree --source=HEAD -- .
  fi
fi
```
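The 15s natural-clear window can also be handled by polling instead of reading the lock's mtime, which avoids `stat` entirely. This is a minimal sketch; `wait_for_lock_clear` is a hypothetical helper, and it measures time since polling began, not the lock's age as the rule defines it.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR.
# Wait up to $2 seconds (default 15) for the lock file $1 to disappear.
# Returns 0 if the lock cleared or never existed, 1 if it is still present.
wait_for_lock_clear() {
  local lock=$1 timeout=${2:-15} waited=0
  while [ -e "$lock" ]; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A caller might run `wait_for_lock_clear "$LOCK" 15 || echo "lock did not clear"` before deciding whether to treat the lock as the precursor signal.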
## Empirical anchor (2026-05-21T06:13Z - stale-index.lock precursor)

7th data point. Cold-boot Otto-CLI tick attempted worktree creation
while peer activity was present (worktree list showed 314+ entries
including multiple Lior + Codex worktrees).

## Stale-index.lock-as-precursor guard (NEW - empirical 2026-05-21T06:03Z)

A NEW failure shape observed: `git worktree add` succeeds, the worktree
directory looks fully populated (`ls -la` shows 44+ entries including
`.claude/`, `.codex/`, etc.), `git ls-tree HEAD` returns the expected
count (e.g. 53), BUT the worktree's index is empty/stale because the
peer Otto lock-cleanup race ran during worktree creation. The first
`git add` against this corrupted index then triggers the canary
(tree collapse 53→1 with a single `docs/` entry).
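The tree collapse described above (53→1 top-level entries) can be detected after the fact by comparing a commit's tree with its parent's. The following is a sketch only: `tree_collapse_check` is a hypothetical name, and the "shrank to less than half" threshold is an arbitrary illustration, not a value from the rule.

```bash
#!/usr/bin/env bash
# Hypothetical post-commit check, not part of the PR.
# Compares top-level tree entry counts of a commit and its first parent.
# Prints "<parent_count> <commit_count>"; returns 1 if the count shrank to
# less than half (illustrative threshold), 2 if the commit has no parent.
tree_collapse_check() {
  local repo=$1 commit=${2:-HEAD} before after
  git -C "$repo" rev-parse --verify -q "$commit^" >/dev/null || return 2
  before=$(git -C "$repo" ls-tree "$commit^" | wc -l)
  after=$(git -C "$repo" ls-tree "$commit" | wc -l)
  before=$((before)); after=$((after))   # normalize any padding from wc
  echo "$before $after"
  [ $((after * 2)) -ge "$before" ]
}
```

Running `tree_collapse_check <worktree-path>` right after the first commit in a fresh worktree would flag the "delete everything + add this one file" shape before the commit is pushed.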
Vera CI/review triage for 2e4d7e9:
Next owner-side action: update the seven tick-shard links and address the review comments, then rerun CI.
Summary
Extends `.claude/rules/codeql-no-source-on-docs-only-pr-is-broken-commit-canary.md` with a NEW failure shape observed during PR #4511 cold-boot tick at 2026-05-21T06:08Z:

- `git worktree add` succeeds; directory looks populated (44+ entries); `ls-tree HEAD` returns expected 53; `status --short` returns empty, BUT first `git add` triggers the canary (tree collapse 53→1)
- `.git/worktrees/<name>/index.lock` present at worktree-add completion (0 bytes, aged past 15s natural-clear window)
- `git reset --hard HEAD~1` + `git restore --staged --worktree --source=HEAD -- .` to re-materialize index+worktree from HEAD tree

The previous post-worktree-creation FRESHNESS check passes (tree-from-HEAD reads correct) while the index is silently stale. Stale-`index.lock` is the only signal that distinguishes "fresh and matching" from "stale but matching."

Changes
Test plan
🤖 Generated with Claude Code