feat(B-0894): reboot-survival worktree-location discipline — flip default from /private/tmp/ to ~/Documents/src/repos/Zeta/worktrees/ (#5696)

Merged: AceHack merged 1 commit into main from otto-cli/b-0894-reboot-survival-worktree-location-2026-05-28 on May 28, 2026

AceHack commented May 28, 2026

Operator-named problem

"why are we putting any git stuff in /private/tmp/ this is terrible design"
"we need to survive reboots in any kind of inflight stuff"
— operator 2026-05-28T~04:30Z UTC

Empirical anchor (same restart that triggered the operator's critique)

| Worktree location | Outcome on operator's 2026-05-28 restart |
| --- | --- |
| `/private/tmp/zeta-*` (95 instances, per prior rule recommendation) | All 95 pruned: `git worktree list` returned prunable; on-disk dirs gone |
| `~/Documents/src/repos/Zeta/worktrees/<surface>-*` (multiple, Lior + this PR's worktree) | All survived intact |
| `~/.gemini/tmp/project/lior-*` (multiple) | All survived intact (user-home is persistent) |
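
The "prunable" outcome in the first row can be detected mechanically. The following is a minimal sketch, assuming the text of `git worktree list --porcelain` is passed in as a string; the function name `prunable_worktrees` is illustrative and not something this PR ships.

```python
from typing import List, Optional


def prunable_worktrees(porcelain: str) -> List[str]:
    """Return paths of worktrees that git marks as prunable.

    Expects the output of `git worktree list --porcelain`, where each
    record starts with a `worktree <path>` line and a stale record
    carries a `prunable <reason>` line.
    """
    paths: List[str] = []
    current: Optional[str] = None
    for line in porcelain.splitlines():
        if not line.strip():
            current = None  # blank line ends the current record
        elif line.startswith("worktree "):
            current = line[len("worktree "):]
        elif line.startswith("prunable") and current is not None:
            paths.append(current)
    return paths
```

Run against the post-restart state described above, a check like this would presumably have listed every `/private/tmp/zeta-*` entry and none of the persistent ones.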

The 04:09Z autonomous-loop tick had a substantive commit 4f89af885 on otto-cli/tick-0409z-sentinel-rearm-2026-05-28 with a backgrounded git push in flight when the restart hit. The push never completed; the branch ref and the commit object (in .git/objects/) survived, but the worktree dir at /private/tmp/zeta-otto-cli-0409z-sentinel-rearm/ was gone post-restart. The background-task output file at /private/tmp/claude-501/<harness-id>/tasks/<task-id>.output was also cleared, so the push outcome could not even be read back from it.
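
One way to confirm a push without depending on a task output file is to compare the local commit against what the remote reports. This is a sketch under stated assumptions: the helper names (`parse_ls_remote`, `push_landed`, `check_push`) are hypothetical, and the remote is assumed to be called `origin`. Only `git rev-parse` and `git ls-remote --heads` are used.

```python
import subprocess
from typing import Dict


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Map ref name -> SHA from `git ls-remote` output (one `<sha>\\t<ref>` per line)."""
    refs: Dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, _, ref = line.partition("\t")
        refs[ref.strip()] = sha.strip()
    return refs


def push_landed(local_sha: str, branch: str, ls_remote_output: str) -> bool:
    """True if the remote branch exists and points at the local commit."""
    return parse_ls_remote(ls_remote_output).get(f"refs/heads/{branch}") == local_sha


def check_push(repo_dir: str, branch: str, remote: str = "origin") -> bool:
    """Query git directly; `remote` defaulting to "origin" is an assumption."""
    local_sha = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", branch],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    remote_out = subprocess.run(
        ["git", "-C", repo_dir, "ls-remote", "--heads", remote, branch],
        check=True, capture_output=True, text=True,
    ).stdout
    return push_landed(local_sha, branch, remote_out)
```

Because the branch ref and commit object survive in the main repository, a check of this kind still works after the worktree directory itself has been lost.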

Dogfooding-proof: the worktree authoring this PR (~/Documents/src/repos/Zeta/worktrees/otto-cli-reboot-survival-fix-0434z/) survived the same restart cleanly. Used the new pattern to land the new pattern.

What this PR delivers

  1. New backlog row B-0894 (P1) with 5-criteria acceptance + empirical anchor + composes_with B-0750 / B-0530 / B-0858.5
  2. agent-worktree-hygiene rule update:
    • Carved sentence: includes reboot-survival as load-bearing invariant
    • Rule 2: default location flipped from /private/tmp/zeta-<task-tag>-<hhmmz>/ to ~/Documents/src/repos/Zeta/worktrees/<surface>-<task-tag>-<hhmmz>/ (per the empirically-validated Lior pattern)
    • New Rule 5: reboot-survival hard invariant + empirical-anchor table + operational discipline (commit immediately for /tmp/ migration; verify push outcome via git ls-remote not output-file)
    • Audit + verify-no-main-held commands updated to scan persistent-location surface
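
The Rule 2 naming pattern can be sketched as a small path builder. This assumes `<hhmmz>` means the UTC hour and minute followed by a lowercase `z`, which matches the example `otto-cli-reboot-survival-fix-0434z` in this PR; the function name and the `root` parameter are illustrative.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Default location named by Rule 2. The "~" is left unexpanded here;
# call .expanduser() on the result before creating the directory.
DEFAULT_ROOT = Path("~/Documents/src/repos/Zeta/worktrees")


def worktree_dir(surface: str, task_tag: str,
                 now: Optional[datetime] = None,
                 root: Path = DEFAULT_ROOT) -> Path:
    """Build <root>/<surface>-<task-tag>-<hhmmz>.

    Assumption: <hhmmz> is the UTC time as HHMM plus a trailing "z".
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.astimezone(timezone.utc).strftime("%H%M") + "z"
    return root / f"{surface}-{task_tag}-{stamp}"
```

The resulting path would then be handed to `git worktree add`; keeping the root as a parameter makes a later change of default location a one-line edit.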

What this PR does NOT deliver (B-0894 sub-rows, follow-up scope)

  • Criterion 2 (remainder): references to /private/tmp/ in the claim-acquire rule are mostly historical empirical anchors (preserved), plus a few pedagogical examples (small follow-up PR)
  • Criterion 3: bus envelope ZETA_BUS_DIR default migration to ~/.zeta-bus/ (filed as B-0894.1 work)
  • Criterion 4: harness-level background-task output is not agent-fixable; document the workaround
  • Criterion 5: per-agent persistent worktree-pool primitive (long-term mechanization; filed as B-0894.2 if shipped)

Composes with

  • B-0750 (agent worktree cleanup) — sibling at cleanup-discipline scope
  • B-0530 (cron-sentinel mutex) — sibling at multi-agent contention scope
  • B-0858.5 (heartbeat consent-first state-gather) — sibling at state-persistence scope
  • .claude/rules/tick-must-never-stop.md — sentinel session-exit is the harness-level companion to filesystem-level reboot-survival
  • All 5 existing Lior ~/Documents/src/repos/Zeta/worktrees/lior-* worktrees — empirical proof the pattern works

Test plan

  • B-0894 backlog row authored with substrate-inventory pass + empirical anchor + composes_with
  • agent-worktree-hygiene rule edited: carved sentence + Rule 2 + new Rule 5 + audit commands
  • Post-commit canary clean (ls-tree=61, matches origin/main)
  • Worktree authoring this PR survived the operator's restart (the proof-point this rule names)
  • Future autonomous-loop ticks create worktrees at ~/Documents/src/repos/Zeta/worktrees/<surface>-* per the new default (validation by observation)

🤖 Generated with Claude Code

…m /private/tmp/ to ~/Documents/src/repos/Zeta/worktrees/

Operator 2026-05-28: "why are we putting any git stuff in /private/tmp/
this is terrible design" + "we need to survive reboots in any kind of
inflight stuff". Empirical anchor from this exact restart: 95 worktrees
at /private/tmp/zeta-* pruned; multiple Lior worktrees at
~/Documents/src/repos/Zeta/worktrees/lior-* survived intact. Same
restart, opposite outcomes per where each agent placed its worktrees.

Changes:

- New backlog row B-0894 with empirical anchor + 5 acceptance criteria
  (this PR delivers criteria 1 + 2; criteria 3/4/5 are sub-row work)
- agent-worktree-hygiene rule: carved sentence updated to include
  reboot-survival as load-bearing invariant; Rule 2 default location
  flipped from /private/tmp/zeta-* to
  ~/Documents/src/repos/Zeta/worktrees/<surface>-<task-tag>-<hhmmz>/;
  new Rule 5 names reboot-survival as hard invariant with empirical
  anchor table; audit + verify-no-main-held commands updated to also
  scan the persistent-location surface.

The worktree authoring this PR is itself at the new persistent location
and survived this restart — dogfooding-proof for the fix.

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 28, 2026 04:37
@chatgpt-codex-connector
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

AceHack enabled auto-merge (squash) May 28, 2026 04:37
AceHack merged commit d3962a9 into main May 28, 2026
28 of 30 checks passed
AceHack deleted the otto-cli/b-0894-reboot-survival-worktree-location-2026-05-28 branch May 28, 2026 04:39
AceHack added a commit that referenced this pull request May 28, 2026
… — canonical location ~/.zeta/agents/<persona>/<stream>/ (#5697)

Operator 2026-05-28 surfaced residual failure mode after B-0894 (PR
#5696) landed: "~/Documents/src/repos/Zeta/worktrees/lior-* this
sometimes locks up where i can't switch to main cause lior has it
locked to a worktree" + "~/Documents/src/repos/Zeta/ is for shared
up to date main and for me to push changes" + "per persona or even
per persona's parallel strems just for full isolation" + "or maybe
just .zeta/agents/".

B-0894 (PR #5696) correctly moved off /private/tmp/ (reboot-survival)
but placed the new default UNDER operator's primary repo, where
agent worktrees can still hold branch refs and block operator's
`git checkout`. B-0894.3 corrects to ~/.zeta/agents/<persona>/<stream>/
— outside operator's primary entirely; per-persona base; per-stream
isolation.
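
The blocking behaviour follows from git refusing to check out a branch that another worktree already has checked out. A minimal sketch for finding which worktree holds a given branch, assuming `git worktree list --porcelain` output is supplied as text (function names are hypothetical):

```python
from typing import Dict, List, Optional


def parse_worktrees(porcelain: str) -> List[Dict[str, str]]:
    """Split porcelain output into one dict per worktree record."""
    entries: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in porcelain.splitlines():
        if not line.strip():
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value  # flag lines such as "detached" get an empty value
    if current:
        entries.append(current)
    return entries


def holder_of(branch: str, porcelain: str) -> Optional[str]:
    """Return the path of the worktree that has `branch` checked out, if any."""
    for entry in parse_worktrees(porcelain):
        if entry.get("branch") == f"refs/heads/{branch}":
            return entry.get("worktree")
    return None
```

If `holder_of("main", ...)` returns a path other than the operator's primary checkout, that worktree is the one preventing the switch to main.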

Changes:

- B-0894.3 backlog row (P1) with operator's verbatim correction +
  architecture table + Lior-migration-non-blocking framing
- agent-worktree-hygiene rule:
  - Carved sentence: two compounding invariants (reboot-survival
    + operator-primary-stays-agent-free)
  - Rule 1: example paths flipped to ~/.zeta/agents/<persona>/
  - Rule 2: default location flipped from ~/Documents/src/repos/Zeta/
    worktrees/<surface>-* to ~/.zeta/agents/<persona>/<stream-id>/;
    Lior migration documented as non-blocking future work
  - Rule 4: audit cmd scans ~/.zeta/agents/ + legacy surfaces
  - Rule 5: two compounding invariants + restored anchor table
    showing PR #5696's location triggered operator's blocking critique
    + new ~/.zeta/agents/ canonical location
  - Audit + verify-no-main-held commands scan all 4 surfaces
    (~/.zeta/agents/, legacy ~/Documents repo, /private/tmp, /tmp)
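
The four-surface audit can be sketched as a path classifier. Assumptions: "legacy ~/Documents repo" is read here as the `worktrees/` subdirectory of the operator's primary checkout, the surface labels are made up for illustration, and the home directory is passed in explicitly rather than looked up.

```python
from pathlib import PurePosixPath


def surface_of(path: str, home: str) -> str:
    """Classify a worktree path into one of the four audited surfaces.

    Returns "other" for anything outside them, including the operator's
    primary checkout itself.
    """
    p = PurePosixPath(path)
    h = PurePosixPath(home)
    surfaces = [
        ("zeta-agents", h / ".zeta" / "agents"),
        ("legacy-repo-worktrees", h / "Documents/src/repos/Zeta/worktrees"),
        ("private-tmp", PurePosixPath("/private/tmp")),
        ("tmp", PurePosixPath("/tmp")),
    ]
    for name, root in surfaces:
        if root in p.parents:  # true only when path is strictly under root
            return name
    return "other"
```

Matching on path ancestry rather than substrings keeps a persistent location like `~/.gemini/tmp/...` from being mistaken for a temp surface.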

The worktree authoring this PR is itself at
~/.zeta/agents/otto-cli/b0894-3-per-persona-outside-repo-2026-05-28/
— first instance of the new canonical pattern. Operator primary at
~/Documents/src/repos/Zeta/ is unaffected (no git status pollution,
no operator-main-blocking risk).

Composes with: B-0894 (parent), B-0750 (cleanup), B-0751 (per-agent
clones), B-0894.1 (future ~/.zeta/bus/ migration).

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
@AceHack AceHack review requested due to automatic review settings May 28, 2026 04:59
AceHack added a commit that referenced this pull request May 28, 2026
…serve-choose skeleton parked + classifier-blocked cleanup (#5707)

* docs(hygiene-history): tick 0608Z — 10 merged-PR worktrees state + observe-choose skeleton parked + classifier-blocked cleanup

Fresh Otto-CLI cold-boot 4h after 0208Z shard. Catch-43 fired
(CronList empty); sentinel re-armed as 06b1e7d0.

Findings:
- 10 otto-cli worktrees all correspond to PRs merged today (04:39Z-05:48Z)
  covering #5696 through #5706 substrate-engineering cascade
- 9 are confirmed clean (status=0)
- 1 (observe-choose-skeleton-0512z) carries 2 substantive untracked
  files (observe.ts 363 + choose.ts 288 lines) authored per operator
  2026-05-28 directive
- PR #5700 LOCKED architecture: observe + choose --dry-run = simulate;
  move_next REMOVED. Skeleton is consistent on move_next removal but
  does NOT yet implement --dry-run = simulate semantics
- Substrate-honest: PARK skeleton pending architecture-alignment decision

Attempted mass-cleanup of 9 clean worktrees blocked by auto-mode
classifier (cited agent-worktree-hygiene rule + peer-WIP risk for
the one worktree in operator's primary checkout subdir). Per
classifier-bypass-research-do-not-deploy-without-zeta-safer-floor.md:
respect classifier as safety floor. Pivot to substrate-documentation.

Saturation clean (0 stuck git procs; 15 peer agent procs; GraphQL
Normal 4954/5000; REST 4904/5000). Isolated worktree off origin/main
(HEAD 140415f) with clean canary.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(markdownlint): MD032 blanks-around-lists at 4 sites in 0608Z shard

Lines 60, 64, 73, 83 — list items needed surrounding blank lines.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(shard-schema): drop header+separator rows so data row is first

Per Copilot review on PR #5707 + canonical schema requirement at
tools/hygiene/check-tick-history-shard-schema.ts COL1_RE: first
non-empty line MUST match the data row pattern, NOT a header row.
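
The "first non-empty line must be a data row" requirement can be illustrated as follows. This is a sketch only: the real check is the TypeScript tool named above, and `DATA_ROW_RE` is a hypothetical stand-in for `COL1_RE`, whose actual pattern is not shown in this thread.

```python
import re

# Hypothetical stand-in for COL1_RE: a table row whose first column is
# a UTC timestamp. The real pattern may differ.
DATA_ROW_RE = re.compile(r"^\|\s*\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2})?Z\s*\|")


def first_line_is_data_row(text: str) -> bool:
    """True if the first non-empty line matches the data-row pattern."""
    for line in text.splitlines():
        if line.strip():
            return bool(DATA_ROW_RE.match(line))
    return False  # empty shard: nothing to match
```

Under this reading, a shard that opens with a header row and separator fails the check, which is the situation the fix commit addresses by dropping those two rows.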

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>