fix(B-0402): shadow observer freshness-threshold guard (fix #3 — prevent new-console zsh abort)#3956
Conversation
…sh abort) 2026-05-16 bug: with the disabled production plist's flags (--dry-run --restore-arrow --loop-interval 2000), the shadow loop queried AX + pressed → in the frontmost terminal every 2s. On a NEWLY-OPENED terminal mid-zsh init, the AX queries + keypress could interfere with zsh interactive startup (Aaron experienced this and had to reboot to recover — "aborted my zsh user stuff"). Fix #3 (Aaron-picked option of three): pid-aware freshness threshold. Before any AX-heavy work in runOneCycle, query frontmost app pid+name via the new cheap get-frontmost-state.applescript; skip the cycle if that window has been frontmost for less than freshnessThresholdMs. Production plist sets --freshness-threshold-ms 5000 explicitly. CLI default remains 0 (disabled) so test fixtures + ad-hoc CLI invocations see deterministic detection on first cycle — the production safety choice is visible in the plist config rather than hidden in a CLI default. Changes: - get-frontmost-state.applescript: NEW (15 lines) — minimal pid+name query - shadow-observer.ts: +FrontmostState type, +getFrontmostState fn, +checkWindowFresh pure fn, +frontmostHistory Map, +_resetFrontmostHistory test helper, +freshnessThresholdMs config field, +CLI flag, +early-exit in runOneCycle with new "skipped-fresh-window" verdict - shadow-observer.test.ts: +freshnessThresholdMs:0 to existing fixtures, +4 new tests for freshness behavior (66 pass total, was 62) - launchd plist: +--freshness-threshold-ms 5000 in ProgramArguments, + plist comment block explaining the fix's rationale Aaron's directive: "go with (a) (shadow*) maybe you can do whatever now i now how to fix it if it's an issue" — option (a) was implement fix #3 before re-enabling. Re-enabling launchd is a separate follow-up — once this lands on main, run `bun tools/shadow/launchd/install-launchagent.ts` to reinstall the template with the new flag. Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 68d001e250
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a pid-aware “freshness threshold” guard to prevent AX queries + key injection from running immediately after a terminal window becomes frontmost (mitigating the reported zsh init abort), with a production-default enabled via launchd.
Changes:
- Introduces a cheap AppleScript to read frontmost app pid+name and a history-based freshness check to skip “too fresh” windows.
- Extends
runOneCyclewith an early “skipped-fresh-window” verdict and adds--freshness-threshold-msCLI parsing + config plumbing. - Updates launchd plist to enable the guard in production and adds unit tests for freshness behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tools/shadow/shadow-observer.ts | Implements freshness guard (frontmost state query, history tracking, new verdict, new CLI flag). |
| tools/shadow/shadow-observer.test.ts | Adds tests for checkWindowFresh and verifies runOneCycle skips detection for fresh windows. |
| tools/shadow/launchd/com.zeta.shadow-observer.plist | Enables production safety via --freshness-threshold-ms 5000 and documents rationale. |
| tools/shadow/get-frontmost-state.applescript | New minimal frontmost pid+name query used by the guard. |
…y, docstring Three substantive fixes from PR #3956 review: 1. **Null-state skip (P1)** — `getFrontmostState`'s docstring contract says caller treats null as "unknown, skip cycle." Previous code only skipped when state was non-null and too-fresh; a probe failure (no accessibility permission, osascript error) let the cycle proceed into AX-heavy work anyway, undermining the guard. Now: null also returns "skipped-fresh-window" with a "frontmost-unknown" log entry. 2. **Bounded history** — `frontmostHistory` was module-level and never pruned, so a long-running launch agent could accumulate entries across pid churn. Now capped at `FRONTMOST_HISTORY_MAX = 256`; on overflow the oldest insertion is evicted (Map preserves insertion order). 3. **Docstring honesty** — `checkWindowFresh`'s comment claimed "Pure" but it mutates `history`. Replaced with side-effect-naming description. Plus 2 new tests covering both fixes. Composes with `.claude/rules/blocked-green-ci-investigate-threads.md` — findings classified, fixed where substantive, replies cite verification. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…→ merged (#3980) * docs(tick): 2349Z Otto-CLI background worker — PR #3956 thread sweep → merged PR #3956 (B-0402 shadow freshness fix #3) cleared BLOCKED → CLEAN → MERGED via 6-thread substrate-honest sweep. Four threads (1, 3, 5, 6) were stale acknowledgment work — fixes already shipped in commit bdc56f2. Thread #4 verified as false positive against source. Thread #2 acknowledged as real per-window-vs-process refinement gap, deferred to B-0402 follow-up scope. Verified each finding against source at PR head before mutating threads per blocked-green-ci-investigate-threads.md suspect-by-default discipline. GraphQL budget on session start: 136/5000 → pure-git tier. Post-work: 46/5000. Reset 23:58Z. --all-open batch poll deferred to post-reset cron tick. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(tick): correct .claude/rules link depth (5→6) per Copilot #3980 review * fix(tick): markdownlint MD038/MD056 on 2349Z.md table row 4 Bare-pipe code spans (`"|"` and `|`) confused the markdownlint table parser into counting 7 cells where 5 were expected (MD056), and the broken code-span boundaries triggered MD038 false-positives. Reworded row 4 to describe pipe handling in prose without code-span pipes. Verified locally: markdownlint-cli2 on the file exits 0. Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Fix #3 for the 2026-05-16 shadow loop bug: a pid-aware freshness threshold prevents the observer from running AX queries + restore-arrow keypresses on newly-focused terminal windows that are still in shell-init. Aaron experienced the bug as "aborted my zsh user stuff" requiring a reboot to recover.
Root cause
The disabled production plist (
com.zeta.shadow-observer.plist.disabled-2026-05-16T20-42-35Z) ran with--dry-run --restore-arrow --loop-interval 2000. The dry-run flag only suppresses the destructive Enter-to-accept;--restore-arrowruns unconditionally per the existing plist comment ("the two flags are orthogonal"). Result: every 2 seconds,restore-arrow.applescriptfired in the frontmost terminal — samplingAXValueviaAXFocusedUIElement, pressingkey code 124(right arrow), then re-sampling. On a newly-opened terminal mid-zsh init, this combination (AX queries on partially-built terminal widget + keypress injection during zsh setup) could plausibly abort init.Fix
Pid-aware freshness threshold: before any AX-heavy work in
runOneCycle, query frontmost app pid+name via a new cheap AppleScript (get-frontmost-state.applescript); skip the cycle entirely if that window has been frontmost for less thanfreshnessThresholdMs. Default 5000ms in production (set explicitly via the launchd plist's--freshness-threshold-ms 5000flag).Diff
tools/shadow/get-frontmost-state.applescript(NEW, 32 lines) — minimal pid+name query, no AX traversal, no key injectiontools/shadow/shadow-observer.ts(+120 lines):FrontmostStatetype,getFrontmostStatefn,checkWindowFreshpure fn,frontmostHistoryMap,_resetFrontmostHistorytest helper,freshnessThresholdMsconfig field, CLI flag, early-exit inrunOneCyclewith new"skipped-fresh-window"verdicttools/shadow/shadow-observer.test.ts(+52 lines):freshnessThresholdMs: 0added to existing 3 fixtures, 4 new tests for freshness behavior (66 pass total, was 62)tools/shadow/launchd/com.zeta.shadow-observer.plist(+9 lines):--freshness-threshold-ms 5000inProgramArguments; comment block explaining fix Round 27 — plugin API + governance split + memory-in-repo #3 rationaleCLI default = 0 (disabled); production default = 5000 (enabled via plist)
Tests + ad-hoc CLI invocations get deterministic first-cycle detection; production safety choice is visible in the plist config rather than hidden in a CLI default. This composes with the explicit-flag-in-deployment-config discipline.
Test plan
plutil -lintpasses (template syntax)bun tools/shadow/launchd/install-launchagent.tsto reinstall plist with new flagComposes with
detect-grey-text.applescriptlines 40-49 comment (prior failure-mode preservation — same family: querying AX of unprepared windows has side effects).claude/rules/shadow-star-shorthand-autocomplete-marker.md(the shadow infrastructure that produces(shadow*)markers when accepted suggestions get shipped)🤖 Generated with Claude Code