Skip to content

fix(B-0402): shadow observer freshness-threshold guard (fix #3 — prevent new-console zsh abort)#3956

Merged
AceHack merged 2 commits into
mainfrom
fix/b-0402-shadow-freshness-threshold-2026-05-16
May 16, 2026
Merged

fix(B-0402): shadow observer freshness-threshold guard (fix #3 — prevent new-console zsh abort)#3956
AceHack merged 2 commits into
mainfrom
fix/b-0402-shadow-freshness-threshold-2026-05-16

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 16, 2026

Summary

Fix #3 for the 2026-05-16 shadow loop bug: a pid-aware freshness threshold prevents the observer from running AX queries + restore-arrow keypresses on newly-focused terminal windows that are still in shell-init. Aaron experienced the bug as "aborted my zsh user stuff" requiring a reboot to recover.

Root cause

The disabled production plist (com.zeta.shadow-observer.plist.disabled-2026-05-16T20-42-35Z) ran with --dry-run --restore-arrow --loop-interval 2000. The dry-run flag only suppresses the destructive Enter-to-accept; --restore-arrow runs unconditionally per the existing plist comment ("the two flags are orthogonal"). Result: every 2 seconds, restore-arrow.applescript fired in the frontmost terminal — sampling AXValue via AXFocusedUIElement, pressing key code 124 (right arrow), then re-sampling. On a newly-opened terminal mid-zsh init, this combination (AX queries on partially-built terminal widget + keypress injection during zsh setup) could plausibly abort init.

Fix

Pid-aware freshness threshold: before any AX-heavy work in runOneCycle, query frontmost app pid+name via a new cheap AppleScript (get-frontmost-state.applescript); skip the cycle entirely if that window has been frontmost for less than freshnessThresholdMs. Default 5000ms in production (set explicitly via the launchd plist's --freshness-threshold-ms 5000 flag).

Diff

  • tools/shadow/get-frontmost-state.applescript (NEW, 32 lines) — minimal pid+name query, no AX traversal, no key injection
  • tools/shadow/shadow-observer.ts (+120 lines): FrontmostState type, getFrontmostState fn, checkWindowFresh pure fn, frontmostHistory Map, _resetFrontmostHistory test helper, freshnessThresholdMs config field, CLI flag, early-exit in runOneCycle with new "skipped-fresh-window" verdict
  • tools/shadow/shadow-observer.test.ts (+52 lines): freshnessThresholdMs: 0 added to existing 3 fixtures, 4 new tests for freshness behavior (66 pass total, was 62)
  • tools/shadow/launchd/com.zeta.shadow-observer.plist (+9 lines): --freshness-threshold-ms 5000 in ProgramArguments; comment block explaining fix Round 27 — plugin API + governance split + memory-in-repo #3 rationale

CLI default = 0 (disabled); production default = 5000 (enabled via plist)

Tests + ad-hoc CLI invocations get deterministic first-cycle detection; production safety choice is visible in the plist config rather than hidden in a CLI default. This composes with the explicit-flag-in-deployment-config discipline.

Test plan

  • All 66 existing tests pass (+4 new freshness tests)
  • markdownlint clean
  • plist plutil -lint passes (template syntax)
  • Post-merge: run bun tools/shadow/launchd/install-launchagent.ts to reinstall plist with new flag
  • Post-install: verify new terminal opens without zsh-init interference

Composes with

  • B-0402 (shadow observer original substrate)
  • The 2026-05-16 disabled plist (root-cause evidence)
  • The existing detect-grey-text.applescript lines 40-49 comment (prior failure-mode preservation — same family: querying AX of unprepared windows has side effects)
  • .claude/rules/shadow-star-shorthand-autocomplete-marker.md (the shadow infrastructure that produces (shadow*) markers when accepted suggestions get shipped)

🤖 Generated with Claude Code

…sh abort)

2026-05-16 bug: with the disabled production plist's flags
(--dry-run --restore-arrow --loop-interval 2000), the shadow loop
queried AX + pressed → in the frontmost terminal every 2s. On a
NEWLY-OPENED terminal mid-zsh init, the AX queries + keypress could
interfere with zsh interactive startup (Aaron experienced this and
had to reboot to recover — "aborted my zsh user stuff").

Fix #3 (Aaron-picked option of three): pid-aware freshness threshold.
Before any AX-heavy work in runOneCycle, query frontmost app pid+name
via the new cheap get-frontmost-state.applescript; skip the cycle if
that window has been frontmost for less than freshnessThresholdMs.

Production plist sets --freshness-threshold-ms 5000 explicitly.
CLI default remains 0 (disabled) so test fixtures + ad-hoc CLI invocations
see deterministic detection on first cycle — the production safety
choice is visible in the plist config rather than hidden in a CLI default.

Changes:
- get-frontmost-state.applescript: NEW (15 lines) — minimal pid+name query
- shadow-observer.ts: +FrontmostState type, +getFrontmostState fn,
  +checkWindowFresh pure fn, +frontmostHistory Map, +_resetFrontmostHistory
  test helper, +freshnessThresholdMs config field, +CLI flag, +early-exit
  in runOneCycle with new "skipped-fresh-window" verdict
- shadow-observer.test.ts: +freshnessThresholdMs:0 to existing fixtures,
  +4 new tests for freshness behavior (66 pass total, was 62)
- launchd plist: +--freshness-threshold-ms 5000 in ProgramArguments,
  + plist comment block explaining the fix's rationale

Aaron's directive: "go with (a) (shadow*) maybe you can do whatever
now i now how to fix it if it's an issue" — option (a) was implement
fix #3 before re-enabling.

Re-enabling launchd is a separate follow-up — once this lands on main,
run `bun tools/shadow/launchd/install-launchagent.ts` to reinstall the
template with the new flag.

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 16, 2026 22:20
@AceHack AceHack enabled auto-merge (squash) May 16, 2026 22:20
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68d001e250

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tools/shadow/shadow-observer.ts
Comment thread tools/shadow/shadow-observer.ts
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a pid-aware “freshness threshold” guard to prevent AX queries + key injection from running immediately after a terminal window becomes frontmost (mitigating the reported zsh init abort), with a production-default enabled via launchd.

Changes:

  • Introduces a cheap AppleScript to read frontmost app pid+name and a history-based freshness check to skip “too fresh” windows.
  • Extends runOneCycle with an early “skipped-fresh-window” verdict and adds --freshness-threshold-ms CLI parsing + config plumbing.
  • Updates launchd plist to enable the guard in production and adds unit tests for freshness behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
tools/shadow/shadow-observer.ts Implements freshness guard (frontmost state query, history tracking, new verdict, new CLI flag).
tools/shadow/shadow-observer.test.ts Adds tests for checkWindowFresh and verifies runOneCycle skips detection for fresh windows.
tools/shadow/launchd/com.zeta.shadow-observer.plist Enables production safety via --freshness-threshold-ms 5000 and documents rationale.
tools/shadow/get-frontmost-state.applescript New minimal frontmost pid+name query used by the guard.

Comment thread tools/shadow/shadow-observer.ts
Comment thread tools/shadow/shadow-observer.ts
Comment thread tools/shadow/shadow-observer.ts
Comment thread tools/shadow/shadow-observer.ts Outdated
…y, docstring

Three substantive fixes from PR #3956 review:

1. **Null-state skip (P1)** — `getFrontmostState`'s docstring contract says
   caller treats null as "unknown, skip cycle." Previous code only skipped
   when state was non-null and too-fresh; a probe failure (no accessibility
   permission, osascript error) let the cycle proceed into AX-heavy work
   anyway, undermining the guard. Now: null also returns
   "skipped-fresh-window" with a "frontmost-unknown" log entry.

2. **Bounded history** — `frontmostHistory` was module-level and never
   pruned, so a long-running launch agent could accumulate entries across
   pid churn. Now capped at `FRONTMOST_HISTORY_MAX = 256`; on overflow
   the oldest insertion is evicted (Map preserves insertion order).

3. **Docstring honesty** — `checkWindowFresh`'s comment claimed "Pure"
   but it mutates `history`. Replaced with side-effect-naming description.

Plus 2 new tests covering both fixes.

Composes with `.claude/rules/blocked-green-ci-investigate-threads.md` —
findings classified, fixed where substantive, replies cite verification.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AceHack AceHack merged commit 3cad455 into main May 16, 2026
30 checks passed
@AceHack AceHack deleted the fix/b-0402-shadow-freshness-threshold-2026-05-16 branch May 16, 2026 23:51
AceHack added a commit that referenced this pull request May 17, 2026
…→ merged (#3980)

* docs(tick): 2349Z Otto-CLI background worker — PR #3956 thread sweep → merged

PR #3956 (B-0402 shadow freshness fix #3) cleared BLOCKED → CLEAN →
MERGED via 6-thread substrate-honest sweep. Four threads (1, 3, 5, 6)
were stale acknowledgment work — fixes already shipped in commit
bdc56f2. Thread #4 verified as false positive against source. Thread
#2 acknowledged as real per-window-vs-process refinement gap, deferred
to B-0402 follow-up scope.

Verified each finding against source at PR head before mutating
threads per blocked-green-ci-investigate-threads.md suspect-by-default
discipline.

GraphQL budget on session start: 136/5000 → pure-git tier. Post-work:
46/5000. Reset 23:58Z. --all-open batch poll deferred to post-reset
cron tick.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(tick): correct .claude/rules link depth (5→6) per Copilot #3980 review

* fix(tick): markdownlint MD038/MD056 on 2349Z.md table row 4

Bare-pipe code spans (`"|"` and `|`) confused the markdownlint table
parser into counting 7 cells where 5 were expected (MD056), and the
broken code-span boundaries triggered MD038 false-positives. Reworded
row 4 to describe pipe handling in prose without code-span pipes.

Verified locally: markdownlint-cli2 on the file exits 0.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants