Skip to content

fix(desktop): v1 terminal — emit legacy marker + gate reattach on streamReady#3389

Merged
Kitenite merged 5 commits intomainfrom
prod-terminal-delay-investigation
Apr 12, 2026
Merged

fix(desktop): v1 terminal — emit legacy marker + gate reattach on streamReady#3389
Kitenite merged 5 commits intomainfrom
prod-terminal-delay-investigation

Conversation

@Kitenite
Copy link
Copy Markdown
Collaborator

@Kitenite Kitenite commented Apr 12, 2026

Two independent v1 terminal bugs surfaced together. Both are v1-only — v2 uses in-process host-service with different lifecycle semantics.

1. Shell-ready marker split across main/daemon process boundary

Prod users on v1 report a ~15s stall after a new terminal spawns. Root cause is not a scanner bug but a process boundary:

  • Main process rewrites wrappers via setupAgentHooks() on every launch — after the 133;A refactor, they instantly switch to emitting OSC 133;A.
  • Daemon is detached and only restarts on PROTOCOL_VERSION bump / auth miss / dev mtime check / explicit user action. An old v1 daemon keeps scanning for the legacy OSC 777;superset-shell-ready.

Fresh wrappers emit 133;A, old daemon never matches, shellReadyState sits at pending until the timeout.

#3387 added an OSC 777 fallback inside the daemon, which only helps the reverse case (new daemon + stale wrappers) — not the one users are actually hitting. This PR:

  • Reverts fix(desktop): accept legacy OSC 777 marker in v1 terminal #3387 (daemon-side fallback + 5s timeout shortening).
  • Emits both markers from the wrappers (zsh .zlogin precmd, bash PROMPT_COMMAND, fish fish_prompt). Either-ended compatibility — any daemon version finds its expected marker, no restart needed, fix reaches users on the next app launch.

Unknown OSC sequences are dropped silently by the emulator, so the extra marker is invisible in rendered output.

2. Reattach gate incorrectly trusts cache existence

When tabs are created rapidly, React can remount a pane before its initial createOrAttach finishes. The unmount cancels the in-flight attach but intentionally preserves the v1 terminal cache entry so visible-tab switches reuse the xterm instance.

On remount, useTerminalLifecycle.ts:237 saw a cache entry and took the reattach fast path — skipping createOrAttach and stream subscription setup — leaving the pane permanently disconnected: no daemon session, no subscription, writes routed to a session that doesn't exist.

Fix: gate on the cache entry's streamReady flag instead of mere existence. streamReady is only set after a successful attach and never reset until dispose, so it cleanly distinguishes "cache ready from a prior successful attach" from "cache exists but attach was canceled mid-flight".

Test plan

  • bun test apps/desktop/src/main/lib/agent-setup/shell-wrappers.test.ts passes (25/25)
  • bun run typecheck --filter=@superset/desktop clean
  • Manual: run against an old v1 daemon (pre-fix(desktop): accept legacy OSC 777 marker in v1 terminal #3387 build) with post-refactor wrappers — confirm shell readiness resolves <100ms, not 15s
  • Manual: new daemon + these wrappers — confirm 133;A still wins, no duplicate-ready logging
  • Manual: open 10+ terminal tabs rapidly — all connect, none stuck in permanent loading state
  • Manual: verify nothing OSC-777-shaped leaks into rendered terminal output

Follow-ups (not in this PR)

  • Reverting fix(desktop): accept legacy OSC 777 marker in v1 terminal #3387 also drops the 5s timeout shortening (squashed into the same commit). Happy to restore it separately if we still want that change.
  • Longer-term: either bump PROTOCOL_VERSION or start checking HelloResponse.daemonVersion (currently dead data) so daemon-side fixes can actually reach prod without wrapper-side compatibility shims.

Summary by CodeRabbit

  • Bug Fixes

    • Emit and detect dual shell readiness markers for broader shell compatibility.
    • Remove legacy-only readiness detection so readiness now relies on the unified marker.
    • Increase shell initialization timeout from 5s to 15s for more reliable startups.
    • Fix terminal reattach logic to avoid skipping initial stream setup when prior attach didn’t finish.
  • Tests

    • Updated and added tests to assert presence of both readiness markers and removed legacy-only cases.

…ppers

The v1 terminal-host daemon is detached and survives app updates, only
restarting on PROTOCOL_VERSION bumps. Shell wrappers, by contrast, are
rewritten by setupAgentHooks() on every main-process start. After the
133;A marker refactor, prod users end up with fresh wrappers emitting
133;A while an old v1 daemon still scans only for OSC 777 — readiness
never resolves and stdin stays buffered until the timeout.

Emit both markers from zsh/bash/fish wrappers. Any daemon version finds
its expected marker without needing a restart, so the fix reaches users
immediately on the next app launch. Unknown OSC sequences are dropped
silently by the emulator, so the extra marker is invisible in output.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6573d2dd-8be2-4753-91da-869521111a24

📥 Commits

Reviewing files that changed from the base of the PR and between 2d28085 and c15eb8e.

📒 Files selected for processing (1)
  • apps/desktop/src/main/terminal-host/session-shell-ready.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/desktop/src/main/terminal-host/session-shell-ready.test.ts

📝 Walkthrough
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes both main changes: emitting a legacy marker alongside the new one, and gating reattach on streamReady state.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering both bugs, root causes, solutions, testing approach, and follow-ups.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch prod-terminal-delay-investigation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 12, 2026

Greptile Summary

This PR fixes a ~15s stall experienced by prod users after an app update by emitting both legacy OSC 777;superset-shell-ready and the current OSC 133;A markers from every shell wrapper (zsh .zlogin, bash PROMPT_COMMAND, fish fish_prompt). The root cause was a process-boundary split: the main process rewrites wrapper files on every launch (so new wrappers immediately emit only 133;A), but the daemon is long-lived and only restarts on a protocol bump — meaning an old v1 daemon that only scans for OSC 777 never matches, leaving shellReadyState at pending until the 15 s timeout.

The complementary change reverts #3387 (daemon-side OSC 777 fallback + 5 s timeout shortening), replacing the daemon-side workaround with a wrapper-side fix that reaches users on the next app launch rather than the next daemon restart.

Key changes:

  • All three shell wrappers now printf both markers in sequence: \\033]777;superset-shell-ready\\007\\033]133;A\\007
  • session.ts removes the legacyScanState / scanForLegacyShellReady code (the companion of the reverted PR) and restores SHELL_READY_TIMEOUT_MS to 15 s
  • Test suite adds wrapper-level assertions that both markers are present, and removes the daemon-side legacy scanner tests that no longer apply

Confidence Score: 4/5

Safe to merge; the wrapper-side fix is the correct long-term solution and reaches users on the next app launch without a daemon restart

The approach is well-reasoned and correctly targets the root cause. The one non-blocking gap is that after the legacy scanner was removed, OSC 777 now passes through to the emulator unstripped during the pending phase — the PR's assertion that xterm.js silently discards it is accurate in practice, but no session-level test covers this combined-marker path. 25/25 wrapper tests pass. Score is 4 rather than 5 only because of this small coverage gap and the implicit reliance on emulator behavior for the pass-through.

apps/desktop/src/main/terminal-host/session-shell-ready.test.ts — the removed strips-legacy-marker test was the only coverage for the 777 pass-through path; a replacement test for the dual-marker scenario would close the gap

Important Files Changed

Filename Overview
apps/desktop/src/main/terminal-host/session.ts Removes legacyScanState/scanForLegacyShellReady (reverts #3387 daemon-side fallback), restores 15 s timeout; OSC 777 now passes through unstripped to the emulator during the pending phase
apps/desktop/src/main/lib/agent-setup/shell-wrappers.ts All three shells (zsh, bash, fish) now emit both OSC 777 and 133;A markers; the change is consistent and well-commented
apps/desktop/src/main/lib/agent-setup/shell-wrappers.test.ts New assertions verify both markers are present in zsh, bash, and fish wrappers; coverage is thorough at the wrapper level
apps/desktop/src/main/terminal-host/session-shell-ready.test.ts Removes legacy OSC 777 session-level tests (no longer applicable); leaves a gap: no session-level test for the new dual-marker sequence arriving together

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[App Launch] --> B[Main process rewrites shell wrappers]
    B --> C["Wrapper emits OSC 777 then OSC 133;A on every prompt"]
    C --> D{Daemon version?}
    D -->|Old v1 daemon stale| E["Scans for OSC 777 - matches immediately"]
    D -->|Current daemon| F["scanForShellReady finds OSC 133;A - matches immediately"]
    E --> G["shellReadyState = ready under 100ms"]
    F --> G
    E --> H["OSC 133;A passes through to emulator - silently dropped by xterm.js"]
    F --> I["OSC 777 passes through to emulator - silently dropped by xterm.js"]
Loading

Reviews (1): Last reviewed commit: "fix(desktop): emit legacy OSC 777 marker..." | Re-trigger Greptile

Comment on lines 270 to 272
});
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing session-level test for the dual-marker scenario

The three removed tests covered the old daemon-side scanner; the replacement wrapper-level tests (in shell-wrappers.test.ts) verify that both markers are emitted, but there is no session-level test that verifies the new combined path end-to-end.

Concretely, when the session receives \x1b]777;superset-shell-ready\x07\x1b]133;A\x07:

  1. scanForShellReady strips \x1b]133;A\x07 and sets shellReadyState = "ready".
  2. \x1b]777;superset-shell-ready\x07 is not stripped — it is enqueued via enqueueEmulatorWrite and broadcast to clients. The PR's "silently dropped by the emulator" claim is correct for xterm.js, but nothing in the test suite currently verifies this pass-through is benign.

Consider adding a test analogous to the removed "strips legacy marker from emitted output" case, but for the new combined sequence:

it("resolves readiness when both markers arrive together, passing 777 through", () => {
    const { session, proc } = createTestSession("/bin/zsh");
    spawnAndReady(session, proc);

    session.write("buffered\n");
    expect(getWrittenData(proc)).toEqual([]);

    // New wrappers emit both markers; 133;A triggers readiness
    const BOTH = "\x1b]777;superset-shell-ready\x07\x1b]133;A\x07";
    sendData(proc, `direnv output...${BOTH}prompt$ `);

    expect(getWrittenData(proc)).toEqual(["buffered\n"]);
});

This also catches any future regression where the order of the two markers is accidentally swapped (777 would no longer be in output before 133;A is consumed).

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 4 files

…ache existence

When many tabs are created rapidly, React can remount a pane before its
initial createOrAttach finishes. The unmount cancels the in-flight attach
but intentionally preserves the v1 terminal cache entry so visible-tab
switches reuse the xterm instance. On remount, the gate at line 237 saw
a cache entry and took the reattach fast path — skipping createOrAttach
and stream subscription setup — which left the pane permanently
disconnected: no daemon session, no subscription, writes routed to a
session that doesn't exist.

Gate on the cache entry's streamReady flag instead. A cache entry with
streamReady=false means the prior mount unmounted mid-attach, so the
remount should take the normal attach path and re-issue createOrAttach
with a fresh requestId. streamReady is only ever set by a successful
attach (v1-terminal-cache.ts:setStreamReady) and never reset until
dispose, so the predicate cleanly distinguishes "cache ready from a
prior successful attach" from "cache exists but attach never completed".

v1-only — v2 uses in-process host-service with a different lifecycle.
@Kitenite Kitenite changed the title fix(desktop): emit both OSC 777 and 133;A markers so v1 daemons match without restart fix(desktop): v1 terminal — emit legacy marker + gate reattach on streamReady Apr 12, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/desktop/src/main/terminal-host/session.ts`:
- Around line 364-369: The session path currently only calls
scanForShellReady(this.scanState, data) which strips OSC 133;A but leaves OSC
777 sequences in result.output, so enhance the handling in session.ts where
shellReadyState === "pending" (around scanForShellReady, scanState,
resolveShellReady) to also remove/strip OSC 777 sequences from the streamed data
before enqueuing/broadcasting TerminalDataEvent; either update scanForShellReady
to drop OSC 777 while still matching 133;A, or run a lightweight streaming
777-stripper on result.output immediately after scanForShellReady returns so no
OSC-777-shaped bytes reach downstream consumers or transcripts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 724b3717-2baa-4a62-928d-0fefc755cce0

📥 Commits

Reviewing files that changed from the base of the PR and between ef67d8a and e95208b.

📒 Files selected for processing (4)
  • apps/desktop/src/main/lib/agent-setup/shell-wrappers.test.ts
  • apps/desktop/src/main/lib/agent-setup/shell-wrappers.ts
  • apps/desktop/src/main/terminal-host/session-shell-ready.test.ts
  • apps/desktop/src/main/terminal-host/session.ts
💤 Files with no reviewable changes (1)
  • apps/desktop/src/main/terminal-host/session-shell-ready.test.ts

Comment thread apps/desktop/src/main/terminal-host/session.ts
…er output

Addresses review feedback on #3389. The three removed legacy-scanner
tests weren't replaced by a session-level test for the new wrapper
output format (both markers printf'd back-to-back). Adds one.

Guards against wrapper regressions: dropping 133;A (breaks current
scanner), or any future change to the combined marker sequence that
would prevent readiness detection.
@Kitenite Kitenite merged commit 1692806 into main Apr 12, 2026
7 checks passed
@Kitenite Kitenite deleted the prod-terminal-delay-investigation branch April 12, 2026 23:10
@github-actions
Copy link
Copy Markdown
Contributor

🧹 Preview Cleanup Complete

The following preview resources have been cleaned up:

  • ⚠️ Neon database branch
  • ⚠️ Electric Fly.io app

Thank you for your contribution! 🎉

MocA-Love pushed a commit to MocA-Love/superset that referenced this pull request Apr 13, 2026
…eamReady (superset-sh#3389)

* Revert "fix(desktop): accept legacy OSC 777 marker in v1 terminal (superset-sh#3387)"

This reverts commit ef67d8a.

* fix(desktop): emit legacy OSC 777 marker alongside 133;A in shell wrappers

The v1 terminal-host daemon is detached and survives app updates, only
restarting on PROTOCOL_VERSION bumps. Shell wrappers, by contrast, are
rewritten by setupAgentHooks() on every main-process start. After the
133;A marker refactor, prod users end up with fresh wrappers emitting
133;A while an old v1 daemon still scans only for OSC 777 — readiness
never resolves and stdin stays buffered until the timeout.

Emit both markers from zsh/bash/fish wrappers. Any daemon version finds
its expected marker without needing a restart, so the fix reaches users
immediately on the next app launch. Unknown OSC sequences are dropped
silently by the emulator, so the extra marker is invisible in output.

* fix(desktop): v1 terminal reattach gate must check streamReady, not cache existence

When many tabs are created rapidly, React can remount a pane before its
initial createOrAttach finishes. The unmount cancels the in-flight attach
but intentionally preserves the v1 terminal cache entry so visible-tab
switches reuse the xterm instance. On remount, the gate at line 237 saw
a cache entry and took the reattach fast path — skipping createOrAttach
and stream subscription setup — which left the pane permanently
disconnected: no daemon session, no subscription, writes routed to a
session that doesn't exist.

Gate on the cache entry's streamReady flag instead. A cache entry with
streamReady=false means the prior mount unmounted mid-attach, so the
remount should take the normal attach path and re-issue createOrAttach
with a fresh requestId. streamReady is only ever set by a successful
attach (v1-terminal-cache.ts:setStreamReady) and never reset until
dispose, so the predicate cleanly distinguishes "cache ready from a
prior successful attach" from "cache exists but attach never completed".

v1-only — v2 uses in-process host-service with a different lifecycle.

* test(desktop): session resolves readiness on combined 777+133;A wrapper output

Addresses review feedback on superset-sh#3389. The three removed legacy-scanner
tests weren't replaced by a session-level test for the new wrapper
output format (both markers printf'd back-to-back). Adds one.

Guards against wrapper regressions: dropping 133;A (breaks current
scanner), or any future change to the combined marker sequence that
would prevent readiness detection.

* Lint
MocA-Love added a commit to MocA-Love/superset that referenced this pull request Apr 13, 2026
- Fix OSC marker mismatch: update useTerminalModes to detect OSC 133;A
  (FinalTerm standard) in addition to legacy OSC 777, matching the
  marker change in shell-wrappers.ts (superset-sh#3389)
- Restore isAlternateScreenRef guard on commandBufferRef accumulation
  to prevent suggestion buffer pollution during vim/less usage
- Add shouldSkipSnapshot guard for new sessions in useTerminalRestore
  to prevent potential prompt double-display
- Move NORMALIZE_GITHUB_QUERY_PLAN.md from src/ to plans/ per repo rules
MocA-Love pushed a commit to MocA-Love/superset that referenced this pull request Apr 14, 2026
…ore cached

Root cause (from superset-sh#3348):

createOrAttach onSuccess eagerly called v1TerminalCache.startStream /
setStreamReady / markTerminalSessionReady before inspecting result.isColdRestore.
Cold-restore responses come back without a real backend session, yet the cache
was left in streamReady=true state. On the next mount (e.g. after a tab switch
unmount from PersistentTabRenderer — terminal panes are not persistent views),
useTerminalLifecycle's isReattach fast-path saw streamReady=true + live
subscription and skipped createOrAttach entirely. From that moment on every
keystroke was forwarded to electronTrpcClient.terminal.write with no backend
session, the main process threw, and writeRef (no onError callback) silently
swallowed the failure — the user just saw a terminal that displayed output
but refused to accept input.

Fix:

1. useTerminalLifecycle.ts: defer startStream/setStreamReady/
   markTerminalSessionReady until after the cold-restore branches return, so
   only real backend sessions flip the cache into streamReady=true.
2. useTerminalLifecycle.ts: when coldRestoreState.has(paneId) is true, force
   isReattach=false so remounts re-enter the full attach path instead of
   trusting stale streamReady. Belt-and-suspenders against the module-level
   coldRestoreState map surviving unmount.
3. useTerminalColdRestore.ts: in handleStartShell's onSuccess (which finally
   spawns a real shell for the restored pane), mark the cache/readiness
   waiters as ready. Reject readiness waiters on failure so preset paths do
   not hang forever.

FORK NOTE: this is an upstream regression in superset-sh#3348 that superset-sh#3389 only narrowed.
Revisit when upstream publishes its own fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant