Skip to content

fix(host-service): unfreeze terminals wedged by flow control with no renderer attached#5006

Open
AviPeltz wants to merge 1 commit into
mainfrom
debug-terminal-input-buff
Open

fix(host-service): unfreeze terminals wedged by flow control with no renderer attached#5006
AviPeltz wants to merge 1 commit into
mainfrom
debug-terminal-input-buff

Conversation

@AviPeltz
Copy link
Copy Markdown
Collaborator

@AviPeltz AviPeltz commented May 31, 2026

Summary

Fixes a permanent terminal freeze introduced by the output flow-control work in #4896.

That PR pauses the PTY when the pty-daemon's unacked-byte counter crosses the 100 KB high watermark, resuming only once renderer acks drain it below the 5 KB low watermark. The accounting has a gap:

  • The daemon counts every byte it sends to host-service (Server.ts increments flowControlUnacked per output chunk).
  • The renderer is the only thing that acks, and only for bytes that reach an open socket (after xterm.write's callback fires).

When no renderer socket is open — a reconnect gap after a WS drop, a startup output burst before the renderer attaches, or a session running with no pane currently showing it — output still streams daemon → host-service, where bufferOutput() retains only the last 64 KB (MAX_BUFFER_BYTES, older bytes evicted). Those bytes never reach a renderer, so they're never acked. The daemon's counter climbs past the high watermark and pauses the PTY. On re-attach the renderer can only ack the ≤64 KB that survived eviction, so the counter stays above the low watermark and the PTY never resumes — the terminal is frozen until the session is recreated.

Note: backgrounding a pane does not trigger this — registry.detach() keeps the socket open. It's specifically the windows where there is genuinely no open socket.

Fix

When broadcast finds no open socket, host-service has still consumed the bytes into its bounded replay buffer, so it now credits the daemon's flow-control counter immediately instead of waiting for a renderer ack that will never come:

if (broadcastBytes(session, bytes) === 0) {
    bufferOutput(session, bytes);
    session.pty.ackOutput(bytes.byteLength); // host-service consumed them; daemon would otherwise stay paused
}

This is the only path that left a terminal permanently wedged. The open-socket case (transient back-pressure during a heavy burst you're actively watching) already self-heals via maybeResume() and is unchanged.

Known, harmless side effect

Buffered bytes are acked again when replayed to a reconnecting renderer (which can't distinguish replay from live). The daemon clamps the counter at Math.max(0, …), so the double-ack can't underflow — it only loosens back-pressure by up to ~64 KB for a single re-attach window. Bounded and transient; no OOM risk.

Test plan

  • bun run lint — clean
  • bun run typecheck --filter=@superset/host-service — passes
  • New regression test in terminal.adoption.node-test.ts: ~195 KB produced with no socket attached must not leave the PTY flow-control paused (hangs without the fix, since the daemon pauses at 100 KB and never resumes)

Relates to #4896.


Open in Stage

Summary by cubic

Fixes terminals freezing when output is produced with no renderer socket attached. host-service now acks buffered bytes to the daemon so the PTY resumes instead of staying paused.

  • Bug Fixes
    • When broadcastBytes(...) === 0, buffer the bytes and call session.pty.ackOutput(bytes.byteLength) to prevent unacked-byte buildup with no socket open.
    • Add regression test in packages/host-service/src/terminal/terminal.adoption.node-test.ts that writes ~195 KB with no renderer and verifies the daemon is not left paused.

Written for commit 8b01d2e. Summary will update on new commits.

Review in cubic

Summary by CodeRabbit

  • Bug Fixes

    • Terminal flow-control now properly credits daemon output even when no renderer is connected, preventing session stalls during very high-volume output and ensuring buffered output is replayed when renderers reconnect.
  • Tests

    • Added end-to-end test coverage for terminal adoption with large continuous output to prevent regressions and validate reliable buffering and completion detection.

@stage-review
Copy link
Copy Markdown

stage-review Bot commented May 31, 2026

Ready to review this PR? Stage has broken it down into 2 individual chapters for you:

Title
1 Acknowledge buffered output to prevent PTY freeze
2 Add regression test for background output flow control
Open in Stage

Chapters generated by Stage for commit 8b01d2e on May 31, 2026 2:34am UTC.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 31, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0c2544fa-db0b-4558-b46e-a54f5a26f2b3

📥 Commits

Reviewing files that changed from the base of the PR and between 1c02e08 and 8b01d2e.

📒 Files selected for processing (2)
  • packages/host-service/src/terminal/terminal.adoption.node-test.ts
  • packages/host-service/src/terminal/terminal.ts
✅ Files skipped from review due to trivial changes (1)
  • packages/host-service/src/terminal/terminal.adoption.node-test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/host-service/src/terminal/terminal.ts

📝 Walkthrough

Walkthrough

The host now calls session.pty.ackOutput(bytes.byteLength) when broadcastBytes finds no open WebSocket renderers, while still buffering bytes for replay. An adoption test writes large background PTY output, waits for a completion marker in the session buffer, and verifies flow control is not left paused.

Changes

Flow Control Acknowledgment

Layer / File(s) Summary
Flow-control acknowledgment when renderers offline
packages/host-service/src/terminal/terminal.ts
When broadcastBytes(session, bytes) reports zero open WebSocket renderers, the host buffers the bytes for replay and calls session.pty.ackOutput(bytes.byteLength) to credit the daemon's flow-control counter.
Adoption test for background output handling
packages/host-service/src/terminal/terminal.adoption.node-test.ts
Adds an adoption-regression test that writes a high-volume PTY loop ending with a unique marker, waits up to 10s for that marker in the session buffer, and disposes the session. Introduces sessionBufferText helper to concatenate Uint8Array chunks into a UTF-8 string for assertions.

Sequence Diagram

sequenceDiagram
  participant Daemon
  participant Host
  participant WebSocketRenderer
  participant SessionBuffer
  Daemon->>Host: PTY output bytes
  Host->>WebSocketRenderer: broadcastBytes(session, bytes) (0 sends)
  Host->>SessionBuffer: buffer bytes for replay
  Host->>Daemon: session.pty.ackOutput(bytes.byteLength)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • superset-sh/superset#4896: Both PRs modify the terminal output flow-control/ack mechanism; this change credits daemon output when no renderer acknowledges bytes.

Poem

🐰 I buffered the bytes while the renderers slept,
Counted the credits so daemons adept—
No stuck little streams, no pause in the night,
I hop in and ack so the flow keeps its flight. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the bug fix—unfreezing terminals wedged by flow control with no renderer attached—which directly matches the main changes in the PR.
Description check ✅ Passed The description is comprehensive, covering the problem, root cause, fix, side effects, and test plan. It includes context from related PR #4896 and explains the flow-control accounting gap that caused the freeze.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch debug-terminal-input-buff

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@capy-ai
Copy link
Copy Markdown

capy-ai Bot commented May 31, 2026

Capy auto-review is paused for this organization because the monthly auto-review limit has been reached. Increase the limit or turn it off in billing settings to resume automatic reviews.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 31, 2026

Greptile Summary

This PR fixes a permanent terminal freeze by immediately crediting the daemon's flow-control counter when output is consumed into the host-service replay buffer with no renderer socket attached. The root cause was that bytes flowing daemon → host-service with no open socket were buffered (and partially evicted after 64 KB) but never acked, so the daemon's counter climbed past the 100 KB high watermark and paused the PTY indefinitely.

  • Core fix (terminal.ts): After bufferOutput(), calls session.pty.ackOutput(bytes.byteLength) when broadcastBytes() returns 0. This prevents the daemon's unacked-byte counter from building up when no renderer is attached, covering reconnect gaps, startup bursts, and unviewed sessions.
  • Regression test (terminal.adoption.node-test.ts): Writes ~195 KB with no socket attached and asserts the completion marker eventually lands in the replay buffer, which hangs without the fix once the daemon pauses at the 100 KB high watermark.

Confidence Score: 4/5

Safe to merge; the fix is a single targeted line that prevents a permanent PTY freeze when no renderer is attached, and the only side effect (double-ack on replay) is bounded by the daemon's Math.max(0, …) clamp.

The change is minimal and well-understood: one call to ackOutput added in the no-open-socket branch of onOutput. The known double-ack on replay is explicitly clamped by the daemon and cannot underflow. A new regression test covers the exact failure scenario. No other code paths are touched.

No files require special attention; both changed files have a small, self-contained scope.

Important Files Changed

Filename Overview
packages/host-service/src/terminal/terminal.ts Adds session.pty.ackOutput(bytes.byteLength) in the no-open-socket branch; fix is minimal and logically sound, with a known/bounded double-ack on replay acknowledged in the PR description.
packages/host-service/src/terminal/terminal.adoption.node-test.ts New regression test writes ~195 KB to a socketless session and waits for a marker in the replay buffer; correctly targets the flow-control freeze path and validates the fix end-to-end.

Sequence Diagram

sequenceDiagram
    participant D as Daemon
    participant HS as Host-Service
    participant R as Renderer

    Note over D,R: No renderer socket attached

    loop Output chunks
        D->>HS: onOutput(chunk)
        HS->>HS: broadcastBytes() returns 0
        HS->>HS: bufferOutput(chunk)
        HS->>D: ackOutput(chunk.byteLength) NEW
    end

    Note over D,R: Renderer reconnects
    R->>HS: WebSocket attach
    HS->>R: replayBuffer sends buffered bytes
    R->>HS: output-ack for replay bytes
    HS->>D: ackOutput(replay bytes) double-ack clamped at zero
Loading

Reviews (1): Last reviewed commit: "fix(host-service): credit daemon flow co..." | Re-trigger Greptile

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Re-trigger cubic

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 31, 2026

🚀 Preview Deployment

🔗 Preview Links

Service Status Link
Neon Database (Neon) View Branch
Vercel API (Vercel) Open Preview
Vercel Web (Vercel) Open Preview
Vercel Marketing (Vercel) Open Preview
Vercel Admin (Vercel) Open Preview
Vercel Docs (Vercel) Open Preview

Preview updates automatically with new commits

…ached

Terminal output flow control (#4896) pauses the PTY when the daemon's
unacked-bytes counter crosses the high watermark, and only resumes once
renderer acks drain it below the low watermark. But the daemon counts
every byte it sends to host-service, while the renderer is the only
thing that acks — and only for bytes that reach an open socket.

When no renderer socket is open (reconnect gap, pre-attach startup
burst, or a session running with no pane showing it), output still
streams daemon -> host-service, where it lands in the bounded 64 KB
replay buffer (older bytes evicted). Those bytes are never delivered to
a renderer, so they are never acked. The daemon's counter climbs past
the high watermark and the PTY pauses; on re-attach the renderer can
only ack the <=64 KB that survived eviction, leaving the counter stuck
above the low watermark. The PTY never resumes — the terminal is
permanently frozen until the session is recreated.

Fix: when broadcast finds no open socket, host-service has still
consumed the bytes into its bounded buffer, so credit the daemon's
flow-control counter immediately instead of waiting for a renderer ack
that will never come. This is the only path that left a terminal
permanently wedged.

Note: buffered bytes are also acked again when replayed to a
reconnecting renderer (it can't distinguish replay from live). The
daemon clamps the counter at >= 0, so the double-ack is harmless beyond
briefly loosening back-pressure for one re-attach window.

Adds a regression test: ~195 KB produced with no socket attached must
not leave the PTY flow-control paused.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant