[codex] Improve terminal connection diagnostics#3801

Merged
Kitenite merged 1 commit into main from debug-terminal-connection on Apr 27, 2026

Conversation

@Kitenite
Collaborator

@Kitenite Kitenite commented Apr 27, 2026

Summary

  • surface terminal.ensureSession failures directly in the v2 terminal
  • include sanitized terminal WebSocket endpoint and close code/reason in visible terminal connection errors
  • make host-service "session not found" terminal attach error explain missing session and workspaceId context

Root Cause

The terminal startup path could collapse distinct failures into generic messages like [terminal] websocket error or Session not found, hiding whether the failure came from ensureSession, host-service session lookup, or the WebSocket transport.

Impact

No terminal startup/control-flow behavior changed. This is diagnostics-only; v2 still waits for ensureSession before connecting.

Validation

  • bun run lint:fix
  • bun run test
  • bun run --cwd packages/host-service typecheck
  • bun run --cwd apps/desktop typecheck

Notes

  • Raw root bun test fails because it does not load the desktop package Bun preload setup (apps/desktop/bunfig.toml); the repo test script bun run test passes.

Summary by cubic

Improves terminal connection diagnostics in the v2 terminal. Errors now identify the failing step and show a sanitized WebSocket endpoint with close details; runtime behavior is unchanged.

  • New Features
    • Surface terminal.ensureSession failures in-terminal, including session creation errors and request failures.
    • On WebSocket error or abnormal close, show sanitized endpoint and close code/reason, and whether reconnecting or max retries reached.
    • In host-service, replace generic “Session not found” with a terminalId-aware message suggesting terminal.ensureSession or workspaceId, and use a clearer close reason.

Written for commit a290a00. Summary will update on new commits. Review in cubic

Summary by CodeRabbit

  • Bug Fixes
    • Terminal now displays clear failure messages when session creation fails or is rejected.
    • WebSocket error logs include the target endpoint and actionable guidance instead of generic text.
    • Close events produce detailed terminal logs with close code, optional reason, and reconnection outcome messages.
    • Server-side connection errors now return a more specific "Terminal session not found" error with instructions to retry with proper session info.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 27, 2026

📝 Walkthrough

Adds detailed WebSocket endpoint and close-event logging in the renderer transport, surfaces terminal session creation failures in the UI, and returns a more specific error message from the host-service when a requested terminal session is missing.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Terminal WebSocket Transport Logging<br>apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts | Adds helpers to format WebSocket endpoints and close-event details. Enhances "close" handler to accept a CloseEvent, set connection state to "closed", null the socket, and write a detailed terminal log (endpoint, code, optional reason) unless the transport was exited or code is 1000. Updates "error" handler to log formatted endpoint and guidance instead of a generic message; reconnection scheduling remains. |
| Terminal Pane Session Error Reporting<br>apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/.../TerminalPane/TerminalPane.tsx | Changes ensureSession handling to write visible failure lines into the terminal UI when session creation is not "active" or the mutation rejects. Includes result.error when present or a fallback unknown-reason string; preserves existing console.error logging and early-exits on success or when cancelled. |
| Terminal Service Error Messages<br>packages/host-service/src/terminal/terminal.ts | When a WebSocket connects for /terminal/:terminalId but no session exists and v1 fallback lacks workspaceId, sends a type: "error" message including the missing terminalId and guidance to use terminal.ensureSession or provide workspaceId. Changes close reason string to "Terminal session not found" while keeping the same close code and control flow. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐇 I hopped to the socket, ears all a-flutter,
Found codes and reasons once lost in the clutter.
Now endpoints sing and failures show bright,
A rabbit's small hop brought logs to the light. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main objective: improving terminal connection diagnostics across all three modified files. |
| Description check | ✅ Passed | The description includes a clear summary of changes, root cause analysis, impact assessment, and comprehensive validation steps, covering all critical template sections. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@greptile-apps

greptile-apps Bot commented Apr 27, 2026

Greptile Summary

This is a diagnostics-only PR that surfaces previously hidden terminal connection failures as in-terminal messages — no startup or control-flow behavior changes. It adds formatWsEndpoint (query-param-stripped URL) and formatCloseDetails to WebSocket close/error events, writes ensureSession failures directly into the terminal widget, and replaces the host-service "Session not found" error with a message that names the terminal ID and explains the missing workspaceId context.
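A minimal sketch of what the two helpers named above might look like. Only the names `formatWsEndpoint` and `formatCloseDetails` come from this PR; the bodies, the `CloseInfo` interface, the fallback strings, and the output format are assumptions made for illustration, not the merged implementation.

```typescript
// Hypothetical re-creation of the helpers; bodies and output format are assumptions.

/** Minimal shape of the close-event fields used here (mirrors the DOM CloseEvent). */
interface CloseInfo {
  code: number;
  reason: string;
}

/** Keep scheme, host, and path; drop query string, fragment, and userinfo. */
function formatWsEndpoint(url: string | null): string {
  if (!url) return "unknown endpoint"; // assumed placeholder text
  try {
    const parsed = new URL(url);
    return `${parsed.protocol}//${parsed.host}${parsed.pathname}`;
  } catch {
    // Not a parseable absolute URL: cut at the first "?" or "#" instead.
    return url.split(/[?#]/)[0];
  }
}

/** Render the close code and, when the peer supplied one, the close reason. */
function formatCloseDetails(event: CloseInfo): string {
  return event.reason
    ? `code ${event.code}, reason: ${event.reason}`
    : `code ${event.code}`;
}
```

Stripping the query string before display matters because a WebSocket URL may carry tokens or identifiers as query parameters, and terminal output is easy to copy into bug reports.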

Confidence Score: 5/5

Safe to merge — all changes are diagnostic messages with no control-flow impact.

The only finding is P2: the "Reconnecting…" suffix in the close-event message can be inaccurate at max reconnect attempts, but it does not affect any behavior. All three files are well-guarded (optional chaining, cancelled flag, query-param stripping) and the PR's stated validation (lint, tests, typecheck) passes.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts | Adds formatWsEndpoint (strips query params for safe display) and formatCloseDetails; improves close/error event messages. Minor UX issue: "Reconnecting…" is shown even when max reconnect attempts are already reached. |
| apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/hooks/usePaneRegistry/components/TerminalPane/TerminalPane.tsx | Adds explicit return after active-session invalidation and surfaces ensureSession failures (non-active result or thrown error) directly in the terminal widget. Optional chaining guards handle the unmounted-terminal case correctly. |
| packages/host-service/src/terminal/terminal.ts | Replaces the terse "Session not found" error with a detailed message that names the terminal ID and explains the missing workspaceId context. Close reason updated to "Terminal session not found" for consistency. |

Sequence Diagram

sequenceDiagram
    participant R as TerminalPane (renderer)
    participant T as terminalRuntimeRegistry
    participant H as host-service

    R->>H: tRPC terminal.ensureSession
    alt status === active
        H-->>R: { status: active }
        R->>R: invalidate session list and return
    else status !== active (NEW)
        H-->>R: { status: ..., error?: ... }
        R->>T: writeln([terminal] Failed to create terminal session: ...)
    end
    Note over R: .finally() always runs connect()
    R->>T: connect(terminalId, wsUrl)
    T->>H: WebSocket /terminal/:id
    alt session found
        H-->>T: open + replay
    else session not found, no workspaceId (IMPROVED msg)
        H-->>T: error message with terminalId + workspaceId context
        H-->>T: ws.close(1011, Terminal session not found)
        T->>T: writeln([terminal] WebSocket closed code 1011 reason ...) NEW
    end
    alt WebSocket error
        H-->>T: error event (IMPROVED msg)
        T->>T: writeln([terminal] WebSocket error while connecting to endpoint)
    end
Prompt To Fix All With AI
This is a comment left during a code review.
Path: apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts
Line: 188-192

Comment:
**"Reconnecting…" message misleads when max attempts are exhausted**

When the WebSocket closes unexpectedly, the new message always ends with `"Reconnecting..."`, but `scheduleReconnect` silently returns without scheduling anything once `_reconnectAttempt >= MAX_RECONNECT_ATTEMPTS` (10). The user sees "Reconnecting…" but no further reconnection ever happens, giving a false impression that recovery is in progress.

Consider making the terminal message conditional on the remaining attempt budget:

```suggestion
		if (!transport._exited && event.code !== 1000) {
			const willReconnect =
				transport._reconnectAttempt < MAX_RECONNECT_ATTEMPTS;
			terminal.writeln(
				`\r\n[terminal] WebSocket closed while connected to ${formatWsEndpoint(transport.currentUrl)} (${formatCloseDetails(event)}). ${willReconnect ? "Reconnecting..." : "Max reconnect attempts reached."}`,
			);
		}
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Improve terminal connection diagnostics" | Re-trigger Greptile

@Kitenite Kitenite force-pushed the debug-terminal-connection branch from 361f35e to d478bbc Compare April 27, 2026 17:23
Comment thread apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts
@github-actions
Contributor

github-actions Bot commented Apr 27, 2026

🚀 Preview Deployment

🔗 Preview Links

Service Status Link
Neon Database (Neon) View Branch
Vercel API (Vercel) Open Preview
Vercel Web (Vercel) Open Preview
Vercel Marketing (Vercel) Open Preview
Vercel Admin (Vercel) Open Preview
Vercel Docs (Vercel) Open Preview

Preview updates automatically with new commits

Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts">

<violation number="1" location="apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts:190">
P2: The message unconditionally says "Reconnecting..." but `scheduleReconnect` silently no-ops once `_reconnectAttempt >= MAX_RECONNECT_ATTEMPTS`. After the last allowed attempt, this gives a false impression that recovery is in progress. Make the message conditional on remaining attempts.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
packages/host-service/src/terminal/terminal.ts (1)

597-609: Diagnostic improvements look good; minor suggestion to align with PR description.

The new error message clearly identifies the missing session by terminalId and points at the missing workspaceId fallback, which is a useful improvement over the prior generic "Session not found". That said, the PR description states this path should also "suggest calling terminal.ensureSession or providing workspaceId", but the rendered error only mentions the missing fallback. Consider mentioning ensureSession explicitly so v1 callers see actionable guidance:

Suggested wording tweak
-								message: `Terminal session "${terminalId}" not found; missing workspaceId fallback.`,
+								message: `Terminal session "${terminalId}" not found. Call terminal.ensureSession before attaching, or provide a workspaceId query parameter.`,

Note: WebSocket close reasons are limited to 123 UTF-8 bytes (RFC 6455), so keep the longer wording in the JSON error payload and leave the close() reason short as it currently is.
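To illustrate the split between a long JSON error payload and a short close reason, here is a rough sketch. The close code 1011, the reason string, and the message wording come from this thread; `buildMissingSessionResponse`, `utf8ByteLength`, and the `MissingSessionResponse` shape are hypothetical names introduced only for this example and are not part of host-service.

```typescript
// 125-byte control-frame payload minus the 2-byte status code (RFC 6455).
const MAX_CLOSE_REASON_BYTES = 123;

/** Byte length of a string once encoded as UTF-8. */
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Hypothetical container for what the server would send and then close with.
interface MissingSessionResponse {
  payload: { type: "error"; message: string };
  closeCode: number;
  closeReason: string;
}

/** Long guidance goes in the JSON payload; the close reason stays short. */
function buildMissingSessionResponse(terminalId: string): MissingSessionResponse {
  const closeReason = "Terminal session not found";
  if (utf8ByteLength(closeReason) > MAX_CLOSE_REASON_BYTES) {
    throw new Error("close reason exceeds the RFC 6455 limit");
  }
  return {
    payload: {
      type: "error",
      message: `Terminal session "${terminalId}" not found. Call terminal.ensureSession before attaching, or provide a workspaceId query parameter.`,
    },
    closeCode: 1011,
    closeReason,
  };
}
```

The limit is measured in bytes, not characters, so a reason containing non-ASCII text reaches it sooner than its character count suggests.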

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/host-service/src/terminal/terminal.ts` around lines 597 - 609,
Update the JSON error payload sent when a session is missing to explicitly
suggest calling terminal.ensureSession or providing a workspaceId; specifically
modify the block that checks sessions.get(terminalId) (uses terminalId,
c.req.query, sendMessage, ws.close) so the sendMessage(...) error message
includes guidance like "call terminal.ensureSession or provide workspaceId",
while keeping the ws.close(1011, "Terminal session not found") short per RFC
limits.
apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/hooks/usePaneRegistry/components/TerminalPane/TerminalPane.tsx (1)

153-185: Surfacing ensureSession failures in-terminal LGTM; minor nit on cancellation.

Resolving result.error to either : <reason> or " for an unknown reason", plus a separate prefix for rejections, gives users a much clearer picture than the prior silent-failure path.
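The message selection described here can be sketched as two pure functions. The `[terminal] Failed to create terminal session` wording and the `" for an unknown reason"` fallback are taken from the diff in this review; the `EnsureSessionResult` shape, both function names, and the rejection prefix are assumptions for illustration only.

```typescript
// Assumed result shape: the review only mentions `status` and an optional `error`.
interface EnsureSessionResult {
  status: string;
  error?: string;
}

/** Line to show when ensureSession resolves; null means "active", so nothing is written. */
function describeEnsureSessionResult(result: EnsureSessionResult): string | null {
  if (result.status === "active") return null;
  const details = result.error ? `: ${result.error}` : " for an unknown reason";
  return `\r\n[terminal] Failed to create terminal session${details}`;
}

/** Line to show when the request itself rejects. The prefix wording is hypothetical. */
function describeEnsureSessionRejection(err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  return `\r\n[terminal] terminal.ensureSession request failed: ${message}`;
}
```

Keeping the two prefixes distinct lets a reader tell a session the host refused to create apart from a request that never completed.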

One small consideration: cancelled is only checked in .finally before calling connect, but not in .then/.catch. If the pane unmounts between mutateAsync resolving and the finally running, you'll still call writeln on whatever the registry returns for that terminalId/terminalInstanceId. The ?.writeln chain makes this safe in practice (the registry returns null after detach), so this is a nit rather than a bug — but if you want strict symmetry with the connect guard, gating the writelns on !cancelled would make the intent explicit:

Optional: gate diagnostic writes on `cancelled`
 			.then((result) => {
 				if (result.status === "active") {
 					void invalidateTerminalSessionsRef.current({
 						workspaceId: sessionWorkspaceId,
 					});
 					return;
 				}
+				if (cancelled) return;
 				const details = result.error
 					? `: ${result.error}`
 					: " for an unknown reason";
 				terminalRuntimeRegistry
 					.getTerminal(terminalId, terminalInstanceId)
 					?.writeln(
 						`\r\n[terminal] Failed to create terminal session${details}`,
 					);
 			})
 			.catch((err) => {
 				console.error("[TerminalPane] ensureSession failed:", err);
+				if (cancelled) return;
 				const message = err instanceof Error ? err.message : String(err);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/`$workspaceId/hooks/usePaneRegistry/components/TerminalPane/TerminalPane.tsx
around lines 153 - 185, The current ensureSession handling calls
terminalRuntimeRegistry.getTerminal(...)? .writeln in both the .then and .catch
branches even if the operation was cancelled; add a guard on the cancelled flag
before performing those diagnostic writeln calls (the same way .finally gates
connect) so that in the promise resolution/rejection paths you return early when
cancelled is true, e.g. check !cancelled before computing details and calling
terminalRuntimeRegistry.getTerminal(terminalId, terminalInstanceId)?.writeln in
the .then and before the terminalRuntimeRegistry.getTerminal(...)? .writeln in
the .catch; keep the existing optional chaining to remain safe.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a90e653e-aebd-4765-9d1a-a3b1a3ae2e39

📥 Commits

Reviewing files that changed from the base of the PR and between 2183bcf and d478bbc.

📒 Files selected for processing (3)
  • apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts
  • apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/hooks/usePaneRegistry/components/TerminalPane/TerminalPane.tsx
  • packages/host-service/src/terminal/terminal.ts

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts (2)

188-199: Optional: derive willReconnect from scheduleReconnect instead of duplicating its gate.

willReconnect here re-implements the same predicates scheduleReconnect checks (_reconnectTimer, currentUrl/_terminal set, _reconnectAttempt < MAX_RECONNECT_ATTEMPTS). They're correct today, but if scheduleReconnect's gating ever changes (e.g., a new bail-out condition), the user-facing message will silently drift from the actual behavior. Consider having scheduleReconnect return whether it scheduled, and use that to pick between "Reconnecting..." vs "Max reconnect attempts reached."

♻️ Sketch
-function scheduleReconnect(transport: TerminalTransport) {
-	if (transport._reconnectTimer) return;
-	if (transport._exited) return;
-	if (!transport.currentUrl || !transport._terminal) return;
-	if (transport._reconnectAttempt >= MAX_RECONNECT_ATTEMPTS) return;
+function scheduleReconnect(transport: TerminalTransport): boolean {
+	if (transport._reconnectTimer) return false;
+	if (transport._exited) return false;
+	if (!transport.currentUrl || !transport._terminal) return false;
+	if (transport._reconnectAttempt >= MAX_RECONNECT_ATTEMPTS) return false;
@@
+	return true;
 }
@@
-		if (!transport._exited && event.code !== 1000) {
-			const willReconnect =
-				!transport._reconnectTimer &&
-				Boolean(transport.currentUrl && transport._terminal) &&
-				transport._reconnectAttempt < MAX_RECONNECT_ATTEMPTS;
-			terminal.writeln(
-				`\r\n[terminal] WebSocket closed while connected to ${formatWsEndpoint(transport.currentUrl)} (${formatCloseDetails(event)}). ${willReconnect ? "Reconnecting..." : "Max reconnect attempts reached."}`,
-			);
-		}
-		// Auto-reconnect on unexpected close (host-service restart, network blip)
-		scheduleReconnect(transport);
+		const scheduled = scheduleReconnect(transport);
+		if (!transport._exited && event.code !== 1000) {
+			terminal.writeln(
+				`\r\n[terminal] WebSocket closed while connected to ${formatWsEndpoint(transport.currentUrl)} (${formatCloseDetails(event)}). ${scheduled ? "Reconnecting..." : "Max reconnect attempts reached."}`,
+			);
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts` around lines
188 - 199, The message computes willReconnect locally duplicating
scheduleReconnect's gating; change scheduleReconnect(transport) to return a
boolean indicating whether it scheduled a reconnect, then call const didSchedule
= scheduleReconnect(transport) and use didSchedule to decide the
terminal.writeln text instead of recomputing transport._reconnectTimer /
transport.currentUrl / transport._terminal / transport._reconnectAttempt <
MAX_RECONNECT_ATTEMPTS; update scheduleReconnect's callers accordingly so the
single source of truth for reconnect gating is the scheduleReconnect function
(referenced symbols: scheduleReconnect, willReconnect,
transport._reconnectTimer, transport.currentUrl, transport._terminal,
transport._reconnectAttempt, MAX_RECONNECT_ATTEMPTS, terminal.writeln,
formatWsEndpoint, formatCloseDetails).

201-206: Note: error event has no actionable detail; consider deduping with the close log.

Browser WebSocket error events carry no usable detail, and a failed connection always fires both error and close. With this change, every transport failure now writes two [terminal] WebSocket ... lines (error + close). Not incorrect, but consider suppressing the error line when a close will follow within the same task, to keep the terminal output less noisy. Up to you.
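One way the deduplication could work is sketched below, assuming the error handler only records that a failure happened and the close handler emits the single consolidated line. `CloseLogDeduper`, its method names, and the message wording are hypothetical; the real transport also checks whether it was exited, which this sketch leaves out.

```typescript
/** Remembers an "error" event so the following "close" can report both in one line. */
class CloseLogDeduper {
  private sawSocketError = false;

  /** Call from the "error" handler: record the failure, write nothing yet. */
  onError(): void {
    this.sawSocketError = true;
  }

  /** Call from the "close" handler: returns the one line to write, or null for a clean close. */
  onClose(endpoint: string, code: number, reason: string): string | null {
    const hadError = this.sawSocketError;
    this.sawSocketError = false; // reset so the next connection starts clean
    if (code === 1000 && !hadError) return null;
    const details = reason ? `code ${code}, reason: ${reason}` : `code ${code}`;
    const prefix = hadError ? "WebSocket error and close" : "WebSocket closed";
    return `[terminal] ${prefix} for ${endpoint} (${details})`;
  }
}
```

With this shape, a failed connection that fires error followed by close produces one terminal line instead of two, and the flag never leaks into a later reconnect attempt.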

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts` around lines
201 - 206, The error handler currently always writes a terminal line and causes
a duplicate message because a subsequent close also logs; instead, dedupe by
having the error handler set a transient flag (e.g., transport._sawSocketError)
and NOT call terminal.writeln, then have the existing socket "close" handler
consult that flag and emit a single consolidated terminal.writeln using
formatWsEndpoint(transport.currentUrl); ensure you clear the flag after use so
future connections behave normally (update socket.addEventListener("error",
...), the transport object, and the socket close handler accordingly).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f7ee8110-e8fe-4cfe-9617-fec5c95cce07

📥 Commits

Reviewing files that changed from the base of the PR and between d478bbc and a290a00.

📒 Files selected for processing (3)
  • apps/desktop/src/renderer/lib/terminal/terminal-ws-transport.ts
  • apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/hooks/usePaneRegistry/components/TerminalPane/TerminalPane.tsx
  • packages/host-service/src/terminal/terminal.ts

@Kitenite Kitenite merged commit 2473996 into main Apr 27, 2026
12 checks passed
@Kitenite Kitenite deleted the debug-terminal-connection branch April 27, 2026 18:04