fix(host-service): catch tunnel connect errors to prevent crash on DNS failure #3861
Greptile Summary
Wraps the body of …
Confidence Score: 5/5
Safe to merge: the change is a targeted, low-risk crash fix with no logic regressions. Single-file change with a clear, well-bounded fix. All error paths now funnel through the existing `scheduleReconnect()`, which has a guard against double-scheduling. No new state is introduced. The only finding is a P2 logging suggestion. No files require special attention.
| Filename | Overview |
|---|---|
| packages/host-service/src/tunnel/tunnel-client.ts | Wraps connect() body in try/catch to route DNS/fetch errors through scheduleReconnect() instead of crashing the process; logic is sound and the guard in scheduleReconnect() prevents double-scheduling. |
Sequence Diagram
sequenceDiagram
participant SR as scheduleReconnect()
participant C as connect()
participant GA as getAuthToken()
participant WS as WebSocket
SR->>C: void this.connect() [after delay]
Note over C: try {
C->>GA: await getAuthToken()
alt DNS failure / network error
GA-->>C: throws ENOTFOUND
Note over C: } catch (error) {
C->>C: console.error("connect failed: ...")
C->>C: this.socket = null
C->>SR: scheduleReconnect()
else token is null
GA-->>C: returns null
C->>C: console.warn("no auth token")
C->>SR: scheduleReconnect()
else success
GA-->>C: returns token
C->>WS: new WebSocket(url)
WS-->>C: socket connected (onopen)
C->>C: reconnectAttempts = 0
alt socket error / close
WS-->>C: onclose fired
C->>SR: scheduleReconnect()
end
end
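The flow in the diagram can be sketched in TypeScript as follows. This is a minimal illustration, not the code from `tunnel-client.ts`: the class name `ReconnectingClient`, the `open` callback (standing in for `new WebSocket(url)`), and the exponential backoff parameters are all assumptions made for the example. Only the shape (try/catch around the awaited token fetch, a guarded `scheduleReconnect()`, and a `void this.connect()` call from the timer) follows the review above.

```typescript
// Hypothetical stand-in for the tunnel client; only the control flow mirrors the PR.
type TokenProvider = () => Promise<string | null>;
type Opener = (token: string) => void; // stands in for `new WebSocket(url)`

class ReconnectingClient {
  reconnectAttempts = 0;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private closed = false;
  private getAuthToken: TokenProvider;
  private open: Opener;
  private baseDelayMs: number;
  private maxDelayMs: number;

  constructor(getAuthToken: TokenProvider, open: Opener, baseDelayMs = 1000, maxDelayMs = 30000) {
    this.getAuthToken = getAuthToken;
    this.open = open;
    this.baseDelayMs = baseDelayMs; // assumed backoff settings, not from the PR
    this.maxDelayMs = maxDelayMs;
  }

  async connect(): Promise<void> {
    if (this.closed) return;
    try {
      const token = await this.getAuthToken(); // may reject, e.g. on a DNS failure
      if (!token) {
        this.scheduleReconnect();
        return;
      }
      this.open(token);
      this.reconnectAttempts = 0; // simplified: the diagram resets this in onopen
    } catch (error) {
      // Any throw is turned into a retry, so connect() itself never rejects.
      const message = error instanceof Error ? error.message : String(error);
      console.error(`connect failed: ${message}`);
      this.scheduleReconnect();
    }
  }

  scheduleReconnect(): void {
    // Guard against double-scheduling: at most one pending timer.
    if (this.closed || this.timer !== null) return;
    const delay = Math.min(this.baseDelayMs * 2 ** this.reconnectAttempts, this.maxDelayMs);
    this.reconnectAttempts += 1;
    this.timer = setTimeout(() => {
      this.timer = null;
      void this.connect(); // floating promise: safe only because connect() catches everything
    }, delay);
  }

  close(): void {
    this.closed = true;
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = null;
  }
}
```

Because the timer callback discards the promise with `void`, the catch block inside `connect()` is the only place a rejection from the token fetch can be handled in this sketch.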
Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/host-service/src/tunnel/tunnel-client.ts
Line: 88-92
Comment:
**Log the full error for better diagnostics**
Only `error.message` is logged here, which strips the `[cause]` chain that carries the root cause (e.g. `getaddrinfo ENOTFOUND api.superset.sh`). Passing the original error object as a second argument to `console.error` preserves the full stack and cause chain in the log file.
```suggestion
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
console.error(`[host-service:tunnel] connect failed: ${message}`, error);
this.socket = null;
this.scheduleReconnect();
}
```
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "fix(host-service): catch tunnel connect ..."
| } catch (error) { | ||
| const message = error instanceof Error ? error.message : String(error); | ||
| console.error(`[host-service:tunnel] connect failed: ${message}`); | ||
| this.socket = null; | ||
| this.scheduleReconnect(); |
Log the full error for better diagnostics
Only `error.message` is logged here, which strips the `[cause]` chain that carries the root cause (e.g. `getaddrinfo ENOTFOUND api.superset.sh`). Passing the original error object as a second argument to `console.error` preserves the full stack and cause chain in the log file.
| } catch (error) { | |
| const message = error instanceof Error ? error.message : String(error); | |
| console.error(`[host-service:tunnel] connect failed: ${message}`); | |
| this.socket = null; | |
| this.scheduleReconnect(); | |
| } catch (error) { | |
| const message = error instanceof Error ? error.message : String(error); | |
| console.error(`[host-service:tunnel] connect failed: ${message}`, error); | |
| this.socket = null; | |
| this.scheduleReconnect(); | |
| } |
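To illustrate what the suggestion preserves, here is a small sketch that builds an error with a nested `cause` and compares the top-level message with the full chain. The helpers `makeFetchFailure` and `causeChain` are hypothetical names introduced for this example, and the simulated error only imitates the shape described in the comment (an outer failure whose `cause` holds the `ENOTFOUND` detail); it is not output captured from the real service.

```typescript
// Simulated failure: an outer "fetch failed" error whose `cause` carries the DNS detail.
// Object.assign is used here (rather than the `{ cause }` constructor option) only so
// the sketch also type-checks against older TypeScript lib targets.
function makeFetchFailure(): Error {
  const root = new Error("getaddrinfo ENOTFOUND api.superset.sh");
  return Object.assign(new TypeError("fetch failed"), { cause: root });
}

// Hypothetical helper: flattens an error's cause chain into a list of messages.
function causeChain(error: unknown): string[] {
  const messages: string[] = [];
  const seen = new Set<unknown>(); // protects against cyclic cause references
  let current: unknown = error;
  while (current instanceof Error && !seen.has(current)) {
    seen.add(current);
    messages.push(current.message);
    current = (current as { cause?: unknown }).cause;
  }
  return messages;
}

try {
  throw makeFetchFailure();
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  // The message alone only says "fetch failed"; passing `error` as a second
  // argument lets the logger print the stack and the nested cause as well.
  console.error(`[host-service:tunnel] connect failed: ${message}`, error);
}
```

In this sketch `causeChain(makeFetchFailure())` yields both messages, while `error.message` alone carries only the outer one, which is the gap the review comment points at.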
Caution: Review failed. The pull request is closed.
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
📝 Walkthrough
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 5 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/host-service/src/tunnel/tunnel-client.ts`:
- Around line 50-64: After awaiting getAuthToken(), re-check the instance closed
flag so you don't open a socket after close() runs: inside the same method (the
block that calls getAuthToken()), add an immediate guard like "if (this.closed)
return;" after the await (before constructing the URL / new WebSocket) and
ensure you do not call scheduleReconnect or create this.socket when closed; this
prevents creating a live WebSocket on a client that was shut down (references:
getAuthToken, close, scheduleReconnect, this.closed, this.socket).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: da4a21aa-f6b3-4bec-82dc-612c0d2945fe
📒 Files selected for processing (1)
packages/host-service/src/tunnel/tunnel-client.ts
| try { | ||
| const token = await this.getAuthToken(); | ||
| if (!token) { | ||
| console.warn("[host-service:tunnel] no auth token available, retrying"); | ||
| this.scheduleReconnect(); | ||
| return; | ||
| } | ||
| }; | ||
|
|
||
| socket.onerror = (event) => { | ||
| console.error("[host-service:tunnel] socket error:", event); | ||
| }; | ||
| const url = new URL("/tunnel", this.relayUrl); | ||
| url.protocol = url.protocol === "https:" ? "wss:" : "ws:"; | ||
| url.searchParams.set("hostId", this.hostId); | ||
| url.searchParams.set("token", token); | ||
|
|
||
| const socket = new WebSocket(url.toString()); | ||
| this.socket = socket; |
Re-check `this.closed` after the awaited token fetch.
`close()` can run while `getAuthToken()` is in flight. In that case this method still creates a new relay socket after shutdown, because the only `closed` guard is before Line 51. That leaves a live connection behind a client that was already closed.
Suggested fix
try {
const token = await this.getAuthToken();
+ if (this.closed) {
+ return;
+ }
+
if (!token) {
console.warn("[host-service:tunnel] no auth token available, retrying");
this.scheduleReconnect();
return;
}
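The race described in this comment can be reproduced with a small stand-alone sketch. `ClosableClient` is a hypothetical class written for illustration, and the `socketsOpened` counter stands in for constructing a real `WebSocket`; the only part taken from the review is the second `closed` check placed directly after the awaited token fetch.

```typescript
// Hypothetical minimal client used only to show the race; not the real TunnelClient.
class ClosableClient {
  socketsOpened = 0; // stands in for live WebSocket instances
  private closed = false;
  private getAuthToken: () => Promise<string | null>;

  constructor(getAuthToken: () => Promise<string | null>) {
    this.getAuthToken = getAuthToken;
  }

  async connect(): Promise<void> {
    if (this.closed) return; // guard before the await
    const token = await this.getAuthToken();
    // close() may have run while the token fetch was in flight, so check again.
    if (this.closed) return;
    if (!token) return; // the real client would schedule a retry here
    this.socketsOpened += 1; // stands in for `new WebSocket(url)`
  }

  close(): void {
    this.closed = true;
  }
}
```

Without the second check, a `close()` call that lands between the start of `connect()` and the token arriving would still reach the socket creation step in this sketch.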
…S failure
When the laptop wakes from sleep and DNS hasn't recovered, the tunnel's auto-reconnect path calls getAuthToken(), which fetches api.superset.sh. The fetch throws ENOTFOUND, the rejection escapes via `void this.connect()`, and the host-service process crashes, orphaning every PTY. The coordinator respawns on a new port, but the renderer keeps reconnecting to the dead port until it gives up.
Wrap connect()'s body in try/catch and route any throw back through scheduleReconnect, so transient network failures behave the same as WebSocket socket-level errors.
9443b87 to 4b16dab
…S failure (superset-sh#3861)
Summary
- … `JwtApiAuthProvider.getJwt()`, which fetches `api.superset.sh`. The fetch throws `ENOTFOUND`, the rejection escapes via `void this.connect()` (`scheduleReconnect`'s `setTimeout`), and the host-service process crashes, orphaning every PTY.
- … `TunnelClient.connect()`'s body in try/catch. Any throw (DNS failure, WebSocket constructor error, etc.) now routes through the same `scheduleReconnect()` path that handles normal socket-level errors.
Evidence
Reproduced in `~/.superset/host/<org>/host-service.log`:
Followed immediately by a fresh `[host-service:db] Initialized` line: the coordinator-driven respawn that breaks the renderer's terminal sockets. No `[host-service] unhandledRejection — staying up` log was emitted, so the safety net in `safety.ts` didn't catch it (likely Node 24 default `--unhandled-rejections=throw` behavior on this path).
Test plan
- … `sudo ifconfig en0 down` or unplug Ethernet) while a terminal is open
- `pgrep -fl host-service.js` shows the same pid before and after
- `host-service.log` shows `[host-service:tunnel] connect failed: ...` followed by `reconnecting in Nms (attempt N)`, with no Node crash footer
- … `connected to relay for host ...`) and existing terminal WebSockets keep working
- … `@superset/host-service` (verified locally)
Summary by cubic
Prevent `@superset/host-service` from crashing on DNS/network failures during tunnel reconnect by catching connect errors and routing them through the backoff, so terminals survive sleep.
- … `TunnelClient.connect()` in try/catch; on any error (auth token fetch, WebSocket init), log `[host-service:tunnel] connect failed: ...`, clear the socket, and call `scheduleReconnect()`.
Written for commit 4b16dab. Summary will update on new commits.
Summary by CodeRabbit