-
Notifications
You must be signed in to change notification settings - Fork 965
harden(host-service): connect-phase deadline + reconnect-guaranteed onclose #4539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,6 +11,7 @@ const RECONNECT_BASE_MS = 1_000; | |
| const RECONNECT_MAX_MS = 30_000; | ||
| const INBOUND_SILENCE_TIMEOUT_MS = 75_000; | ||
| const WATCHDOG_INTERVAL_MS = 10_000; | ||
| const CONNECT_TIMEOUT_MS = 20_000; | ||
|
|
||
| export interface TunnelClientOptions { | ||
| relayUrl: string; | ||
|
|
@@ -59,15 +60,32 @@ export class TunnelClient { | |
| } | ||
| this.connecting = true; | ||
|
|
||
| let timedOut = false; | ||
| const deadline = setTimeout(() => { | ||
| if (this.closed) return; | ||
| timedOut = true; | ||
| console.warn( | ||
| `[host-service:tunnel] connect did not complete within ${CONNECT_TIMEOUT_MS}ms, forcing retry`, | ||
| ); | ||
| try { | ||
| this.socket?.close(4001, "Connect timeout"); | ||
| } catch {} | ||
| this.socket = null; | ||
| this.connecting = false; | ||
| this.scheduleReconnect(); | ||
| }, CONNECT_TIMEOUT_MS); | ||
|
|
||
| // An unhandled rejection here (e.g. DNS failure inside getAuthToken on | ||
| // wake from sleep) crashes host-service and orphans every PTY. | ||
| try { | ||
| const token = await this.getAuthToken(); | ||
| if (this.closed) { | ||
| this.connecting = false; | ||
| if (timedOut || this.closed) { | ||
| clearTimeout(deadline); | ||
| if (this.closed) this.connecting = false; | ||
| return; | ||
| } | ||
|
Comment on lines
+82
to
86
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
When Prompt To Fix With AIThis is a comment left during a code review.
Path: packages/host-service/src/tunnel/tunnel-client.ts
Line: 82-86
Comment:
**Asymmetric `connecting` reset creates an implicit dependency on the deadline side-effect**
When `timedOut` is true but `this.closed` is false, `this.connecting` is not explicitly reset here — the assumption is that the deadline callback already set it to `false`. This differs from the `this.closed` branch directly below, which resets it explicitly. While correct today (the deadline atomically sets `timedOut = true` and then `this.connecting = false` with no yield between them), a future refactor that separates those two operations — or moves toward a state-machine model — could silently re-introduce the wedge. An explicit `this.connecting = false` for the `timedOut` branch would make the invariant self-contained and consistent.
How can I resolve this? If you propose a fix, please make it concise. |
||
| if (!token) { | ||
| clearTimeout(deadline); | ||
| console.warn("[host-service:tunnel] no auth token available, retrying"); | ||
| this.connecting = false; | ||
| this.scheduleReconnect(); | ||
|
|
@@ -84,6 +102,7 @@ export class TunnelClient { | |
| this.lastInboundAt = Date.now(); | ||
|
|
||
| socket.onopen = () => { | ||
| clearTimeout(deadline); | ||
| this.reconnectAttempts = 0; | ||
| this.connecting = false; | ||
| this.lastInboundAt = Date.now(); | ||
|
|
@@ -99,28 +118,34 @@ export class TunnelClient { | |
| }; | ||
|
|
||
| socket.onclose = (event) => { | ||
| const wasOurSocket = this.socket === socket; | ||
| if (!wasOurSocket) return; | ||
|
|
||
| this.socket = null; | ||
| this.connecting = false; | ||
| this.stopWatchdog(); | ||
| this.cleanupChannels(); | ||
|
|
||
| if (this.closed) return; | ||
|
|
||
| if (event.code === 1008) { | ||
| if (this.socket !== socket) return; | ||
| clearTimeout(deadline); | ||
| try { | ||
| this.socket = null; | ||
| this.connecting = false; | ||
| this.stopWatchdog(); | ||
| this.cleanupChannels(); | ||
| if (event.code === 1008) { | ||
| console.warn( | ||
| `[host-service:tunnel] relay rejected connection (code=${event.code}, reason=${event.reason ?? ""}); retrying`, | ||
| ); | ||
| } | ||
| } catch (err) { | ||
| console.warn( | ||
| `[host-service:tunnel] relay rejected connection (code=${event.code}, reason=${event.reason ?? ""}); retrying`, | ||
| "[host-service:tunnel] error during onclose cleanup", | ||
| err, | ||
| ); | ||
| } finally { | ||
| if (!this.closed) this.scheduleReconnect(); | ||
| } | ||
| this.scheduleReconnect(); | ||
| }; | ||
|
|
||
| socket.onerror = (event) => { | ||
| console.error("[host-service:tunnel] socket error:", event); | ||
| }; | ||
| } catch (error) { | ||
| clearTimeout(deadline); | ||
| if (timedOut) return; | ||
| const message = error instanceof Error ? error.message : String(error); | ||
| console.error(`[host-service:tunnel] connect failed: ${message}`); | ||
| this.socket = null; | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this, soclose()cannot cancel itdeadlineis a local variable, unlikereconnectTimerandwatchdogTimerwhich are stored on the instance and cleared inclose(). Whenclose()is called while aconnect()is mid-flight,onclosefires with the identity check returning early (becauseclose()already nulledthis.socket), soclearTimeout(deadline)is never reached. The timer then lives for up to 20 s before firing and short-circuiting onif (this.closed) return. No incorrect behaviour results, but it is an orphaned async resource that is inconsistent with the rest of the timer-management pattern and could confuse tests that assert no pending timers remain afterclose().Prompt To Fix With AI