
fix(app): reduce spurious SSE reconnects after sleep/resume#17770

Open
BYK wants to merge 1 commit into anomalyco:dev from BYK:fix/sleep-resume-reconnect

BYK commented Mar 16, 2026

Issue for this PR

Closes #17769

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

After a device resumes from sleep (laptop lid reopened, phone unlocked, or a backgrounded browser tab brought back to the foreground), the web UI often shows stale session state because the SSE connection is torn down and rebuilt unnecessarily.

Two changes:

1. Increase heartbeat timeout from 15s to 30s

The server sends heartbeats every 10s. With a 15s client timeout, there was only 5s of margin — under load or network jitter, a heartbeat could arrive late and trigger a premature disconnect. At 30s there is a comfortable 20s buffer while still detecting genuinely dead connections within a reasonable window.
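The margin arithmetic above can be checked with a small sketch. Only the 10s, 15s, and 30s values come from this PR; the constant and function names are hypothetical, and the model assumes the client timer was last reset when the previous heartbeat arrived on schedule.

```typescript
// Values taken from the PR description; the names are illustrative only.
const SERVER_HEARTBEAT_INTERVAL_MS = 10_000
const OLD_HEARTBEAT_TIMEOUT_MS = 15_000
const NEW_HEARTBEAT_TIMEOUT_MS = 30_000

// How late a heartbeat may arrive, past its scheduled time, before the client timeout fires.
function marginMs(timeoutMs: number): number {
  return timeoutMs - SERVER_HEARTBEAT_INTERVAL_MS
}

// True if a heartbeat delayed by `delayMs` would arrive only after the client timer has fired
// (assumes the timer was last reset when the previous heartbeat arrived on schedule).
function tripsTimeout(timeoutMs: number, delayMs: number): boolean {
  return delayMs > marginMs(timeoutMs)
}

console.log(marginMs(OLD_HEARTBEAT_TIMEOUT_MS)) // 5000
console.log(marginMs(NEW_HEARTBEAT_TIMEOUT_MS)) // 20000
console.log(tripsTimeout(OLD_HEARTBEAT_TIMEOUT_MS, 6_000)) // true
console.log(tripsTimeout(NEW_HEARTBEAT_TIMEOUT_MS, 6_000)) // false
```

A heartbeat that is 6s late is enough to drop the connection under the old timeout, but not under the new one.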

2. Reset heartbeat timer on visibility change instead of aborting immediately

Previously, returning to a backgrounded tab after >15s would instantly call attempt.abort() on the SSE stream and trigger a full reconnect cycle (re-bootstrap all directories). This caused unnecessary work and a brief period of stale UI.

Now the visibilitychange handler resets the heartbeat timer instead of aborting. The stream gets one full heartbeat interval to prove it's still alive. If no event arrives within the timeout, the timer fires and aborts — same end result for genuinely dead connections, but no false positives for connections that survived the background period.
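The reset-instead-of-abort behavior can be sketched as below. This is not the code from global-sdk.tsx: `TimerScheduler`, `createHeartbeatWatchdog`, and `handleVisibilityChange` are hypothetical names, and the injectable scheduler exists only so the timing can be exercised without real delays.

```typescript
// Hypothetical abstraction over setTimeout/clearTimeout so the timer can be faked in tests.
interface TimerScheduler {
  set(fn: () => void, ms: number): unknown
  clear(handle: unknown): void
}

const realScheduler: TimerScheduler = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
}

// Calls `onTimeout` unless reset() is called again within `timeoutMs`.
function createHeartbeatWatchdog(
  timeoutMs: number,
  onTimeout: () => void,
  scheduler: TimerScheduler = realScheduler,
) {
  let handle: unknown
  const stop = (): void => {
    if (handle !== undefined) scheduler.clear(handle)
    handle = undefined
  }
  const reset = (): void => {
    stop()
    handle = scheduler.set(() => {
      handle = undefined
      onTimeout()
    }, timeoutMs)
  }
  return { reset, stop }
}

// Behavior described in this PR: when the tab becomes visible again, restart the
// countdown instead of aborting the stream right away.
function handleVisibilityChange(
  visibilityState: "visible" | "hidden",
  watchdog: { reset(): void },
): void {
  if (visibilityState === "visible") watchdog.reset()
}

// Browser wiring (assumed, shown as a comment so the sketch also runs outside a browser):
//   const attempt = new AbortController()
//   const watchdog = createHeartbeatWatchdog(30_000, () => attempt.abort())
//   watchdog.reset() // also call on every received SSE event
//   document.addEventListener("visibilitychange", () =>
//     handleVisibilityChange(document.visibilityState, watchdog))
```

With this shape, a stream that survived the background period resets the watchdog again on its next event, while a dead stream is aborted one full timeout after the tab becomes visible.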

How did you verify your code works?

  • Tested on mobile Safari (iOS) and Chrome Android by locking/unlocking device
  • Tested on desktop by switching away from tab for 30s+ then returning
  • Verified SSE stream stays connected when the underlying TCP connection survives sleep
  • Verified dead connections still get detected and reconnected (just takes up to 30s instead of 15s)

Screenshots / recordings

N/A — behavioral change in connection management.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

@BYK BYK requested a review from adamdotdevin as a code owner March 16, 2026 09:51
github-actions (Contributor) commented:

The following comment was made by an LLM; it may be inaccurate:

Based on my search results, I found one potentially related PR:

Related PR:

The current PR (17770) appears to be the primary fix for issue #17769. The other search results mostly returned PR 17770 itself or unrelated connection management PRs.

BYK (Author) commented Mar 16, 2026

Thanks for the pointer — #15436 tackles the same sleep/resume disconnect problem for the TUI event stream. Our PR here is the web app counterpart: the TUI uses a separate SSE consumer (tui/context/sdk.tsx → worker relay), while the web app uses global-sdk.tsx with its own reconnect loop.

The two fixes are complementary — #15436 adds heartbeat timeout + reconnect to the TUI path, this PR fixes the web app path where the heartbeat timeout was too tight (15s vs 10s server interval) and the visibility handler was too aggressive (immediate abort instead of giving the stream a chance to recover).

Both PRs share the same root cause: after sleep/resume the SSE connection is stale but the client either doesn't detect it quickly enough or detects it too aggressively, leading to frozen sessions or unnecessary reconnection churn.
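For readers unfamiliar with the pattern, here is a rough sketch of a reconnect loop guarded by a heartbeat timeout. It is not the code from global-sdk.tsx or tui/context/sdk.tsx: `runReconnectLoop`, `Connect`, and `maxAttempts` are hypothetical, and a real loop would also back off between attempts and re-bootstrap state after reconnecting.

```typescript
// Hypothetical stream factory: yields events until `signal` is aborted.
type Connect = (signal: AbortSignal) => AsyncIterable<string>

// Consumes the stream, aborting and reconnecting whenever no event
// arrives within `heartbeatTimeoutMs`. Returns the number of attempts made.
async function runReconnectLoop(
  connect: Connect,
  heartbeatTimeoutMs: number,
  maxAttempts: number,
  onEvent: (event: string) => void,
): Promise<number> {
  let attempts = 0
  while (attempts < maxAttempts) {
    attempts++
    const attempt = new AbortController()
    let timer: ReturnType<typeof setTimeout> | undefined
    const resetHeartbeat = (): void => {
      clearTimeout(timer)
      timer = setTimeout(() => attempt.abort(), heartbeatTimeoutMs)
    }
    resetHeartbeat()
    try {
      for await (const event of connect(attempt.signal)) {
        resetHeartbeat() // any event, including a heartbeat, proves the stream is alive
        onEvent(event)
      }
    } catch {
      // Aborted or failed: fall through to the next attempt.
    } finally {
      clearTimeout(timer)
    }
  }
  return attempts
}
```

The timeout value only decides how long a silent stream is tolerated; the loop itself is the same for a 15s and a 30s setting.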

Two changes to improve connection stability when a device resumes from
sleep or a browser tab returns from background:

1. Increase HEARTBEAT_TIMEOUT_MS from 15s to 30s. The server sends
   heartbeats every 10s, so 15s left only 5s of margin — under load or
   network jitter a heartbeat could arrive late and trigger a premature
   disconnect. 30s provides a comfortable 20s buffer while still
   detecting genuinely dead connections.

2. On visibilitychange, reset the heartbeat timer instead of immediately
   aborting. Previously, returning to a backgrounded tab after >15s would
   instantly abort the SSE connection and trigger a full reconnect cycle
   (re-bootstrap all directories). Now the stream gets one full heartbeat
   interval to prove it's still alive. If no event arrives within the
   timeout, the timer fires and aborts — same end result but without
   false positives for connections that are still healthy.
@BYK BYK force-pushed the fix/sleep-resume-reconnect branch from b4ff5f5 to 6c9ae2b Compare March 16, 2026 15:05

Development

Successfully merging this pull request may close these issues.

Session state stale after device sleep/resume — heartbeat mismatch causes premature SSE disconnect
