feat(sse): reconnect handler replays buffered events on lastSeenSeq (B7.2)#32685
Conversation
…B7.2) GET /v1/events now accepts an optional lastSeenSeq query param. When a client reconnects with a cursor scoped to a single conversation, the route handler drains the per-conversation ring buffer for events with seq > lastSeenSeq before emitting the first heartbeat. When the cursor is older than the ring's oldest entry, a single stream_resync_required event is emitted so the client can fetch a snapshot via the normal messages API and resume live from the next event. A high-water dedup watermark on the live callback drops any event that races into the subscription with seq <= the largest replayed seq -- broadcastMessage stamps and rings BEFORE publish, so in-flight events mid-replay are guaranteed to already be in the window we just drained. Adds the stream_resync_required ServerMessage variant in a new message-types/stream.ts module, wired into the ServerMessage union via message-protocol.ts. The resync event is emitted directly into the reconnecting subscriber's stream (never via broadcastMessage), carries no seq, and is never fanned out to other subscribers. 7 new tests cover: in-window replay, snapshot-resync fallback when the ring has evicted past the cursor, omitted-param legacy live-only path, dedup against a live event that duplicates a replayed seq, and three malformed-param rejections (empty / non-integer / negative). Follow-up (separate PR, noted on #32676): revisit the replayable:false behavior for targeted events so the intended recipient of a targeted publish doesn't miss it on reconnect. The right fix is to store the publish-time targeting metadata alongside each ring entry and re-apply the filter at replay time.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e967302456
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const window = getReplayWindow(replayConversationId, lastSeenSeq); | ||
| if (window === null) { |
There was a problem hiding this comment.
Signal resync when replay state is missing
When the assistant process restarts, or a conversation's in-memory ring has aged out and been deleted before any new event, getReplayWindow() returns [] rather than null. This path treats that as an in-window replay, sends only the heartbeat, and leaves a reconnecting client holding its old lastSeenSeq; subsequent events for the same persisted conversation restart at low seq values, so replay-aware clients have no resync signal and can miss or discard live updates. Please turn a missing/empty replay state with a nonzero cursor into stream_resync_required before going live.
Useful? React with 👍 / 👎.
… seq jump Per review feedback: the daemon doesn't need to surface a special 'cursor too old' signal over the wire. The client already has snapshot refetch paths (messages API), and it can detect the gap purely on its own by comparing the seq of the first live event after reconnect against its persisted lastSeenSeq. Removing the message type shrinks the protocol surface and keeps the resync-from-DB policy entirely client-side where the state machine for 'should I refetch?' is more natural. Changes: - delete src/daemon/message-types/stream.ts and unwire from message-protocol.ts (no new ServerMessage variant) - in /v1/events reconnect handler, when getReplayWindow returns null (cursor older than ring's oldest), do nothing -- connection goes live as if no cursor was passed - buildAssistantEvent import on the routes file becomes unused; removed - replace the 'snapshot-resync signal' test with one that asserts the cursor-too-old path connects live without any extra frame ahead of the heartbeat - update lastSeenSeq query param description in OpenAPI Adjacent regression sweep: 82/82 SSE/hub/stream-state/framing tests still green.
What
Server-side reconnect handler for the
/v1/eventsSSE stream. The route now accepts an optionallastSeenSeqquery param. When a client reconnects with a cursor scoped to a single conversation, the daemon replays any buffered events withseq > lastSeenSeqbefore going live. If the cursor is older than the ring's oldest entry, the connection just goes live — the client detects the gap from the next event's seq and refetches via the existing messages API.This is Unit 2 of B7. Builds on the per-conversation
seqstamping + bounded ring buffer landed in #32676 (B7.1). Unit 3 (the client side — persistlastSeenSeqin localStorage, send via query param on reconnect, detect seq jumps and refetch) is the next PR.How
Request shape
lastSeenSeqis a non-negative integer. Empty / non-integer / negative values return 400. Omitting the param falls through to the existing live-only behavior — no replay.Replay drain in
start()When
lastSeenSeqis set and the subscription is scoped to a conversation, the handler callsgetReplayWindow(conversationId, lastSeenSeq)before the first heartbeat:highWaterReplaySeq.null→ cursor is older than the ring's oldest entry. Do nothing extra; connection just goes live. The client is expected to detect the gap from the seq jump on its first live event and refetch via the messages API.[]→ cursor is in window but nothing newer; no-op.Dedup against the live callback
broadcastMessagestamps and rings before publish, so any event publish-racing with the replay drain is guaranteed to be in the ring window we just drained. The live callback now drops any event whoseseq <= highWaterReplaySeqto avoid double-delivery.No new message type
No new
ServerMessagevariant — the protocol stays exactly as it was. The "I missed too much" detection lives entirely on the client side where the state machine for "should I refetch?" is more natural.Tests
7 new tests in
runtime-events-sse-reconnect.test.ts:lastSeenSeq=1; stream emits seq=2, seq=3, then heartbeat.lastSeenSeq=0; first frame is the heartbeat (no extra signal, no replay).lastSeenSeqlegacy live-only: ring has events; reconnect without cursor; stream emits heartbeat first.5–7. Malformed params: empty, non-integer (
1.5), negative (-1) →BadRequestError.Adjacent regression sweeps stay green:
Follow-up (separate PR)
Noted on #32676 — the current
replayable: falsedesign (introduced in B7.1) skips ring buffering for targeted events. This means the intended recipient of a targeted publish will miss the event on reconnect. The right long-term fix is to store the publish-time targeting metadata (targetClientId,targetInterfaceId,targetCapability,excludeClientId) alongside each ring entry and re-apply the same filter at replay time. Out of scope here; tracked separately.Out of scope
lastSeenSeqmap in localStorage, send the cursor on reconnect, detect seq jump and trigger refetch). Lands in B7.3.nextSeq. Not needed for the mid-turn refresh case.