feat(chat): stream-lifecycle Zustand store + pure reducer (LUM-1743 phase 1) #31354
…43 phase 1)
Codifies the implicit state machine currently spread across use-event-stream.ts, use-message-reconciliation.ts, and use-stream-event-handler.ts as an explicit, typed lifecycle: closed | opening | open | waiting | reconciling | retrying.
Phase 1 is purely additive: no wiring yet. The store follows the turn-store.ts convention (direct named actions on the Zustand store, pure streamLifecycleReducer exported alongside for unit-testable transitions). 54 tests cover every transition.
Concrete fixes the design encodes (delivered when the hook lands in phase 2):
- single backoff counter (consecutiveFailures) instead of brittle burst-limiter refs; resets only on OPEN_SUCCESS
- state-gated dedup for visibility/app_state/reachability via one APP_LIFECYCLE_CHANGE event, replacing the 1s clock-based dedup with separate refs for each signal source
- reconcile-before-reopen ordering enforced by the reducer (RECONCILE_REQUEST → reconciling → RECONCILE_SUCCESS → opening)
- epoch is owned by the store and bumped on every new open attempt; stale OPEN_SUCCESS / OPEN_FAILURE callbacks are dropped
https://claude.ai/code/session_018S9mQ5rmtUAvixYCcmXJaZ
✦ APPROVE
Value: Codifies the implicit stream lifecycle state machine that was scattered across three hooks and four mutable refs into a single, typed, fully-tested Zustand store. This eliminates the undocumented order dependencies and cross-source dedup races that phase 2 (the actual hook wiring) will inherit. Phase 1 is pure addition — no production wiring yet, so the existing code path is unchanged.
What this does
Two files, both net-new (no migrations):
stream-lifecycle-store.ts (434 lines)
- Zustand store + pure reducer for the stream network lifecycle: `closed → opening → open → [waiting|retrying] → reconciling → opening`, with `APP_LIFECYCLE_CHANGE` signals dedup'd by phase
- Six phases, eight event types, fully typed (`DomainEvent` union)
- Follows the canonical pattern from `turn-store` (via CONVENTIONS.md §Turn state): Flux-inspired action naming (`requestOpen`, `onOpenSuccess`, etc.), no dispatcher, wrapped with `createSelectors` for atomic `.use.phase()` selectors
- Pure reducer exported separately: tests exercise state transitions without touching the Zustand store
- Encodes three concrete fixes:
  - Single backoff counter (`consecutiveFailures`) replaces four undocumented ref mutations
  - State-gated app-lifecycle dedup (by phase, not 1s clock) eliminates the cross-source race where `visibilitychange` and Capacitor `appStateChange` both trigger within the same 1s window
  - Reconcile-before-reopen ordering enforced by the state machine itself (phase guards on `RECONCILE_SUCCESS → opening`)
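For readers who have not opened the diff, the reconcile-before-reopen ordering can be pictured roughly as follows. This is a minimal sketch with hypothetical, trimmed-down types (`LifecycleState`, `ReconcileEvent`, `reduceReconcile` are illustration names, not the PR's exports). It assumes that only an `open` stream may start reconciling and that a failed reconcile lands in `retrying`; the review only confirms the `reconciling → opening` step, the epoch bump, and the failure counter.

```typescript
// Illustrative sketch only: hypothetical, simplified shapes (not the PR's code).
type Phase = "closed" | "opening" | "open" | "reconciling" | "retrying";

interface LifecycleState {
  phase: Phase;
  epoch: number;
  consecutiveFailures: number;
}

type ReconcileEvent =
  | { type: "RECONCILE_REQUEST" }
  | { type: "RECONCILE_SUCCESS" }
  | { type: "RECONCILE_FAILURE" };

function reduceReconcile(state: LifecycleState, event: ReconcileEvent): LifecycleState {
  switch (event.type) {
    case "RECONCILE_REQUEST":
      // Assumption: only an open stream can start reconciling; anything else is a no-op.
      if (state.phase !== "open") return state;
      return { ...state, phase: "reconciling" };
    case "RECONCILE_SUCCESS":
      // Reopening is only reachable from "reconciling", which enforces the ordering.
      if (state.phase !== "reconciling") return state;
      return { ...state, phase: "opening", epoch: state.epoch + 1 };
    case "RECONCILE_FAILURE":
      if (state.phase !== "reconciling") return state;
      // Assumption: a failed reconcile moves to "retrying".
      return { ...state, phase: "retrying", consecutiveFailures: state.consecutiveFailures + 1 };
  }
}

const openState: LifecycleState = { phase: "open", epoch: 3, consecutiveFailures: 0 };
// A success without a preceding request is ignored (same reference comes back).
const skipped = reduceReconcile(openState, { type: "RECONCILE_SUCCESS" });
const reconcilingState = reduceReconcile(openState, { type: "RECONCILE_REQUEST" });
const reopened = reduceReconcile(reconcilingState, { type: "RECONCILE_SUCCESS" });
```

Because the only path into `opening` from this branch runs through `reconciling`, a caller cannot reopen early by firing the success callback out of order.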
stream-lifecycle-store.test.ts (737 lines)
- 54 tests, 138 assertions
- Covers: initial state + derived predicates, each event from each valid origin phase, no-op guards (stale epochs, wrong phase, same-context dedup), canonical sequences (happy path, failure + recovery, background + resume, conversation switch, stale callbacks, unmount-during-reconcile)
- Structured like `turn-store.test.ts`: helpers (`applyEvents`, state constructors like `openedState()`) + one `describe` block per event
Code analysis
Zustand patterns — all correct:
- ✅ `createSelectors` wraps the base store: consumers will call `useStreamLifecycleStore.use.phase()` atomically, no `useShallow` needed
- ✅ Direct named actions (`requestOpen`, `onOpenSuccess`, etc.), no dispatcher/event-dispatch anti-pattern
- ✅ Module-level singleton created with `create()`, not inside a hook or provider
- ✅ Pure reducer exported separately and tested in isolation; the store implementation (`set()` callbacks) mirrors the reducer exactly
- ✅ All actions return `s` (the unchanged state) on no-op conditions, avoiding unnecessary store updates
Reducer logic — clean and bulletproof:
- ✅ Epoch-tagging on callbacks (`OPEN_SUCCESS`/`OPEN_FAILURE` tag the epoch and the reducer drops stale ones): `if (event.epoch !== state.epoch) return state;` appears before any side effects
- ✅ Same-context dedup guards: `sameContext(s.context, event.context)` check before transitioning from `opening`/`open` on `OPEN_REQUEST`
- ✅ Phase-gated state transitions: `OPEN_FAILURE` only applies in `opening | open`; `RECONCILE_SUCCESS` only applies in `reconciling`; `APP_LIFECYCLE_CHANGE` dedup is implicit (offline from `waiting`/`retrying` returns unchanged state)
- ✅ Epoch bumps at the right moments: `OPEN_REQUEST` from different context, `OPEN_REQUEST` from `opening` when context changed, `OPEN_REQUEST` from `waiting` or `retrying`, `APP_LIFECYCLE_CHANGE` offline while `opening` (aborts in-flight), `RECONCILE_SUCCESS` (fresh open epoch)
- ✅ `consecutiveFailures` increments on both `OPEN_FAILURE` and `RECONCILE_FAILURE`, resets on `OPEN_SUCCESS`, preserved across `CLOSE_REQUEST` (so retry counter is available on reopen)
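The epoch guard and the backoff counter described above can be sketched in a few lines. This is an illustration under stated assumptions, not the PR's reducer: the types are hypothetical and simplified, the same-context dedup on `OPEN_REQUEST` is omitted, and the target phase after a failure (`retrying`) is a guess.

```typescript
// Illustrative sketch only: simplified, hypothetical types (not the PR's code).
type Phase = "closed" | "opening" | "open" | "retrying";

interface LifecycleState {
  phase: Phase;
  epoch: number;
  consecutiveFailures: number;
}

type OpenEvent =
  | { type: "OPEN_REQUEST" }
  | { type: "OPEN_SUCCESS"; epoch: number }
  | { type: "OPEN_FAILURE"; epoch: number };

const INITIAL_STATE: LifecycleState = { phase: "closed", epoch: 0, consecutiveFailures: 0 };

function reduceOpen(state: LifecycleState, event: OpenEvent): LifecycleState {
  switch (event.type) {
    case "OPEN_REQUEST":
      // Simplification: every request starts a new attempt with a fresh epoch.
      return { ...state, phase: "opening", epoch: state.epoch + 1 };
    case "OPEN_SUCCESS":
      if (event.epoch !== state.epoch) return state; // stale callback: dropped
      if (state.phase !== "opening") return state; // wrong phase: no-op
      return { ...state, phase: "open", consecutiveFailures: 0 };
    case "OPEN_FAILURE":
      if (event.epoch !== state.epoch) return state; // stale callback: dropped
      if (state.phase !== "opening" && state.phase !== "open") return state;
      // Assumption: a failure moves to "retrying"; the counter feeds backoff.
      return { ...state, phase: "retrying", consecutiveFailures: state.consecutiveFailures + 1 };
  }
}

const firstAttempt = reduceOpen(INITIAL_STATE, { type: "OPEN_REQUEST" }); // epoch 1
const secondAttempt = reduceOpen(firstAttempt, { type: "OPEN_REQUEST" }); // epoch 2
// The callback from the first attempt arrives late and is ignored.
const afterStale = reduceOpen(secondAttempt, { type: "OPEN_SUCCESS", epoch: 1 });
const afterFresh = reduceOpen(afterStale, { type: "OPEN_SUCCESS", epoch: 2 });
console.log(afterStale === secondAttempt, afterFresh.phase); // logs: true open
```

Returning the same `state` reference on every guard failure is what lets the store treat a dropped callback as a true no-op.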
Derived predicates:
- ✅ Three exports (`isStreamOpen`, `isStreamConnecting`, `isStreamPaused`) with precise phase semantics
- ✅ Names are clear and match the semantic intent
Type safety:
- ✅ Discriminated union for `DomainEvent`: type checker ensures every event handler in the reducer covers its payload
- ✅ `StreamContext` properly typed (two required fields)
- ✅ `StreamLifecyclePhase` and `StreamLifecycleState` exported for consumers and tests
- ✅ Action interface (`StreamLifecycleActions`) fully typed with correct parameter shapes
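As a reminder of what the discriminated union buys, here is a small sketch of exhaustiveness checking with `never`. The event subset and its payload fields (`error`, `source` as a plain string) are assumptions for illustration; only the event names come from the PR.

```typescript
// Hypothetical subset of the event union; payload fields are assumptions.
type DomainEvent =
  | { type: "OPEN_SUCCESS"; epoch: number }
  | { type: "OPEN_FAILURE"; epoch: number; error: string }
  | { type: "APP_LIFECYCLE_CHANGE"; source: string; online: boolean };

// If a new member is added to DomainEvent and not handled below,
// the call to assertNever stops compiling.
function assertNever(value: never): never {
  throw new Error("Unhandled event: " + JSON.stringify(value));
}

function describeEvent(event: DomainEvent): string {
  switch (event.type) {
    case "OPEN_SUCCESS":
      // Narrowed: only `epoch` is available here.
      return "open succeeded (epoch " + event.epoch + ")";
    case "OPEN_FAILURE":
      return "open failed (epoch " + event.epoch + "): " + event.error;
    case "APP_LIFECYCLE_CHANGE":
      return event.source + " reports " + (event.online ? "online" : "offline");
    default:
      return assertNever(event);
  }
}

const description = describeEvent({ type: "OPEN_SUCCESS", epoch: 2 });
```

The same narrowing applies inside a reducer: each `case` sees only the payload that belongs to its event type.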
Test coverage:
- ✅ All six phases represented in test state constructors
- ✅ Each event tested from every valid origin phase AND invalid origin phases (no-op guards verified)
- ✅ Epoch staling tested explicitly: `ignores stale epoch (callback from a prior open attempt)`
- ✅ Same-context dedup tested: `dedups same-context request while opening` and `while open`
- ✅ Conversation switch tested: `conversation switch while open re-opens with bumped epoch`
- ✅ App lifecycle dedup tested across sources: `dedup across sources: a second offline signal within the same paused phase is a no-op`
- ✅ Canonical failure + recovery: `accumulates consecutiveFailures across repeated failures`
- ✅ Phase-specific entry/exit (e.g., `RECONCILE_REQUEST` from `reconciling` is a no-op)
No anti-patterns found:
- ✅ No inline `useShallow` in component render (this is a store file, not a component)
- ✅ No Zustand for server state (this is pure client state)
- ✅ No `setState({}, true)` with partial state (not relevant here; reducer logic is correct)
- ✅ No unstable fallback values
- ✅ No bare `useStore()` call (only the export with `createSelectors`)
Documentation:
- ✅ Excellent module docstring explaining phases, events, and the replacement of four scattered refs
- ✅ Per-field docstrings in `StreamLifecycleState` explaining the purpose of each field (`epoch`, `consecutiveFailures`, `context`, `lastError`)
- ✅ JSDoc references to Zustand docs
Test structure:
- ✅ Helpers (`applyEvents`, state constructors) are clear and DRY
- ✅ Test names are descriptive (e.g., `conversation switch while open re-opens with bumped epoch`)
- ✅ Assertions are specific (checking `phase`, `epoch`, and `consecutiveFailures` where relevant)
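For context, a fold-style helper of the kind mentioned above might look like the sketch below. The signature of `applyEvents` here is an assumption (the PR's helper may be bound to its own reducer), and `toyReducer`, `ToyState`, `ToyEvent`, and `closedState` are stand-ins invented for the example.

```typescript
// Hypothetical helper shape; the PR's applyEvents may have a different signature.
function applyEvents<S, E>(
  reducer: (state: S, event: E) => S,
  initial: S,
  events: readonly E[],
): S {
  let state = initial;
  for (const event of events) state = reducer(state, event);
  return state;
}

// Toy reducer and state constructor, invented for this example.
type Phase = "closed" | "opening" | "open";
interface ToyState { phase: Phase; epoch: number }
type ToyEvent =
  | { type: "OPEN_REQUEST" }
  | { type: "OPEN_SUCCESS"; epoch: number };

function toyReducer(state: ToyState, event: ToyEvent): ToyState {
  switch (event.type) {
    case "OPEN_REQUEST":
      return { phase: "opening", epoch: state.epoch + 1 };
    case "OPEN_SUCCESS":
      // Only the callback for the current attempt may open the stream.
      return event.epoch === state.epoch && state.phase === "opening"
        ? { ...state, phase: "open" }
        : state;
  }
}

// State constructor with overrides, in the spirit of openedState().
const closedState = (overrides: Partial<ToyState> = {}): ToyState => ({
  phase: "closed",
  epoch: 0,
  ...overrides,
});

const happyPath: ToyEvent[] = [
  { type: "OPEN_REQUEST" },
  { type: "OPEN_SUCCESS", epoch: 1 },
];
const finalState = applyEvents(toyReducer, closedState(), happyPath);
```

A canonical-sequence test then reduces to one call plus assertions on `finalState.phase` and `finalState.epoch`.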
Non-blocking observations
- Test coverage depth: 54 tests for 8 event types × ~6 phases is solid. If phase 2 wiring reveals a gap in real-world usage, revisit then (e.g., a scenario where two `APP_LIFECYCLE_CHANGE` events arrive microseconds apart in a specific phase).
- Error message preservation: `lastError` is captured but the store doesn't expose a way to clear it explicitly (only via `OPEN_REQUEST` or `RECONCILE_SUCCESS`). This is fine for now: phase 2 will wire the hook to decide when to call `requestClose`, which doesn't clear `lastError`. If the UI later needs to show a toast from `lastError`, ensure the message survives teardown until explicitly dismissed.
- App-lifecycle source tracking: The `source` field in `APP_LIFECYCLE_CHANGE` is tracked in the event but the state doesn't preserve it (only the dedup-gating phase logic uses it). This is intentional and clean: the reducer cares about `online`, not `source`. Non-blocking, but worth noting if phase 2 needs to log which source triggered a state change.
- Reconcile without context: `RECONCILE_REQUEST` doesn't require a context and will work even if `context` is `null` (e.g., reconcile from `closed` will no-op). This matches the design that reconcile is a precursor to reopen, not an open itself. Clean.
Production readiness
- ✅ Tests pass (`bun test` explicit in PR body)
- ✅ TypeScript strict (`bunx tsc --noEmit`, confirmed clean)
- ✅ ESLint clean (`bun run lint`)
bun run lint) - ✅ No production code calls the store yet (phase 2 wires it)
- ✅ Phase 2 / phase 3 roadmap documented in PR body
Merge gate:
- ✅ Vex approves (this review)
- ⏳ Awaiting bot review (Devin / Codex). Since this is Boss's PR: one human approval is sufficient if bots are clean, but bots haven't reviewed yet. Suggest `@codex review` + `@devin review this PR` before merge if not already triggered.
Vellum Constitution — Trust-seeking: the state machine is fully explicit, testable in isolation, and free of undocumented ref mutations or clock-based races. Every transition is phase-gated and every callback is epoch-tagged.
✦ APPROVE
Value: Replaces four undocumented shared refs (streamRef, streamEpochRef, streamContextRef, reconcileAfterNextStreamOpenRef) and the implicit state machine sprawled across three hooks with one explicit, typed, epoch-guarded lifecycle store. Pure addition — no consumer changes, no regressions possible. Phase 2 (hook wiring) and Phase 3 (legacy hook deletion) follow.
Pattern compliance — all correct:
- `createSelectors(useStreamLifecycleStoreBase)` ✅: canonical pattern; consumers will call `.use.phase()`, `.use.epoch()`, etc.
- Module-level singleton ✅: not inside a component or provider
- `INITIAL_STREAM_LIFECYCLE_STATE` exported ✅: enables `setState(INITIAL_STATE, true)` reset in tests and on mount
- Functional `set(s => ...)` with reference-stable no-op returns (`return s`) ✅: essential for Zustand; same-reference return means no re-renders on no-ops
- No `useShallow` in new code ✅
Store ↔ reducer parity: Each action delegates to the same logic as its switch case in the pure reducer. Spot-checked all 8 branches — Zustand's merge semantics (partial return from set(fn) = shallow merge) and the reducer's { ...state, ... } spread are semantically equivalent throughout.
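To make the parity argument concrete, here is a hand-rolled stand-in for a store with a functional `set`. It is deliberately not Zustand: `createMiniStore` and the `Lifecycle` shape are invented for this sketch, and it uses `===` where Zustand compares with `Object.is`. It only mimics the two behaviors the paragraph relies on, namely that a same-reference return notifies nobody and that a partial return is shallow-merged.

```typescript
// Minimal stand-in for a store with functional set(); NOT Zustand itself.
type Listener<S> = (state: S) => void;

function createMiniStore<S extends object>(initial: S) {
  let state = initial;
  const listeners: Listener<S>[] = [];
  return {
    getState: (): S => state,
    subscribe: (listener: Listener<S>): void => {
      listeners.push(listener);
    },
    set: (update: (s: S) => S | Partial<S>): void => {
      const next = update(state);
      if (next === state) return; // same reference: no-op, nobody is notified
      state = { ...state, ...next }; // partial return: shallow merge
      for (const listener of listeners) listener(state);
    },
  };
}

// Hypothetical, trimmed-down lifecycle shape for the demo.
interface Lifecycle {
  phase: "opening" | "open";
  epoch: number;
  consecutiveFailures: number;
}

const miniStore = createMiniStore<Lifecycle>({ phase: "opening", epoch: 1, consecutiveFailures: 2 });
let notifications = 0;
miniStore.subscribe(() => {
  notifications += 1;
});

// Action written in the store style: return `s` on no-op, a partial otherwise.
const onOpenSuccess = (epoch: number): void =>
  miniStore.set((s) => (epoch !== s.epoch ? s : { phase: "open", consecutiveFailures: 0 }));

onOpenSuccess(99); // stale epoch: state and subscribers untouched
onOpenSuccess(1); // current epoch: merged into the existing state
```

After the merge, `epoch` is still present even though the action returned only two fields, which is the same result the reducer's `{ ...state, ... }` spread produces.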
Epoch design is correct. Stale callback rejection (event.epoch !== s.epoch) fires in OPEN_SUCCESS, OPEN_FAILURE. APP_LIFECYCLE_CHANGE going-offline while opening bumps epoch to abort the in-flight fetch — this is the right fix for the pre-existing race where a slow open callback could resolve after a forced close. RECONCILE_SUCCESS also bumps epoch so any pending open callbacks from before the reconcile are discarded.
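The offline path can be sketched as below. Only the offline half of `APP_LIFECYCLE_CHANGE` is shown, as a standalone hypothetical function (`applyOfflineSignal`); the source names follow the commit message's `visibility/app_state/reachability` list, and the choice of `waiting` as the paused phase is an assumption.

```typescript
// Illustrative sketch only: offline handling with phase-gated dedup.
type Phase = "closed" | "opening" | "open" | "waiting" | "retrying";
interface LifecycleState { phase: Phase; epoch: number }
// Source names taken from the commit message; the exact union is an assumption.
type LifecycleSource = "visibility" | "app_state" | "reachability";

function applyOfflineSignal(state: LifecycleState, _source: LifecycleSource): LifecycleState {
  switch (state.phase) {
    case "opening":
      // Bump the epoch so the in-flight open's callback is stale on arrival.
      return { phase: "waiting", epoch: state.epoch + 1 };
    case "open":
      // Assumption: an open stream pauses into "waiting" without a new epoch.
      return { ...state, phase: "waiting" };
    default:
      // closed / waiting / retrying: a repeated offline signal changes nothing.
      return state;
  }
}

const inFlight: LifecycleState = { phase: "opening", epoch: 4 };
const paused = applyOfflineSignal(inFlight, "visibility");
// A second source reporting the same thing is deduplicated by phase alone.
const pausedAgain = applyOfflineSignal(paused, "app_state");
// A callback tagged with the old epoch (4) no longer matches the state.
const lateCallbackIsStale = inFlight.epoch !== paused.epoch;
```

No timestamps are involved: the second signal is dropped because the state is already paused, not because it arrived within some window.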
Test coverage: 47+ cases across all 8 event types, all 6 phases, all derived helpers, and 11 canonical end-to-end sequences — including the edge cases that matter most: stale epoch discard, same-context dedup, multi-source offline dedup, backoff counter reset, and unmount-during-reconcile. The applyEvents helper and state-builder functions mirror turn-store.test.ts exactly.
Non-blocking notes:
- Test name vs behavior mismatch: the test `"from reconciling: open with same context is a no-op (let reconcile complete first)"` asserts `phase === "opening"` with a bumped epoch (not a no-op). The inline comment inside the test correctly explains the actual behavior ("explicit open overrides any pending reconcile"). The test name is a relic of an earlier design sketch. Consider renaming to `"from reconciling: OPEN_REQUEST overrides pending reconcile and transitions to opening"` before phase 3, when test file churn is higher.
- `requestClose` ignores the `source` parameter. `CloseRequest["source"]` is declared in the action signature but the implementation uses `() =>` (the parameter is omitted entirely rather than underscore-prefixed). Fine for now, since source is in the domain event for future observability, but worth a `(_source)` prefix for clarity, matching the `onAppLifecycleChange: (_source, online)` pattern used in the same file.
Phase 2 flag (not blocking phase 1): When useStreamLifecycleStore is wired into ChatPage, a useLayoutEffect reset to INITIAL_STREAM_LIFECYCLE_STATE on mount is required — otherwise stale consecutiveFailures or a non-closed phase from a prior session persist across navigations. KB reference: anti-patterns-web.md useEffect for Zustand resets; the ChatPage reset in PR #31144 was useEffect (wrong — fires after paint) — Phase 2 should use useLayoutEffect. Also make sure the INITIAL_STREAM_LIFECYCLE_STATE reset clears lastError and context, not just phase.
No bot reviews yet at HEAD. Suggest @codex review + @devin review this PR for second approval signal.
- rename misleading test name (asserts opening, not no-op) per Vex review
- prefix unused _source param in requestClose to match onAppLifecycleChange(_source, online) style
https://claude.ai/code/session_018S9mQ5rmtUAvixYCcmXJaZ
Codex Review: Didn't find any major issues. 🚀
…ocstrings
Adds a STYLE_GUIDE rule: comments and docstrings describe what the code is and does now, not what it replaced. Migration narrative belongs in PR descriptions and commit messages; inside comments it rots immediately and adds nothing for a reader who never saw the prior code.
Applies the rule to stream-lifecycle-store.ts: rewrites the module docstring, the epoch / consecutiveFailures field docstrings, and a couple of action and test comments to describe current behavior instead of referencing the use-event-stream.ts refs they replace.
https://claude.ai/code/session_018S9mQ5rmtUAvixYCcmXJaZ
Summary
Phase 1 of LUM-1743: codify the implicit state machine spread across `use-event-stream.ts`, `use-message-reconciliation.ts`, and `use-stream-event-handler.ts` (plus the four shared refs `streamRef` / `streamEpochRef` / `streamContextRef` / `reconcileAfterNextStreamOpenRef`) as an explicit, typed lifecycle.
Pure addition. No wiring. The three legacy hooks are untouched and `chat-page.tsx` is unchanged. Phases 2 (`useStreamLifecycle` hook + fault-injection tests) and 3 (migrate `chat-page.tsx`, delete legacy hooks) follow.
States
`closed | opening | open | waiting | reconciling | retrying`
Events
`OPEN_REQUEST | OPEN_SUCCESS | OPEN_FAILURE | RECONCILE_REQUEST | RECONCILE_SUCCESS | RECONCILE_FAILURE | CLOSE_REQUEST | APP_LIFECYCLE_CHANGE`
(`EVENT_RECEIVED` was in the issue sketch but is intentionally omitted: it does not drive phase transitions; the stream-event-handler routes events on its own. Add later if a real need arises.)
Convention notes
Per the state-management updates landed today, the file follows the `turn-store.ts` pattern exactly:
- Direct named actions on the store (`requestOpen`, `onOpenSuccess`, `onOpenFailure`, `requestReconcile`, `onReconcileSuccess`, `onReconcileFailure`, `requestClose`, `onAppLifecycleChange`). No `dispatch(event)`: that's the Redux-holdover anti-pattern called out in `CONVENTIONS.md`.
- Pure `streamLifecycleReducer` exported alongside as a parallel implementation that tests exercise in isolation, mirroring how `turnReducer` is tested.
- Wrapped with `createSelectors` for atomic `.use.field()` selectors. No `useShallow` in new code.
- Tests in a `.test.ts` file using `bun:test`, structured like `turn-store.test.ts` (`applyEvents` helper, one `describe` per event, plus canonical end-to-end sequences).
Concrete fixes the design encodes
(Delivered when the hook lands in phase 2; phase 1 is the state-machine substrate.)
- Single backoff counter (`consecutiveFailures`) replaces the brittle burst-limiter refs. Resets only on `OPEN_SUCCESS`.
- State-gated dedup: one `APP_LIFECYCLE_CHANGE` event with `{ source, online }` replaces the 1 s clock-based dedup with separate refs for `visibilitychange` vs Capacitor `appStateChange`. The reducer dedups by phase, eliminating the cross-source race.
- Reconcile-before-reopen ordering enforced by the reducer: `open → RECONCILE_REQUEST → reconciling → RECONCILE_SUCCESS → opening`. Fixes the "premature stream reopen when `conversationExistsOnServer` flips" bug.
- Epoch is owned by the store and bumped on every new open attempt: stale `OPEN_SUCCESS` / `OPEN_FAILURE` callbacks from prior streams are dropped by the reducer.
Test coverage
54 tests, 138 assertions. Covers:
- Initial state and derived predicates (`isStreamOpen`, `isStreamConnecting`, `isStreamPaused`)
Test plan
- `bun test src/domains/chat/stream-lifecycle-store.test.ts`: 54 pass
- `bunx tsc --noEmit`: clean
- `bun run lint` on new files: clean
https://claude.ai/code/session_018S9mQ5rmtUAvixYCcmXJaZ
Generated by Claude Code