Skip to content

fix(chat): wire assistant_turn_start, key appendTextDelta on messageId (B5)#32612

Merged
dvargasfuertes merged 1 commit into
mainfrom
apollo/b5-fe-id-only-reconcile
May 30, 2026
Merged

fix(chat): wire assistant_turn_start, key appendTextDelta on messageId (B5)#32612
dvargasfuertes merged 1 commit into
mainfrom
apollo/b5-fe-id-only-reconcile

Conversation

@vellum-apollo-bot
Copy link
Copy Markdown
Contributor

What

B3 (#32326) shipped daemon-side reserveMessage at llm_call_started — the daemon writes content: "[]" to SQLite and emits assistant_turn_start { messageId } from event zero of each LLM call. The web client wasn't ready: the case in use-stream-event-handler.ts was a literal no-op with a TODO promising to wire it, and appendTextDelta keyed on tail position rather than the event's messageId.

When the 5s reconcile poll fired before message_complete (any turn that takes >5s — the common case), it fetched the reserved row from /v1/assistants/{id}/messages/. That row landed in client state as an empty assistant message with no isStreaming flag. The next text delta arrived carrying the matching messageId — but tailIsStreamingAssistant returned false on the snapshot row, so createStreamingBubble pushed a NEW row with the same id. Two siblings, same id, both leaked through the sanitize pipeline (which only dedups different ids with matching content).

Changes

1. handleAssistantTurnStart — replaces the no-op in use-stream-event-handler.ts. Stamps currentAssistantMessageIdRef and, if a row with the event's messageId already exists in messages, flips it to isStreaming: true so appendTextDelta won't see a non-streaming tail. No-op on the row array in the common case where SSE precedes reconcile.

2. appendTextDelta keys on messageId — when messageId is present (B2/B3 invariant), looks up the matching assistant row by id and appends into it regardless of tail position. Falls back to tail-based decisioning when messageId is absent (pre-B2 daemons, until B4's floor bump pins them out). Extracts appendTextIntoRow shared by both branches and stamps isStreaming: true on the appended row.

Either change alone fixes the screenshot scenario. Both together cover the case where the reserved row isn't at the tail when a delta lands (e.g. mid-turn reconcile completes after a separate user message has been optimistically appended).

Tests

Reproduces the screenshot scenario directly (reserved row pulled in by reconcile → delta with matching id extends it, no duplicate). Plus the new-bubble-on-mismatched-id case (LLM call boundary inside a turn), the common consecutive-same-id deltas case, and four tests for handleAssistantTurnStart (ref stamp, flip-to-streaming, no-op when no match, no-op when already streaming).

Replaces the old "first-id-wins" test that was guarding a pre-B3 wire contract — under B3, deltas across LLM calls are separated by message_complete, so the synthetic scenario it covered no longer matches the wire.

Workstream

Solve Chat SSE, B-series PR 5.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5c5d20121c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

(m) => m.role === "assistant" && m.id === messageId,
);
if (idx >= 0) return appendTextIntoRow(prev, idx, text);
return createStreamingBubble(prev, text, messageId);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep later LLM-call deltas in the merged turn

When a turn has multiple LLM calls (for example text → tool → text), the daemon reserves a new messageId for the later call but the history endpoint still merges consecutive assistant rows onto the first row (mergeConsecutiveAssistantMessages in assistant/src/conversations/message-consolidation.ts). This branch now creates a separate client bubble for the later messageId; after the next reconcile/reload the server returns the merged first row while the local later-id row is preserved because its id is absent from the returned list, so the later answer/tool summary is shown twice. The previous first-id-wins behavior avoided this by keeping the whole display turn in one row until the server merge caught up.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks legit. In fact, messageId?: string should now be required

(m) => m.role === "assistant" && m.id === messageId,
);
if (idx >= 0) return appendTextIntoRow(prev, idx, text);
return createStreamingBubble(prev, text, messageId);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks legit. In fact, messageId?: string should now be required

let touched = false;
const next = prev.map((m) => {
if (m.role !== "assistant" || m.id !== event.messageId) return m;
if (m.isStreaming) return m;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use this PR as the opportunity to remove isStreaming as well. Instead, we should be replacing the latest isOptimistic assistant message with this new one

@vellum-apollo-bot vellum-apollo-bot Bot force-pushed the apollo/b5-fe-id-only-reconcile branch from 5c5d201 to df2d243 Compare May 30, 2026 12:02
B3 ([#32326](#32326)) shipped daemon-side `reserveMessage` at `llm_call_started` —
the daemon writes `content: "[]"` to SQLite and emits `assistant_turn_start
{ messageId }` from event zero of each LLM call. The web client wasn't
ready: the case in `use-stream-event-handler.ts` was a literal no-op with
a TODO promising to wire it, and `appendTextDelta` keyed on tail position
rather than the event's `messageId`.

When the 5s reconcile poll fired before `message_complete` (any turn that
takes >5s — the common case), it fetched the reserved row from
`/v1/assistants/{id}/messages/`. That row landed in client state as an
empty assistant message with no `isStreaming` flag. The next text delta
arrived carrying the matching `messageId` — but `tailIsStreamingAssistant`
returned false on the snapshot row, so `createStreamingBubble` pushed a
NEW row with the same id. Two siblings, same id, both leaked through the
sanitize pipeline (which only dedups *different* ids with matching
content).

Two changes:

1. `handleAssistantTurnStart` — stamps `currentAssistantMessageIdRef` and,
   if a row with the event's `messageId` already exists in `messages`,
   flips it to `isStreaming: true` so `appendTextDelta` won't see a
   non-streaming tail. No-op on the row array in the common case where
   SSE precedes reconcile.

2. `appendTextDelta` — when `messageId` is present (B2/B3 invariant),
   looks up the matching assistant row by id and appends into it
   regardless of tail position. Falls back to tail-based decisioning
   when `messageId` is absent (pre-B2 daemons, until B4's floor bump
   pins them out).

Either change alone fixes the screenshot scenario. Both together cover
the case where the reserved row isn't at the tail when a delta lands
(e.g. mid-turn reconcile completes after a separate user message has
been optimistically appended).

Extracts the per-row append into `appendTextIntoRow` shared by both
branches and stamps `isStreaming: true` on the appended row — needed for
the reconcile-pulled-row case where the snapshot landed without the
flag.

Tests cover the screenshot scenario directly (reserved row pulled in
by reconcile → delta with matching id extends it, no duplicate), the
new-bubble-on-mismatched-id case (LLM call boundary inside a turn), and
the common consecutive-same-id deltas case. `handleAssistantTurnStart`
has tests for the ref stamp, the flip-to-streaming behavior, the no-op
when no row matches, and the no-op when the row is already streaming.

Replaces the old "first-id-wins" test that was guarding a pre-B3 wire
contract — under B3, deltas across LLM calls are separated by
`message_complete`, so the synthetic scenario it covered no longer
matches the wire.
@vellum-apollo-bot vellum-apollo-bot Bot force-pushed the apollo/b5-fe-id-only-reconcile branch from df2d243 to 7c6f056 Compare May 30, 2026 14:33
@dvargasfuertes dvargasfuertes merged commit a4e4989 into main May 30, 2026
7 checks passed
@dvargasfuertes dvargasfuertes deleted the apollo/b5-fe-id-only-reconcile branch May 30, 2026 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant