fix(chat): wire assistant_turn_start, key appendTextDelta on messageId (B5) by vellum-apollo-bot[bot] · Pull Request #32612 · vellum-ai/vellum-assistant

vellum-apollo-bot · 2026-05-29T19:50:28Z

What

B3 (#32326) shipped daemon-side reserveMessage at llm_call_started — the daemon writes content: "[]" to SQLite and emits assistant_turn_start { messageId } from event zero of each LLM call. The web client wasn't ready: the case in use-stream-event-handler.ts was a literal no-op with a TODO promising to wire it, and appendTextDelta keyed on tail position rather than the event's messageId.

When the 5s reconcile poll fired before message_complete (any turn that takes >5s — the common case), it fetched the reserved row from /v1/assistants/{id}/messages/. That row landed in client state as an empty assistant message with no isStreaming flag. The next text delta arrived carrying the matching messageId — but tailIsStreamingAssistant returned false on the snapshot row, so createStreamingBubble pushed a NEW row with the same id. Two siblings, same id, both leaked through the sanitize pipeline (which only dedups different ids with matching content).

Changes

1. handleAssistantTurnStart — replaces the no-op in use-stream-event-handler.ts. Stamps currentAssistantMessageIdRef and, if a row with the event's messageId already exists in messages, flips it to isStreaming: true so appendTextDelta won't see a non-streaming tail. No-op on the row array in the common case where SSE precedes reconcile.

2. appendTextDelta keys on messageId — when messageId is present (B2/B3 invariant), looks up the matching assistant row by id and appends into it regardless of tail position. Falls back to tail-based decisioning when messageId is absent (pre-B2 daemons, until B4's floor bump pins them out). Extracts appendTextIntoRow shared by both branches and stamps isStreaming: true on the appended row.

Either change alone fixes the screenshot scenario. Both together cover the case where the reserved row isn't at the tail when a delta lands (e.g. mid-turn reconcile completes after a separate user message has been optimistically appended).

Tests

Reproduces the screenshot scenario directly (reserved row pulled in by reconcile → delta with matching id extends it, no duplicate). Plus the new-bubble-on-mismatched-id case (LLM call boundary inside a turn), the common consecutive-same-id deltas case, and four tests for handleAssistantTurnStart (ref stamp, flip-to-streaming, no-op when no match, no-op when already streaming).

Replaces the old "first-id-wins" test that was guarding a pre-B3 wire contract — under B3, deltas across LLM calls are separated by message_complete, so the synthetic scenario it covered no longer matches the wire.

Workstream

Solve Chat SSE, B-series PR 5.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5c5d20121c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T19:53:30Z

+      (m) => m.role === "assistant" && m.id === messageId,
+    );
+    if (idx >= 0) return appendTextIntoRow(prev, idx, text);
+    return createStreamingBubble(prev, text, messageId);


Keep later LLM-call deltas in the merged turn

When a turn has multiple LLM calls (for example text → tool → text), the daemon reserves a new messageId for the later call but the history endpoint still merges consecutive assistant rows onto the first row (mergeConsecutiveAssistantMessages in assistant/src/conversations/message-consolidation.ts). This branch now creates a separate client bubble for the later messageId; after the next reconcile/reload the server returns the merged first row while the local later-id row is preserved because its id is absent from the returned list, so the later answer/tool summary is shown twice. The previous first-id-wins behavior avoided this by keeping the whole display turn in one row until the server merge caught up.

Useful? React with 👍 / 👎.

this looks legit. In fact, messageId?: string should now be required

dvargasfuertes · 2026-05-29T20:31:57Z

+      (m) => m.role === "assistant" && m.id === messageId,
+    );
+    if (idx >= 0) return appendTextIntoRow(prev, idx, text);
+    return createStreamingBubble(prev, text, messageId);


this looks legit. In fact, messageId?: string should now be required

dvargasfuertes · 2026-05-29T20:32:56Z

+    let touched = false;
+    const next = prev.map((m) => {
+      if (m.role !== "assistant" || m.id !== event.messageId) return m;
+      if (m.isStreaming) return m;


I think we can use this PR as the opportunity to remove isStreaming as well. Instead, we should be replacing the latest isOptimistic assistant message with this new one

B3 ([#32326](#32326)) shipped daemon-side `reserveMessage` at `llm_call_started` — the daemon writes `content: "[]"` to SQLite and emits `assistant_turn_start { messageId }` from event zero of each LLM call. The web client wasn't ready: the case in `use-stream-event-handler.ts` was a literal no-op with a TODO promising to wire it, and `appendTextDelta` keyed on tail position rather than the event's `messageId`. When the 5s reconcile poll fired before `message_complete` (any turn that takes >5s — the common case), it fetched the reserved row from `/v1/assistants/{id}/messages/`. That row landed in client state as an empty assistant message with no `isStreaming` flag. The next text delta arrived carrying the matching `messageId` — but `tailIsStreamingAssistant` returned false on the snapshot row, so `createStreamingBubble` pushed a NEW row with the same id. Two siblings, same id, both leaked through the sanitize pipeline (which only dedups *different* ids with matching content). Two changes: 1. `handleAssistantTurnStart` — stamps `currentAssistantMessageIdRef` and, if a row with the event's `messageId` already exists in `messages`, flips it to `isStreaming: true` so `appendTextDelta` won't see a non-streaming tail. No-op on the row array in the common case where SSE precedes reconcile. 2. `appendTextDelta` — when `messageId` is present (B2/B3 invariant), looks up the matching assistant row by id and appends into it regardless of tail position. Falls back to tail-based decisioning when `messageId` is absent (pre-B2 daemons, until B4's floor bump pins them out). Either change alone fixes the screenshot scenario. Both together cover the case where the reserved row isn't at the tail when a delta lands (e.g. mid-turn reconcile completes after a separate user message has been optimistically appended). Extracts the per-row append into `appendTextIntoRow` shared by both branches and stamps `isStreaming: true` on the appended row — needed for the reconcile-pulled-row case where the snapshot landed without the flag. Tests cover the screenshot scenario directly (reserved row pulled in by reconcile → delta with matching id extends it, no duplicate), the new-bubble-on-mismatched-id case (LLM call boundary inside a turn), and the common consecutive-same-id deltas case. `handleAssistantTurnStart` has tests for the ref stamp, the flip-to-streaming behavior, the no-op when no row matches, and the no-op when the row is already streaming. Replaces the old "first-id-wins" test that was guarding a pre-B3 wire contract — under B3, deltas across LLM calls are separated by `message_complete`, so the synthetic scenario it covered no longer matches the wire.

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

dvargasfuertes requested changes May 29, 2026

View reviewed changes

vellum-apollo-bot Bot force-pushed the apollo/b5-fe-id-only-reconcile branch from 5c5d201 to df2d243 Compare May 30, 2026 12:02

vellum-apollo-bot Bot force-pushed the apollo/b5-fe-id-only-reconcile branch from df2d243 to 7c6f056 Compare May 30, 2026 14:33

dvargasfuertes approved these changes May 30, 2026

View reviewed changes

dvargasfuertes merged commit a4e4989 into main May 30, 2026
7 checks passed

dvargasfuertes deleted the apollo/b5-fe-id-only-reconcile branch May 30, 2026 14:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(chat): wire assistant_turn_start, key appendTextDelta on messageId (B5)#32612

fix(chat): wire assistant_turn_start, key appendTextDelta on messageId (B5)#32612
dvargasfuertes merged 1 commit into
mainfrom
apollo/b5-fe-id-only-reconcile

vellum-apollo-bot Bot commented May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

dvargasfuertes May 29, 2026

Uh oh!

dvargasfuertes May 29, 2026

Uh oh!

dvargasfuertes May 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vellum-apollo-bot Bot commented May 29, 2026

What

Changes

Tests

Workstream

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

dvargasfuertes May 29, 2026

Choose a reason for hiding this comment

Uh oh!

dvargasfuertes May 29, 2026

Choose a reason for hiding this comment

Uh oh!

dvargasfuertes May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant