chore(wire): drop displayMessageId from message_complete + generation_handoff#32012
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6147541994
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| finalizeMessageComplete(prev, { | ||
| content: event.content, | ||
| displayMessageId, | ||
| displayMessageId: event.messageId, |
There was a problem hiding this comment.
Preserve legacy display-message fallback during rollout
Using event.messageId directly here drops compatibility with older daemons that still send messageId as the last persisted row and displayMessageId as the merged-history anchor. In a mixed-version deployment (fresh web, stale daemon), multi-call turns can finalize the live bubble under the wrong id, which then causes reconciliation/refresh to mismatch rows (duplicate or misplaced assistant content/attachments). Keep the fallback (displayMessageId ?? messageId) until daemon and web upgrades are guaranteed atomic.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
codex is right, this is a regression until we remove displayMessageId in finalizeMessageComplete. let's update this to finalizeMessageComplete(prev, event) and continue with the other improvements we identified in this PR
There was a problem hiding this comment.
Addressed in 5e23f80. Daemon reverted to messageId: state.lastAssistantMessageId. Web finalizeMessageComplete(prev, event) is now role-based: tail user → new bubble; tail assistant → keep tail.id. appendTextDelta line 119 flipped to id: last.id ?? messageId so the first id locks in across multi-LLM-call turns — the anchor is preserved client-side by arrival order.
There was a problem hiding this comment.
Addressed in 5e23f80. Daemon reverted to messageId: state.lastAssistantMessageId. Web finalizeMessageComplete(prev, event) is now role-based: tail user → new bubble; tail assistant → keep tail.id. appendTextDelta line 119 flipped to id: last.id ?? messageId so the first id locks in across multi-LLM-call turns — the anchor is preserved client-side by arrival order.
| finalizeMessageComplete(prev, { | ||
| content: event.content, | ||
| displayMessageId, | ||
| displayMessageId: event.messageId, |
There was a problem hiding this comment.
codex is right, this is a regression until we remove displayMessageId in finalizeMessageComplete. let's update this to finalizeMessageComplete(prev, event) and continue with the other improvements we identified in this PR
| const displayMessageId = event.displayMessageId ?? event.messageId; | ||
| ctx.setMessages((prev) => stopStreaming(prev, { displayMessageId })); | ||
| ctx.setMessages((prev) => | ||
| stopStreaming(prev, { displayMessageId: event.messageId }), |
There was a problem hiding this comment.
same thing here, we should do stopStreaming(prev, event) and make sure we're not reading displayMessageId
There was a problem hiding this comment.
Reply to Vargas comment 3299172021 (line 125 / stopStreaming): addressed in 5e23f80 — stopStreaming(prev, event) accepts the event but never reads displayMessageId (or anything else). Just flips isStreaming: false on the streaming assistant tail and keeps tail.id.
| ...(state.lastAssistantMessageId | ||
| ? { messageId: state.lastAssistantMessageId } | ||
| : {}), |
There was a problem hiding this comment.
these should still be state.lastAssistantMessageId, delete clientDisplayMessageId support in this method
There was a problem hiding this comment.
Addressed in 5e23f80. Reverted both emit sites to messageId: state.lastAssistantMessageId and dropped clientDisplayMessageId + the getClientDisplayMessageId helper. With that helper gone, firstAssistantMessageId on EventHandlerState became dead state — removed from the type, init, and assignment site (state-slimming win). Anchor preservation now lives entirely on the web side via appendTextDelta's first-id-wins lock + role-based finalizeMessageComplete.
|
Acknowledged on all three. The full picture I missed when separating these: Why the daemon-side anchor is going away: With the role-based Why merging is necessary now (not just nice-to-have): Without the One missing piece I want to flag: This PR will then carry:
Rolling now. |
6147541 to
5e23f80
Compare
|
Rolled — 5e23f80. Summary of the merged-scope PR: Daemon revert (per line 3302 feedback):
Web — anchor preservation moves client-side:
Tests: new coverage for the role-based shape, the Quality gates: assistant tsc clean. Web tsc clean (only the same pre-existing generated-API errors that exist on main). 43/43 stream-message-updaters tests pass. 71/71 event-parser tests pass. Broader chat-domain suite shows the same 12 fail / 2 errors baseline as main — pre-existing isolation flakiness, not introduced here. PR queue update: 2b.3 (remove |
dvargasfuertes
left a comment
There was a problem hiding this comment.
fix failing CI runs
| // First id wins — never overwritten. In a multi-LLM-call turn the | ||
| // daemon advances its `messageId` between calls (each call persists a | ||
| // new assistant row), but the merged history view collapses them to | ||
| // the first row's id. Locking bubble.id to the first id keeps the | ||
| // client view aligned with that merge. |
There was a problem hiding this comment.
no need for this comment
| opts?: { displayMessageId?: string }, | ||
| _event?: GenerationHandoffEvent, |
There was a problem hiding this comment.
we can delete this arg now actually
There was a problem hiding this comment.
Done in af96713 — stopStreaming(prev) is now arg-free. Handler call site updated; _event stays on the handleGenerationHandoff signature since the dispatcher passes it in, but underscored to mark it unused.
Today the daemon emits two ids on each terminal turn event: - `messageId` = the most recent persisted assistant row id - `displayMessageId` = the merged-history anchor (first row id) The web client always prefers `displayMessageId` via the fallback `event.displayMessageId ?? event.messageId`. That fallback is the contract that keeps live-stream ids aligned with the post-merge history rows the messages route returns. Two fields, one effective meaning. Collapse to one — and move anchor preservation entirely client-side: **Wire.** Drop `displayMessageId` from `MessageComplete` + `GenerationHandoff` on both daemon and web types. The daemon now emits the single `messageId: state.lastAssistantMessageId` (no anchor logic), shedding `firstAssistantMessageId` from `EventHandlerState` and the `getClientDisplayMessageId` helper that was its sole reader. **appendTextDelta.** Flip `id: messageId ?? last.id` to `id: last.id ?? messageId` — first id wins, never overwritten. In a multi-LLM-call turn the daemon advances its row id between calls, but the bubble's id stays locked to the first id seen. This is the anchor. **finalizeMessageComplete.** Rewritten role-based — accepts the event directly (`finalizeMessageComplete(prev, event)`): - tail user (or empty) → push a new finalized assistant bubble with `event.messageId` - tail assistant → flip `isStreaming: false`, finalize running tool calls, merge in content/attachments, **keep `tail.id`**. Subsequent `message_complete` events from later LLM calls in the same agent turn fold into the same bubble — mirroring the daemon's server-side merge. **stopStreaming.** Same simplification — accepts the event, flips `isStreaming: false`, never stamps `event.messageId` onto the bubble. **Handlers.** `handleMessageComplete` → `finalizeMessageComplete(prev, event)`. `handleGenerationHandoff` → `stopStreaming(prev, event)`. No id-conversion glue at the call sites anymore. Tests cover the new role-based shape, the anchor-lock in `appendTextDelta`, the multi-LLM-call append case, and inbound legacy-`displayMessageId` tolerance. Workstream: Chat Message State Slimming (PR 2b.1, now also folding in 2b.2 per review feedback on the prior commit).
5e23f80 to
af96713
Compare
…iant PR 2b.3 of the chat-state-slimming workstream. With the anchor invariant landed in 2b.1 (#32012) — assistant emits messageId = state.lastAssistantMessageId, client preserves the anchor via first-id-wins lock in appendTextDelta and role-based finalize in finalizeMessageComplete — the post-turn reconciliation loop that fired on every message_complete and activity_state idle is now redundant. The live client state is already correct; the refetch existed only to paper over id drift. Removes startReconciliationLoop calls from: - handleAssistantActivityState (idle phase) - handleMessageComplete Drops the now-unused epoch parameter from both handler signatures. The dispatcher still uses epoch for stale-epoch guards on the dispatch side, and reconciliation still fires on load / switch / SSE reopen / POST-resolve confirmation. cancelReconciliation calls remain since those still-spawned loops can be cancelled by text deltas and generation handoff.
…iant (#32024) PR 2b.3 of the chat-state-slimming workstream. With the anchor invariant landed in 2b.1 (#32012) — assistant emits messageId = state.lastAssistantMessageId, client preserves the anchor via first-id-wins lock in appendTextDelta and role-based finalize in finalizeMessageComplete — the post-turn reconciliation loop that fired on every message_complete and activity_state idle is now redundant. The live client state is already correct; the refetch existed only to paper over id drift. Removes startReconciliationLoop calls from: - handleAssistantActivityState (idle phase) - handleMessageComplete Drops the now-unused epoch parameter from both handler signatures. The dispatcher still uses epoch for stale-epoch guards on the dispatch side, and reconciliation still fires on load / switch / SSE reopen / POST-resolve confirmation. cancelReconciliation calls remain since those still-spawned loops can be cancelled by text deltas and generation handoff. Co-authored-by: vellum-apollo-bot[bot] <206299977+vellum-apollo-bot[bot]@users.noreply.github.com>
Workstream: Chat Message State Slimming → PR 2b.1 (1 of 3 in the wire/handler split that replaces #31994).
What
The terminal turn events (
message_complete,generation_handoff) currently carry two id fields:The web client always prefers
displayMessageIdvia the fallbackevent.displayMessageId ?? event.messageId. That fallback is the contract that keeps the live-stream id aligned with the merged history row the/v1/assistants/{id}/messages/route returns. Two fields, one effective meaning.This PR collapses them to one. The daemon now sends
messageId = anchordirectly; the wiredisplayMessageIdfield is removed. The web side drops the fallback.Why
When a single agent turn makes multiple LLM calls, the daemon persists multiple assistant rows in
messagesandmergeConsecutiveAssistantMessagescollapses them at read time to the first row (the "anchor"). The wire was leaking that pre-merge / post-merge ambiguity to the client. After this PR,messageIdon terminal events is unambiguously the id the merged history view returns — same id whether you came from SSE or refresh.Scope discipline
firstAssistantMessageId/lastAssistantMessageIdcontinue to live onEventHandlerState. Only the choice of which one gets stamped onto outbound events shifts.finalizeMessageComplete's param is still nameddisplayMessageId?:even though it's now fedevent.messageId. That rename + the function body refactor (role-based branch) is queued as the next PR in this split. Keeping this diff surgical.Changes
Daemon — wire types (
assistant/src/daemon/message-types/)messages.ts:MessageCompletedropsdisplayMessageId?:. JSDoc onmessageId?:updated to call out anchor semantics.conversations.ts: same forGenerationHandoff.Daemon — emit sites (
assistant/src/daemon/conversation-agent-loop.ts:3289-3333)messageIdnow sourced fromclientDisplayMessageId(=getClientDisplayMessageId(state)) for both terminal events. The separatedisplayMessageIdspread blocks are gone.Web — types + parser (
apps/web/src/domains/chat/api/)event-types.ts:MessageCompleteEventandGenerationHandoffEventdropdisplayMessageId?:.event-parser.ts: stops parsing the legacy field.displayMessageIdis silently ignored without breaking the rest of the payload (useful during rollout if a stale daemon is paired with a fresh web).Web — handler (
apps/web/src/domains/chat/utils/stream-handlers/message-handlers.ts)handleMessageCompleteandhandleGenerationHandoffstop applyingevent.displayMessageId ?? event.messageId. They now passevent.messageIdstraight through.displayMessageId→messageId.Testing
assistant tsc: ✅ cleanapps/web tsc: ✅ clean (pre-existing generated-API errors only — identical to main)apps/webtests: ✅ 173/173 acrossevent-parser.test.ts,stream-message-updaters.test.ts,reconcile.test.tsassistantmessage-complete-display-id.test.ts: ✅ 1/1assistantruntime-events-sse-parity.test.ts: ✅ 14/14assistantannotate-risk-options.test.ts: ✅ 4/4assistantconversation-queue.test.ts: ✅ 49/49 + 1 todoWhat's next (not in this PR)
This is part of a 3-PR sequence replacing the original #31994:
displayMessageIdfrom wire (this).finalizeMessageCompletesimplified to a role-based branch. Tail user → push new assistant bubble. Tail assistant → append to existing tail, keeptail.id.displayMessageIdparam renamed tomessageId(clean up the dissonance introduced here).startReconciliationLoop(epoch)fromhandleMessageComplete/handleGenerationHandoff. Reconcile only on initial load, conversation switch, and SSE reconnect (industry-consensus pattern). Kills the rerender shake.Replaces
#31994 — drafted as reference. That PR added state (
assistantTurnId,anchorContentWritten) which contradicted the workstream goal of slimming state, and bundled wire + impl + web changes too aggressively. This split lands the same end-state with zero state additions.