fix(web): consolidate chat display-message hacks into sanitize pipeline#32009
Conversation
Centralizes three short-term UI-boundary workarounds into a single `sanitizeDisplayMessages` helper: 1. sortByTimestamp — orders messages by createdAt. 2. removeInvalidMessages — drops phantom/blank rows. 3. removeDuplicateTrailingAssistant — drops a duplicate trailing assistant row when stableIds differ (`server-...` then `assistant-...`) but textSegments + toolCalls match. New hack patching a recent regression where the final assistant message renders twice. Each sub-method carries a doc-comment explaining the upstream bug it patches and the condition under which it can be removed. Pure — never mutates input. Callers updated: chat-route-content and debug-api#tail both run the full pipeline so tail() keeps mirroring the UI. Migrated the existing isInvalidMessage tests from build-items.test.ts to the new sanitize-display-messages.test.ts.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eae85fadb6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| for (let i = 0; i < aTcs.length; i++) { | ||
| const aTc = aTcs[i]!; | ||
| const bTc = bTcs[i]!; | ||
| if (aTc.toolName !== bTc.toolName) return false; | ||
| if (aTc.result !== bTc.result) return false; |
There was a problem hiding this comment.
Tighten trailing-assistant dedupe to avoid dropping real updates
The dedupe predicate only compares each tool call by toolName and result, so two different assistant rows that carry the same tool/result but different rendered state (for example status: "running" → "completed", changed inputs, or other UI-relevant metadata) are treated as duplicates and the newer trailing row is removed. In those cases the transcript can show stale tool progress or omit the final state even though the backend emitted a real update; this is especially risky because this filter runs on every render before projection.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Ignore this feedback, this matcher is intentional
| * | ||
| * `messages` must already be sanitized — phantom/blank row filtering and | ||
| * trailing-duplicate drops happen upstream in `sanitizeDisplayMessages`. |
There was a problem hiding this comment.
Delete this comment
| [messages], | ||
| ); | ||
|
|
||
| const transcriptItems = useMemo( |
There was a problem hiding this comment.
What is the difference between transcriptItems and display messages? are they different data models? do we need both?
| * factory so it can be unit-tested without a `window`. | ||
| */ | ||
| export function createChatDebugApi(refs: ChatDebugRefs): ChatDebugApi { | ||
| function tail(limit: number = DEFAULT_TAIL_LIMIT): DisplayMessage[] { |
There was a problem hiding this comment.
This method is doing too much logic, and it should be just reading from the same ref that eventually is looped over the react components. As a first pass, we can have a ref that is post sanitization so we could read here until we figure out what the heck transcript items are
| // ----------------------------------------------------------------------------- | ||
| // sanitizeDisplayMessages — single home for "this shouldn't be necessary, | ||
| // but is" frontend cleanup applied to DisplayMessage[] before the transcript | ||
| // renders (and before `window._vellumDebug.chat.tail()` returns). |
There was a problem hiding this comment.
| // renders (and before `window._vellumDebug.chat.tail()` returns). | |
| // renders. |
| * | ||
| * Pipeline: | ||
| * 1. `sortByTimestamp` (Hack #1) | ||
| * 2. `removeInvalidMessages` (Hack #2) | ||
| * 3. `removeDuplicateTrailingAssistant` (Hack #3) |
There was a problem hiding this comment.
| * | |
| * Pipeline: | |
| * 1. `sortByTimestamp` (Hack #1) | |
| * 2. `removeInvalidMessages` (Hack #2) | |
| * 3. `removeDuplicateTrailingAssistant` (Hack #3) |
It's redundant with implementation haha
There was a problem hiding this comment.
In fact, we can do:
const pipeline = [
sortByTimestamp,
removeInvalidMessages,
removeDuplicateTrailingAssistant,
];
return pipeline.reduce((msgs, cb) => cb(msgs), messages);and that makes it literally the same as the comment
| // server-side so the client never sees the fragmented rows. | ||
| // ----------------------------------------------------------------------------- | ||
| function sortByTimestamp(messages: DisplayMessage[]): DisplayMessage[] { | ||
| return sortedByTimestamp(messages); |
There was a problem hiding this comment.
is sortedByTimestamp used anywhere else? I'm wondering if we should just inline here
| // - The daemon synthesises tool calls with `toolName === "unknown"` when a | ||
| // `tool_result` has no matching `tool_use`. Those arrive as empty user | ||
| // messages whose only payload is a list of "unknown" tools and would | ||
| // otherwise render as a confusing "Used unknown" chip (ATL-659). |
There was a problem hiding this comment.
| // - The daemon synthesises tool calls with `toolName === "unknown"` when a | |
| // `tool_result` has no matching `tool_use`. Those arrive as empty user | |
| // messages whose only payload is a list of "unknown" tools and would | |
| // otherwise render as a confusing "Used unknown" chip (ATL-659). | |
| // - The assistant synthesises tool calls with `toolName === "unknown"` when a | |
| // `tool_result` has no matching `tool_use`. Those arrive as empty user | |
| // messages whose only payload is a list of "unknown" tools and would | |
| // otherwise render as a confusing "Used unknown" chip. |
| // SHORT TERM until: the daemon stops emitting orphan `tool_result` rows and | ||
| // phantom unknown-tool placeholders at history boundaries. |
There was a problem hiding this comment.
| // SHORT TERM until: the daemon stops emitting orphan `tool_result` rows and | |
| // phantom unknown-tool placeholders at history boundaries. | |
| // SHORT TERM until: the assistant stops emitting orphan `tool_result` rows and | |
| // phantom unknown-tool placeholders at history boundaries. |
| for (let i = 0; i < aTcs.length; i++) { | ||
| const aTc = aTcs[i]!; | ||
| const bTc = bTcs[i]!; | ||
| if (aTc.toolName !== bTc.toolName) return false; | ||
| if (aTc.result !== bTc.result) return false; |
There was a problem hiding this comment.
Ignore this feedback, this matcher is intentional
|
Follow-up PR with every piece of review feedback applied: #32010.
Codex P1 on |
PR #32009 feedback follow-up: * sanitize-display-messages.ts - Trim header comment: drop the parenthetical about `window._vellumDebug.chat.tail()` (tail() no longer touches the pipeline directly — see debug-api change below). - Replace the redundant JSDoc + sequential lets in `sanitizeDisplayMessages` with a `pipeline.reduce` over the three named steps — the implementation now is the spec. - Drop the local `sortByTimestamp` wrapper; the imported `sortedByTimestamp` is the named pipeline step directly. - Update Hack #2 docblock + SHORT TERM line to use the codebase's 'assistant' terminology and drop the inline ticket reference. * debug-api.ts - `tail()` is now logic-free: it reads from a new `sanitizedMessagesRef` populated by the render path. DevTools mirrors the UI without re-running the sanitize pipeline. Drops the `sanitizeDisplayMessages` import inside debug-api entirely. * chat-page.tsx + chat-route-content.tsx - Own `sanitizedMessagesRef` in the composition root, thread it through `ChatRouteRefs`, populate it right after the `useMemo(() => sanitizeDisplayMessages(messages))`. Rename `sortedMessages` → `sanitizedMessages` to match what it actually is. * build-items.ts - Remove the comment claiming `messages` must already be sanitized. The function takes whatever it's given; sanitization isn't its contract. * debug-api.test.ts - Migrate the existing tail() tests to populate `sanitizedMessagesRef` (the new contract). Add a contract test that pins 'tail() reads from sanitizedMessagesRef, NOT raw messagesRef' so the two refs can't be silently swapped back. Co-authored-by: vellum-apollo-bot[bot] <242025090+vellum-apollo-bot[bot]@users.noreply.github.com>
…tTranscriptItems() (#32013) Follow-up to PR #32010 review feedback. Vargas chose Option C from the DisplayMessage vs TranscriptItem design conversation: keep both arrays visible side-by-side via the debug API before deciding which transcript items should be promoted to messages or whether wrapper kinds can be dropped. Changes ------- - Rename `chat.tail(n?)` → `chat.getClientMessages(n?)`. Same shape (`DisplayMessage[]`), same default limit, same source ref. The new name disambiguates "what the client thinks the message list is" from the server-side `serverMessages()` view. - Add `chat.getTranscriptItems(): TranscriptItem[]`. The full virtualized row list — messages plus the thinking indicator, pending prompts, queued marker, error notices, and the onboarding-choice row. Logic-free reader of a new `transcriptItemsRef` populated by `chat-route-content.tsx` right after `buildTranscriptItems`. - Migrate the one in-tree consumer: `share-feedback-modal.tsx` now attaches `clientMessages` + `transcriptItems` to the triage tar. Review-feedback resolutions on PR #32010 ---------------------------------------- - chat-route-content.tsx:226 — deleted `sanitizedMessagesRef` JSDoc. - chat-route-content.tsx:797 — deleted "Mirror into a ref so …" comment. - chat-route-content.tsx:800 — added `transcriptItemsRef` to `ChatRouteRefs`, populated in the same spot as `sanitizedMessagesRef`. - chat-page.tsx:254 — deleted the JSDoc above `sanitizedMessagesRef`. - chat-page.tsx:250 ("Do we need messagesRef anymore?") — kept. `messagesRef` is consumed by 5+ stream/send/secondary-action hooks (`use-send-message`, `use-stream-event-handler`, `use-conversation-secondary-actions`, `interaction-handlers`, `surface-handlers`) and a contract-pin test asserts the debug API reads sanitized, not raw. Migrating those consumers is its own follow-up. - sanitize-display-messages.ts:46 — restored the local `sortByTimestamp` wrapper. `reconcileMessages` still imports `sortedByTimestamp` directly; the wrapper keeps this file's pipeline self-contained until the reconcile cleanup lands. Tests ----- - 64 tests pass across `debug-api.test.ts` + `sanitize-display-messages.test.ts`. - New `createChatDebugApi.getTranscriptItems` describe block covers: empty case, reference-identity (`expect(result).toBe(items)`), full discriminated-union surface, and a contract pin asserting the impl reads from `transcriptItemsRef` (NOT derived from `sanitizedMessagesRef`). - Renamed contract pin: `"reads from sanitizedMessagesRef, NOT raw messagesRef"` still applies to `getClientMessages`. - Help-text test now asserts both `.getClientMessages(` and `.getTranscriptItems(` appear. Verification ------------ - `bun test`: 64/64 pass. - `bun run lint`: clean on all 6 touched files. - `bun run audit:cross-domain:check`: clean. - `bun run typecheck`: 10 errors, identical set to `origin/main` baseline (settings/ai, platform_actor_token, provisioned_storage_gib — all in untouched domains). Refs: #32010, #32009 Co-authored-by: vellum-apollo-bot[bot] <220448527+vellum-apollo-bot[bot]@users.noreply.github.com>
Bandaid PR. Adds a third short-term workaround for a UI duplicate-render bug and pulls all three workarounds into one place so the next person (or root-fix PR) only has to delete one helper.
What's new — hack #3
When the assistant stream produces a
server-...stableId final message immediately followed by anassistant-...stableId final message with identical content, the UI was rendering both.removeDuplicateTrailingAssistantdrops the trailing one when:role: "assistant"textSegmentsmatch position-for-position ontype+contenttoolCallsmatch position-for-position ontoolName+resultThe deeper backend fix is being handled separately — this PR is the visual patch.
Consolidation
New helper
sanitizeDisplayMessagesapplied at the UI boundary, composed of three labeled sub-methods, each with a doc-block naming the upstream bug it patches and the condition under which it can be removed:sortByTimestamp— server can deliver messages out of order.removeInvalidMessages— phantom/blank rows during streaming.removeDuplicateTrailingAssistant— duplicate trailing assistant render (above).Pure; never mutates input.
Callers
chat-route-content— runs the pipeline before passing messages tobuild-items.debug-api#tail()— also runs the pipeline sotail()keeps mirroring what the UI renders.build-items— its in-fileisInvalidMessagehelper was removed; comment now documents that callers must pre-sanitize.Tests
sanitize-display-messages.test.ts— 24 tests covering pass-through, no-mutation, sort isolation, blank/phantom drop, every negative case for the dup-trail predicate (different segments, different tool names, different results, length mismatch, single-assistant, three-in-a-row → only last drops, single-message, empty), and a three-hack integration test.build-items.test.ts— removed the four migratedisInvalidMessagecases; updated section headers to point at the new file for negative coverage. Kept the pass-through assertions.bun testgreen on touched files,bunx tsc --noEmitclean,bun run lintclean,bun run audit:cross-domain:checkclean.