Skip to content

fix(web): consolidate chat display-message hacks into sanitize pipeline#32009

Merged
dvargasfuertes merged 1 commit into
mainfrom
apollo/sanitize-display-messages
May 25, 2026
Merged

fix(web): consolidate chat display-message hacks into sanitize pipeline#32009
dvargasfuertes merged 1 commit into
mainfrom
apollo/sanitize-display-messages

Conversation

@vellum-apollo-bot
Copy link
Copy Markdown
Contributor

Bandaid PR. Adds a third short-term workaround for a UI duplicate-render bug and pulls all three workarounds into one place so the next person (or root-fix PR) only has to delete one helper.

What's new — hack #3

When the assistant stream produces a server-... stableId final message immediately followed by an assistant-... stableId final message with identical content, the UI was rendering both. removeDuplicateTrailingAssistant drops the trailing one when:

  • last two messages are both role: "assistant"
  • trailing row has substantive content (textSegments or toolCalls non-empty)
  • textSegments match position-for-position on type + content
  • toolCalls match position-for-position on toolName + result

The deeper backend fix is being handled separately — this PR is the visual patch.

Consolidation

New helper sanitizeDisplayMessages applied at the UI boundary, composed of three labeled sub-methods, each with a doc-block naming the upstream bug it patches and the condition under which it can be removed:

  1. sortByTimestamp — server can deliver messages out of order.
  2. removeInvalidMessages — phantom/blank rows during streaming.
  3. removeDuplicateTrailingAssistant — duplicate trailing assistant render (above).

Pure; never mutates input.

Callers

  • chat-route-content — runs the pipeline before passing messages to build-items.
  • debug-api#tail() — also runs the pipeline so tail() keeps mirroring what the UI renders.
  • build-items — its in-file isInvalidMessage helper was removed; comment now documents that callers must pre-sanitize.

Tests

  • New sanitize-display-messages.test.ts — 24 tests covering pass-through, no-mutation, sort isolation, blank/phantom drop, every negative case for the dup-trail predicate (different segments, different tool names, different results, length mismatch, single-assistant, three-in-a-row → only last drops, single-message, empty), and a three-hack integration test.
  • build-items.test.ts — removed the four migrated isInvalidMessage cases; updated section headers to point at the new file for negative coverage. Kept the pass-through assertions.
  • Locally: bun test green on touched files, bunx tsc --noEmit clean, bun run lint clean, bun run audit:cross-domain:check clean.

Centralizes three short-term UI-boundary workarounds into a single
`sanitizeDisplayMessages` helper:

1. sortByTimestamp — orders messages by createdAt.
2. removeInvalidMessages — drops phantom/blank rows.
3. removeDuplicateTrailingAssistant — drops a duplicate trailing
   assistant row when stableIds differ (`server-...` then
   `assistant-...`) but textSegments + toolCalls match. New hack
   patching a recent regression where the final assistant message
   renders twice.

Each sub-method carries a doc-comment explaining the upstream bug it
patches and the condition under which it can be removed. Pure — never
mutates input.

Callers updated: chat-route-content and debug-api#tail both run the
full pipeline so tail() keeps mirroring the UI. Migrated the existing
isInvalidMessage tests from build-items.test.ts to the new
sanitize-display-messages.test.ts.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eae85fadb6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +177 to +181
for (let i = 0; i < aTcs.length; i++) {
const aTc = aTcs[i]!;
const bTc = bTcs[i]!;
if (aTc.toolName !== bTc.toolName) return false;
if (aTc.result !== bTc.result) return false;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Tighten trailing-assistant dedupe to avoid dropping real updates

The dedupe predicate only compares each tool call by toolName and result, so two different assistant rows that carry the same tool/result but different rendered state (for example status: "running""completed", changed inputs, or other UI-relevant metadata) are treated as duplicates and the newer trailing row is removed. In those cases the transcript can show stale tool progress or omit the final state even though the backend emitted a real update; this is especially risky because this filter runs on every render before projection.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore this feedback, this matcher is intentional

Comment on lines +56 to +58
*
* `messages` must already be sanitized — phantom/blank row filtering and
* trailing-duplicate drops happen upstream in `sanitizeDisplayMessages`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this comment

[messages],
);

const transcriptItems = useMemo(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between transcriptItems and display messages? are they different data models? do we need both?

* factory so it can be unit-tested without a `window`.
*/
export function createChatDebugApi(refs: ChatDebugRefs): ChatDebugApi {
function tail(limit: number = DEFAULT_TAIL_LIMIT): DisplayMessage[] {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is doing too much logic, and it should be just reading from the same ref that eventually is looped over the react components. As a first pass, we can have a ref that is post sanitization so we could read here until we figure out what the heck transcript items are

// -----------------------------------------------------------------------------
// sanitizeDisplayMessages — single home for "this shouldn't be necessary,
// but is" frontend cleanup applied to DisplayMessage[] before the transcript
// renders (and before `window._vellumDebug.chat.tail()` returns).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// renders (and before `window._vellumDebug.chat.tail()` returns).
// renders.

Comment on lines +19 to +23
*
* Pipeline:
* 1. `sortByTimestamp` (Hack #1)
* 2. `removeInvalidMessages` (Hack #2)
* 3. `removeDuplicateTrailingAssistant` (Hack #3)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*
* Pipeline:
* 1. `sortByTimestamp` (Hack #1)
* 2. `removeInvalidMessages` (Hack #2)
* 3. `removeDuplicateTrailingAssistant` (Hack #3)

It's redundant with implementation haha

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, we can do:

const pipeline = [
    sortByTimestamp,
    removeInvalidMessages,
    removeDuplicateTrailingAssistant,
];
return pipeline.reduce((msgs, cb) => cb(msgs), messages);

and that makes it literally the same as the comment

// server-side so the client never sees the fragmented rows.
// -----------------------------------------------------------------------------
function sortByTimestamp(messages: DisplayMessage[]): DisplayMessage[] {
return sortedByTimestamp(messages);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is sortedByTimestamp used anywhere else? I'm wondering if we should just inline here

Comment on lines +69 to +72
// - The daemon synthesises tool calls with `toolName === "unknown"` when a
// `tool_result` has no matching `tool_use`. Those arrive as empty user
// messages whose only payload is a list of "unknown" tools and would
// otherwise render as a confusing "Used unknown" chip (ATL-659).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// - The daemon synthesises tool calls with `toolName === "unknown"` when a
// `tool_result` has no matching `tool_use`. Those arrive as empty user
// messages whose only payload is a list of "unknown" tools and would
// otherwise render as a confusing "Used unknown" chip (ATL-659).
// - The assistant synthesises tool calls with `toolName === "unknown"` when a
// `tool_result` has no matching `tool_use`. Those arrive as empty user
// messages whose only payload is a list of "unknown" tools and would
// otherwise render as a confusing "Used unknown" chip.

Comment on lines +74 to +75
// SHORT TERM until: the daemon stops emitting orphan `tool_result` rows and
// phantom unknown-tool placeholders at history boundaries.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// SHORT TERM until: the daemon stops emitting orphan `tool_result` rows and
// phantom unknown-tool placeholders at history boundaries.
// SHORT TERM until: the assistant stops emitting orphan `tool_result` rows and
// phantom unknown-tool placeholders at history boundaries.

Comment on lines +177 to +181
for (let i = 0; i < aTcs.length; i++) {
const aTc = aTcs[i]!;
const bTc = bTcs[i]!;
if (aTc.toolName !== bTc.toolName) return false;
if (aTc.result !== bTc.result) return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore this feedback, this matcher is intentional

@dvargasfuertes dvargasfuertes merged commit 7e59eef into main May 25, 2026
7 checks passed
@dvargasfuertes dvargasfuertes deleted the apollo/sanitize-display-messages branch May 25, 2026 15:30
@vellum-apollo-bot
Copy link
Copy Markdown
Contributor Author

Follow-up PR with every piece of review feedback applied: #32010.

Comment Resolution in #32010
build-items.ts:58 — delete this comment Dropped the "must already be sanitized" line.
sanitize-display-messages.ts:4// renders. Trimmed the parenthetical (tail() no longer participates in the pipeline).
sanitize-display-messages.ts:23 — JSDoc redundant + use pipeline.reduce Adopted the pipeline.reduce pattern verbatim; dropped the JSDoc.
sanitize-display-messages.ts:57 — inline sortByTimestamp Dropped the local wrapper; sortedByTimestamp from message-sorting.ts is the pipeline step directly.
sanitize-display-messages.ts:72, 75 — docblock + SHORT TERM Applied both rewrites (assistant/daemon terminology, dropped inline ticket ref).
chat-route-content.tsx:788 — transcriptItems vs displayMessages Responded in chat — short version: they're different layers (logical conversation model vs virtualized render union); the duplication eliminated here was between two computations of "sanitized DisplayMessage[]", not between the two types themselves.
debug-api.ts:378 — read from a post-sanitization ref Done. New sanitizedMessagesRef owned by chat-page.tsx, populated by chat-route-content.tsx right after the render-boundary useMemo. tail() is now a one-liner reading from it; the sanitizeDisplayMessages import is gone from debug-api.ts. New contract test (reads from sanitizedMessagesRef, NOT raw messagesRef) pins the behaviour.

Codex P1 on line 181 — acknowledged your reply; matcher unchanged.

dvargasfuertes pushed a commit that referenced this pull request May 25, 2026
PR #32009 feedback follow-up:

* sanitize-display-messages.ts
  - Trim header comment: drop the parenthetical about
    `window._vellumDebug.chat.tail()` (tail() no longer touches the
    pipeline directly — see debug-api change below).
  - Replace the redundant JSDoc + sequential lets in
    `sanitizeDisplayMessages` with a `pipeline.reduce` over the three
    named steps — the implementation now is the spec.
  - Drop the local `sortByTimestamp` wrapper; the imported
    `sortedByTimestamp` is the named pipeline step directly.
  - Update Hack #2 docblock + SHORT TERM line to use the codebase's
    'assistant' terminology and drop the inline ticket reference.

* debug-api.ts
  - `tail()` is now logic-free: it reads from a new
    `sanitizedMessagesRef` populated by the render path. DevTools
    mirrors the UI without re-running the sanitize pipeline. Drops the
    `sanitizeDisplayMessages` import inside debug-api entirely.

* chat-page.tsx + chat-route-content.tsx
  - Own `sanitizedMessagesRef` in the composition root, thread it
    through `ChatRouteRefs`, populate it right after the
    `useMemo(() => sanitizeDisplayMessages(messages))`. Rename
    `sortedMessages` → `sanitizedMessages` to match what it actually
    is.

* build-items.ts
  - Remove the comment claiming `messages` must already be sanitized.
    The function takes whatever it's given; sanitization isn't its
    contract.

* debug-api.test.ts
  - Migrate the existing tail() tests to populate
    `sanitizedMessagesRef` (the new contract). Add a contract test
    that pins 'tail() reads from sanitizedMessagesRef, NOT raw
    messagesRef' so the two refs can't be silently swapped back.

Co-authored-by: vellum-apollo-bot[bot] <242025090+vellum-apollo-bot[bot]@users.noreply.github.com>
dvargasfuertes pushed a commit that referenced this pull request May 25, 2026
…tTranscriptItems() (#32013)

Follow-up to PR #32010 review feedback. Vargas chose Option C from the
DisplayMessage vs TranscriptItem design conversation: keep both arrays
visible side-by-side via the debug API before deciding which transcript
items should be promoted to messages or whether wrapper kinds can be
dropped.

Changes
-------
- Rename `chat.tail(n?)` → `chat.getClientMessages(n?)`. Same shape
  (`DisplayMessage[]`), same default limit, same source ref. The new
  name disambiguates "what the client thinks the message list is" from
  the server-side `serverMessages()` view.
- Add `chat.getTranscriptItems(): TranscriptItem[]`. The full
  virtualized row list — messages plus the thinking indicator, pending
  prompts, queued marker, error notices, and the onboarding-choice
  row. Logic-free reader of a new `transcriptItemsRef` populated by
  `chat-route-content.tsx` right after `buildTranscriptItems`.
- Migrate the one in-tree consumer: `share-feedback-modal.tsx` now
  attaches `clientMessages` + `transcriptItems` to the triage tar.

Review-feedback resolutions on PR #32010
----------------------------------------
- chat-route-content.tsx:226 — deleted `sanitizedMessagesRef` JSDoc.
- chat-route-content.tsx:797 — deleted "Mirror into a ref so …" comment.
- chat-route-content.tsx:800 — added `transcriptItemsRef` to
  `ChatRouteRefs`, populated in the same spot as `sanitizedMessagesRef`.
- chat-page.tsx:254 — deleted the JSDoc above `sanitizedMessagesRef`.
- chat-page.tsx:250 ("Do we need messagesRef anymore?") — kept.
  `messagesRef` is consumed by 5+ stream/send/secondary-action hooks
  (`use-send-message`, `use-stream-event-handler`,
  `use-conversation-secondary-actions`, `interaction-handlers`,
  `surface-handlers`) and a contract-pin test asserts the debug API
  reads sanitized, not raw. Migrating those consumers is its own
  follow-up.
- sanitize-display-messages.ts:46 — restored the local
  `sortByTimestamp` wrapper. `reconcileMessages` still imports
  `sortedByTimestamp` directly; the wrapper keeps this file's pipeline
  self-contained until the reconcile cleanup lands.

Tests
-----
- 64 tests pass across `debug-api.test.ts` + `sanitize-display-messages.test.ts`.
- New `createChatDebugApi.getTranscriptItems` describe block covers:
  empty case, reference-identity (`expect(result).toBe(items)`), full
  discriminated-union surface, and a contract pin asserting the impl
  reads from `transcriptItemsRef` (NOT derived from `sanitizedMessagesRef`).
- Renamed contract pin: `"reads from sanitizedMessagesRef, NOT raw
  messagesRef"` still applies to `getClientMessages`.
- Help-text test now asserts both `.getClientMessages(` and
  `.getTranscriptItems(` appear.

Verification
------------
- `bun test`: 64/64 pass.
- `bun run lint`: clean on all 6 touched files.
- `bun run audit:cross-domain:check`: clean.
- `bun run typecheck`: 10 errors, identical set to `origin/main`
  baseline (settings/ai, platform_actor_token, provisioned_storage_gib
  — all in untouched domains).

Refs: #32010, #32009

Co-authored-by: vellum-apollo-bot[bot] <220448527+vellum-apollo-bot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant