fix(acp): avoid duplicate thought chunks around tool calls #8215

Merged
DOsinga merged 2 commits into aaif-goose:main from kevklam:feature/acp-thinking-dedupe on Apr 2, 2026

Conversation

@kevklam (Contributor) commented Mar 30, 2026

Summary

Avoid replaying already-streamed thinking content in the user-visible filtered message when an assistant response also contains tool calls.

Root Cause

Some providers stream reasoning incrementally and then include the full accumulated thinking again on the assistant message that carries tool calls. Goose was surfacing both:

  • the incremental thought chunks as they arrived
  • the replayed accumulated thinking from the tool-call response

Over ACP this showed up as duplicate agent_thought_chunk notifications.

What Changed

When a response contains tool requests, categorize_tool_requests now keeps thinking content on the original response for provider/state compatibility, but omits it from the user-visible filtered message that gets emitted to the client.
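
To make the change concrete, here is a minimal, self-contained sketch of the filtering step. The types below are simplified stand-ins, not the actual `MessageContent` enum or `categorize_tool_requests` signature from the goose codebase:

```rust
// Simplified stand-in for the real MessageContent enum (illustrative only).
#[derive(Clone, Debug, PartialEq)]
enum MessageContent {
    Text(String),
    Thinking(String),
    RedactedThinking(Vec<u8>),
    ToolRequest(String),
}

/// Build the user-visible filtered message: when the response carries tool
/// requests, drop Thinking/RedactedThinking (it was already streamed
/// incrementally), while the caller keeps the original `contents` untouched
/// for provider/state compatibility.
fn filtered_for_client(contents: &[MessageContent]) -> Vec<MessageContent> {
    let has_tool_requests = contents
        .iter()
        .any(|c| matches!(c, MessageContent::ToolRequest(_)));
    contents
        .iter()
        .filter(|c| {
            !(has_tool_requests
                && matches!(
                    c,
                    MessageContent::Thinking(_) | MessageContent::RedactedThinking(_)
                ))
        })
        .cloned()
        .collect()
}
```

The key property is that the predicate only fires when tool requests are present, so thinking content on tool-free responses passes through unchanged.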

Impact

ACP clients should stop seeing duplicate thought chunks around tool-call boundaries while Goose still preserves reasoning content where providers need it internally.

Validation

Built the CLI target only:

  • cargo build -p goose-cli --bin goose -j 1

@kevklam kevklam marked this pull request as ready for review March 31, 2026 03:51

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e11715321c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread: crates/goose/src/agents/reply_parts.rs (outdated), lines +399 to +400
MessageContent::Thinking(_) | MessageContent::RedactedThinking(_)
if has_tool_requests => {}

P2: Preserve thought content for non-streaming tool responses

This match arm unconditionally drops Thinking/RedactedThinking whenever the response contains any tool request, but some providers emit a single combined assistant message (no prior incremental thought chunks) when streaming is disabled (for example, the non-streaming OpenAI paths return stream_from_single_message). In that case, this change removes the only reasoning content the client would receive for that turn, so ACP/CLI users lose thought output entirely around tool calls instead of just deduplicating replayed chunks.


@kevklam (Contributor, Author) replied:

Good point. One proposed alternative:

Keep a per-turn flag in the agent stream loop around crates/goose/src/agents/agent.rs:1238, something like "have we already surfaced thinking to the client for this assistant turn?". Pass that flag into categorize_tool_requests and only suppress Thinking/RedactedThinking on tool-call messages when the flag is already true. After yielding a filtered message, if it still contains thinking, set the flag. That handles both cases:

  • Streaming replay bug: an earlier thought chunk was already emitted, so the later tool-call replay gets suppressed.
  • Non-streaming or no-thought-stream providers: no earlier thought was emitted, so the combined tool-call message keeps its thinking.

Also looked at 'supports_streaming' as a flag, but that doesn't seem reliable. Codex points out that the OpenAI chat-completions streaming path does emit incremental reasoning and then can replay the accumulated reasoning on the tool-call chunk (crates/goose/src/providers/formats/openai.rs:741 and openai.rs:806), while the Responses API streaming path does not emit thought chunks incrementally; it only materializes final reasoning from output items (crates/goose/src/providers/formats/openai_responses.rs:35 and openai_responses.rs:790). So a provider can be "streaming" and still need the final thinking preserved.

In other words, a provider may stream thought chunks in one code path and not in the other even if the provider "supports streaming" overall, so the only reliable signal is to explicitly track whether thought chunks were actually streamed this turn.
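
The flag-based variant described above can be sketched as follows. Again, the types and function names here are illustrative stand-ins; the real change would thread the flag through the agent stream loop and `categorize_tool_requests`:

```rust
// Illustrative stand-in types, not the actual goose API.
#[derive(Clone, Debug, PartialEq)]
enum Content {
    Text(String),
    Thinking(String),
    ToolRequest(String),
}

/// Suppress replayed thinking only when this assistant turn has already
/// surfaced thinking to the client. Returns the filtered contents plus the
/// updated "surfaced" flag, which the stream loop would carry across chunks
/// of a single turn.
fn filter_turn(contents: &[Content], already_surfaced: bool) -> (Vec<Content>, bool) {
    let has_tool_requests = contents
        .iter()
        .any(|c| matches!(c, Content::ToolRequest(_)));
    let suppress = has_tool_requests && already_surfaced;
    let filtered: Vec<Content> = contents
        .iter()
        .filter(|c| !(suppress && matches!(c, Content::Thinking(_))))
        .cloned()
        .collect();
    let surfaced_now =
        already_surfaced || filtered.iter().any(|c| matches!(c, Content::Thinking(_)));
    (filtered, surfaced_now)
}
```

With this shape, the streaming-replay case suppresses the duplicate (the flag is already true by the time the tool-call message arrives), while a non-streaming combined message keeps its thinking because nothing was surfaced earlier in the turn.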

@DOsinga (Collaborator) commented Apr 2, 2026

Hey @kevklam — the Codex P2 about unconditionally stripping thinking content is a real concern. Your proposed per-turn flag approach (track whether thinking was already surfaced to the client, and only suppress the replay when it was) sounds like the right fix. Could you push an update implementing that? We'll check back tomorrow. Thanks!

kevklam added 2 commits April 1, 2026 22:52
Signed-off-by: Kevin Lam <kevklam@users.noreply.github.com>
Signed-off-by: Kevin Lam <kevklam@users.noreply.github.com>
@kevklam kevklam force-pushed the feature/acp-thinking-dedupe branch from e117153 to 12d342f on April 2, 2026 05:52

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12d342f1a0


Comment on lines +402 to +403
MessageContent::Thinking(_) | MessageContent::RedactedThinking(_)
if should_suppress_replayed_thinking => {}

P2: Preserve new thought deltas in tool-call responses

This branch drops all Thinking/RedactedThinking content whenever should_suppress_replayed_thinking is true, but streamed providers can append fresh reasoning while tool-call chunks are still being assembled (e.g., OpenAI streaming accumulates delta.reasoning_content in response_to_streaming_message before emitting the final tool-call message). In that case, the final tool-call message contains both replayed and newly-added reasoning, and this unconditional drop removes the new suffix entirely from client-visible output instead of only deduplicating previously surfaced content.
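
One way to address this concern, sketched here as an assumption rather than anything implemented in this PR, is prefix-deduplication: compare the replayed reasoning text against what was already surfaced this turn and emit only the newly appended suffix:

```rust
/// Strip only the already-surfaced prefix from a replayed reasoning string.
/// Returns None when the replay adds nothing new, Some(new text) otherwise.
/// (Hypothetical helper, not part of the goose codebase.)
fn dedupe_thinking(replayed: &str, already_surfaced: &str) -> Option<String> {
    match replayed.strip_prefix(already_surfaced) {
        // Pure replay of previously surfaced content: nothing new to show.
        Some("") => None,
        // Replay plus a fresh delta: surface only the new suffix.
        Some(suffix) => Some(suffix.to_string()),
        // No overlap with what was surfaced: show the whole thing.
        None => Some(replayed.to_string()),
    }
}
```

This would preserve fresh reasoning appended during tool-call assembly while still eliminating the duplicated portion, at the cost of tracking the surfaced text rather than a boolean flag.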


@DOsinga (Collaborator) left a comment

Approving — this is a clean, well-scoped bug fix for duplicate thought chunks in ACP streaming. The per-turn tracking flag correctly handles both streaming providers (where reasoning is replayed on tool-call messages) and non-streaming providers (where the tool-call message is the only source of reasoning). Tests are meaningful and cover both cases. Nice work!

@DOsinga DOsinga added this pull request to the merge queue Apr 2, 2026
Merged via the queue into aaif-goose:main with commit 1aa07bb Apr 2, 2026
22 checks passed