fix(acp): avoid duplicate thought chunks around tool calls #8215
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e11715321c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```rust
MessageContent::Thinking(_) | MessageContent::RedactedThinking(_)
    if has_tool_requests => {}
```
Preserve thought content for non-streaming tool responses
This match arm unconditionally drops Thinking/RedactedThinking whenever the response contains any tool request, but some providers emit a single combined assistant message (no prior incremental thought chunks) when streaming is disabled (for example, the non-streaming OpenAI paths return stream_from_single_message). In that case, this change removes the only reasoning content the client would receive for that turn, so ACP/CLI users lose thought output entirely around tool calls instead of just deduplicating replayed chunks.
Good point. One proposed alternative is:
keep a per-turn flag in the agent stream loop around crates/goose/src/agents/agent.rs:1238, something like "have we already surfaced thinking to the client for this assistant turn?". Then pass that into categorize_tool_requests and only suppress Thinking/RedactedThinking on tool-call messages when that flag is already true. After yielding a filtered message, if it still contains thinking, set the flag. That handles both cases:
- Streaming replay bug: an earlier thought chunk was already emitted, so the later tool-call replay gets suppressed.
- Non-streaming or no-thought-stream providers: no earlier thought was emitted, so the combined tool-call message keeps its thinking.
Also looked at supports_streaming as a flag, but that doesn't seem reliable. Codex points out that the OpenAI chat-completions streaming path does emit incremental reasoning and then can replay the accumulated reasoning on the tool-call chunk in crates/goose/src/providers/formats/openai.rs:741 and crates/goose/src/providers/formats/openai.rs:806. But the Responses API streaming path does not emit thought chunks incrementally; it only materializes final reasoning from output items in crates/goose/src/providers/formats/openai_responses.rs:35 and crates/goose/src/providers/formats/openai_responses.rs:790. So a provider can be "streaming" and still need the final thinking preserved. In other words, a provider may stream thought chunks in one code path and not in the other even if the overall provider supports streaming, so the only reliable signal is to explicitly track whether thought chunks were actually streamed for this turn.
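The per-turn flag described above can be sketched as follows. This is a simplified illustration, not the PR's actual code: `MessageContent` here is a stand-in enum with only the relevant variants, and `filter_for_client` stands in for the real filtering inside `categorize_tool_requests`.

```rust
// Simplified stand-in for goose's MessageContent; only the variants
// relevant to this discussion are modeled.
#[derive(Clone, Debug, PartialEq)]
enum MessageContent {
    Thinking(String),
    ToolRequest(String),
}

// Suppress Thinking content on tool-call messages only when thinking was
// already surfaced to the client earlier in this assistant turn.
fn filter_for_client(
    contents: Vec<MessageContent>,
    thinking_already_surfaced: &mut bool,
) -> Vec<MessageContent> {
    let has_tool_requests = contents
        .iter()
        .any(|c| matches!(c, MessageContent::ToolRequest(_)));
    let filtered: Vec<MessageContent> = contents
        .into_iter()
        .filter(|c| {
            let is_thinking = matches!(c, MessageContent::Thinking(_));
            // Drop thinking on a tool-call message only if the client
            // has already seen thinking this turn (i.e. it is a replay).
            !(is_thinking && has_tool_requests && *thinking_already_surfaced)
        })
        .collect();
    // If the emitted message still carries thinking, mark it as surfaced.
    if filtered
        .iter()
        .any(|c| matches!(c, MessageContent::Thinking(_)))
    {
        *thinking_already_surfaced = true;
    }
    filtered
}
```

In the streaming case, the first thought chunk sets the flag and the replay on the later tool-call message is dropped; in the non-streaming case the flag is still false when the combined message arrives, so its thinking survives.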
|
Hey @kevklam — the Codex P2 about unconditionally stripping thinking content is a real concern. Your proposed per-turn flag approach (track whether thinking was already surfaced to the client, and only suppress the replay when it was) sounds like the right fix. Could you push an update implementing that? We'll check back tomorrow. Thanks!
Signed-off-by: Kevin Lam <kevklam@users.noreply.github.com>
Force-pushed e117153 to 12d342f
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 12d342f1a0
```rust
MessageContent::Thinking(_) | MessageContent::RedactedThinking(_)
    if should_suppress_replayed_thinking => {}
```
Preserve new thought deltas in tool-call responses
This branch drops all Thinking/RedactedThinking content whenever should_suppress_replayed_thinking is true, but streamed providers can append fresh reasoning while tool-call chunks are still being assembled (e.g., OpenAI streaming accumulates delta.reasoning_content in response_to_streaming_message before emitting the final tool-call message). In that case, the final tool-call message contains both replayed and newly-added reasoning, and this unconditional drop removes the new suffix entirely from client-visible output instead of only deduplicating previously surfaced content.
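One way to handle this mixed replayed-plus-fresh case (a sketch under the assumption, noted in the review, that providers replay previously streamed reasoning as a prefix of the final accumulated string; this is not the PR's actual implementation) is to deduplicate by prefix instead of dropping the content wholesale:

```rust
// Return only the portion of `final_thinking` the client has not yet seen,
// assuming any replayed reasoning appears as a prefix of the final string.
fn new_thinking_suffix<'a>(
    final_thinking: &'a str,
    already_streamed: &str,
) -> Option<&'a str> {
    match final_thinking.strip_prefix(already_streamed) {
        Some("") => None,             // pure replay: nothing new to emit
        Some(suffix) => Some(suffix), // replay plus fresh deltas: emit the tail
        None => Some(final_thinking), // no overlap detected: surface everything
    }
}
```

This keeps the deduplication behavior for pure replays while still surfacing any reasoning appended after the last streamed delta.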
DOsinga left a comment
Approving — this is a clean, well-scoped bug fix for duplicate thought chunks in ACP streaming. The per-turn tracking flag correctly handles both streaming providers (where reasoning is replayed on tool-call messages) and non-streaming providers (where the tool-call message is the only source of reasoning). Tests are meaningful and cover both cases. Nice work!
Summary
Avoid replaying already-streamed thinking content in the user-visible filtered message when an assistant response also contains tool calls.
Root Cause
Some providers stream reasoning incrementally and then include the full accumulated thinking again on the assistant message that carries tool calls. Goose was surfacing both:
Over ACP this showed up as duplicate `agent_thought_chunk` notifications.
What Changed
When a response contains tool requests, `categorize_tool_requests` now keeps thinking content on the original response for provider/state compatibility, but omits it from the user-visible filtered message that gets emitted to the client.
Impact
ACP clients should stop seeing duplicate thought chunks around tool-call boundaries while Goose still preserves reasoning content where providers need it internally.
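The split described under "What Changed" can be illustrated with a minimal sketch (simplified stand-in types; `split_for_client` is a hypothetical name, not the real `categorize_tool_requests` signature): the original message retains thinking for provider/state replay, while the client-visible copy drops it when tool requests are present.

```rust
#[derive(Clone, Debug, PartialEq)]
enum MessageContent {
    Thinking(String),
    ToolRequest(String),
}

// Split a response into (original, client_visible): the original keeps
// thinking for provider/state compatibility, while the client-visible
// copy omits it whenever the message also carries tool requests.
fn split_for_client(
    contents: Vec<MessageContent>,
) -> (Vec<MessageContent>, Vec<MessageContent>) {
    let has_tool_requests = contents
        .iter()
        .any(|c| matches!(c, MessageContent::ToolRequest(_)));
    let visible = contents
        .iter()
        .filter(|c| !(has_tool_requests && matches!(c, MessageContent::Thinking(_))))
        .cloned()
        .collect();
    (contents, visible)
}
```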
Validation
Built the CLI target only:
```shell
cargo build -p goose-cli --bin goose -j 1
```