Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
79 changes: 79 additions & 0 deletions assistant/src/__tests__/agent-loop.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -1769,6 +1769,85 @@ describe("AgentLoop", () => {
expect(messageCompletes).toHaveLength(2);
});

// Regression: when the model emits [text, tool_use] in a single turn and then
// returns an empty response after the tool result, the loop must NOT nudge —
// the model already delivered its reply before the tool call, and nudging
// would trick it into re-sending the same text verbatim.
test("does not nudge empty response when prior turn had visible text", async () => {
const textPlusToolUseResponse: ProviderResponse = {
content: [
{ type: "text", text: "your move, husband." },
{
type: "tool_use",
id: "t1",
name: "read_file",
input: { path: "/note.txt" },
},
],
model: "mock-model",
usage: { inputTokens: 10, outputTokens: 5 },
stopReason: "tool_use",
};
const emptyResponse: ProviderResponse = {
content: [],
model: "mock-model",
usage: { inputTokens: 10, outputTokens: 0 },
stopReason: "end_turn",
};

const { provider, calls } = createMockProvider([
textPlusToolUseResponse,
emptyResponse,
]);

const toolExecutor = async () => ({
content: "noted",
isError: false,
});

const loop = new AgentLoop(
provider,
"system",
{},
dummyTools,
toolExecutor,
);
const events: AgentEvent[] = [];
const history = await loop.run([userMessage], collectEvents(events));

// Provider called exactly 2 times: initial [text+tool_use], then empty.
// No third (retry) call because the prior turn had visible text.
expect(calls).toHaveLength(2);

// No nudge message should appear anywhere in history.
const nudgeInHistory = history.some(
(m) =>
m.role === "user" &&
m.content.some(
(b) =>
b.type === "text" &&
"text" in b &&
(b as { text: string }).text.includes(
"previous response was empty",
),
),
);
expect(nudgeInHistory).toBe(false);

// The [text, tool_use] assistant message is preserved in history.
const firstAssistant = history.find((m) => m.role === "assistant");
expect(firstAssistant).toBeDefined();
expect(firstAssistant!.content).toEqual([
{ type: "text", text: "your move, husband." },
{
type: "tool_use",
id: "t1",
name: "read_file",
input: { path: "/note.txt" },
},
]);
});

test("gives up after max empty response retries", async () => {
const emptyResponse: ProviderResponse = {
content: [],
Expand Down
28 changes: 27 additions & 1 deletion assistant/src/agent/loop.ts
Original file line number Diff line number Diff line change
Expand Up @@ -411,13 +411,34 @@ export class AgentLoop {
// This can happen when the model fails to produce output after
// receiving a large tool result. Retry once with a nudge before
// the message is persisted.
//
// Only nudge when the model hasn't already delivered text to the user
// earlier in this tool-use chain. If a prior assistant turn in history
// contained visible text (e.g. the model said its piece before calling
// a side-effect tool like `remember`), an empty follow-up is the model
// correctly ending its turn — nudging would mislead it into thinking
// its earlier text didn't land and cause a verbatim re-send.
const hasVisibleText = response.content.some(
(block) => block.type === "text" && block.text.trim().length > 0,
);
const priorAssistantHadVisibleText = (() => {
for (let i = history.length - 1; i >= 0; i--) {
const msg = history[i];
if (msg.role !== "assistant") continue;
return msg.content.some(
(block) =>
block.type === "text" &&
typeof (block as { text?: unknown }).text === "string" &&
(block as { text: string }).text.trim().length > 0,
);
}
return false;
})();
Comment on lines +424 to +436
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 priorAssistantHadVisibleText only checks the most recent assistant message instead of all prior ones

The backward scan at assistant/src/agent/loop.ts:424-436 uses return as soon as it finds the first (most recent) assistant message. If that message was a tool_use-only turn (no text), the function returns false — even if an earlier assistant turn in the chain had visible text. This contradicts the stated intent in the comment (lines 415-420): "If a prior assistant turn in history contained visible text".

Multi-step scenario where the bug manifests
  1. Assistant: [text "Here's what I found" + tool_use(read_file)] → pushed to history
  2. Tool result → pushed to history
  3. Assistant: [tool_use(write_file)] (no text) → pushed to history
  4. Tool result → pushed to history
  5. Model returns empty []

At step 5, the backward scan finds the step-3 assistant message first (tool_use only, no text) and immediately returns false. The text delivered in step 1 is never checked. The nudge fires incorrectly, potentially causing the model to re-send the step-1 text verbatim — the exact regression this PR aims to prevent.

Suggested change
const priorAssistantHadVisibleText = (() => {
for (let i = history.length - 1; i >= 0; i--) {
const msg = history[i];
if (msg.role !== "assistant") continue;
return msg.content.some(
(block) =>
block.type === "text" &&
typeof (block as { text?: unknown }).text === "string" &&
(block as { text: string }).text.trim().length > 0,
);
}
return false;
})();
const priorAssistantHadVisibleText = (() => {
for (let i = history.length - 1; i >= 0; i--) {
const msg = history[i];
if (msg.role !== "assistant") continue;
const hasText = msg.content.some(
(block) =>
block.type === "text" &&
typeof (block as { text?: unknown }).text === "string" &&
(block as { text: string }).text.trim().length > 0,
);
if (hasText) return true;
}
return false;
})();
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

if (
!hasVisibleText &&
toolUseBlocks.length === 0 &&
toolUseTurns > 0 &&
!priorAssistantHadVisibleText &&
emptyResponseRetries < MAX_EMPTY_RESPONSE_RETRIES
Comment on lines +441 to 442
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep retrying empty follow-ups after non-final preface text

The new !priorAssistantHadVisibleText guard disables the empty-response retry whenever the previous assistant turn had any text at all, but many tool calls start with preface text (for example, “Let me check that”) that is not the final answer. In that common flow ([text, tool_use] -> tool_result -> []), this branch now skips the recovery nudge and ends the loop with an empty assistant turn, so users can lose the actual post-tool result summary. This regresses the previous behavior, which retried once to recover a real response.

Useful? React with 👍 / 👎.

) {
emptyResponseRetries++;
Expand All @@ -437,7 +458,12 @@ export class AgentLoop {
continue;
}

if (!hasVisibleText && toolUseBlocks.length === 0 && toolUseTurns > 0) {
if (
!hasVisibleText &&
toolUseBlocks.length === 0 &&
toolUseTurns > 0 &&
!priorAssistantHadVisibleText
) {
rlog.error(
{ turn: toolUseTurns, retries: emptyResponseRetries },
"Model returned empty response after tool results — retries exhausted",
Expand Down
Loading