Skip to content

fix: preserve prior-turn analysis messages in Harmony multi-turn conversations#35826

Closed
weiguangli-io wants to merge 2 commits intovllm-project:mainfrom
weiguangli-io:codex/vllm-35779-harmony-preserve-prior-analysis
Closed

fix: preserve prior-turn analysis messages in Harmony multi-turn conversations#35826
weiguangli-io wants to merge 2 commits intovllm-project:mainfrom
weiguangli-io:codex/vllm-35779-harmony-preserve-prior-analysis

Conversation

@weiguangli-io
Copy link
Copy Markdown
Contributor

Summary

  • Fix auto_drop_analysis_messages to only drop analysis messages within the current turn (after the last user message), preserving prior-turn reasoning context
  • The old implementation dropped ALL analysis messages before the last assistant final message, which incorrectly discarded reasoning_content from prior conversation turns

Root Cause

When a client sends a multi-turn conversation with reasoning_content on prior assistant messages:

[
  {"role": "user", "content": "Pick a secret number..."},
  {"role": "assistant", "content": "OK, I picked.", "reasoning_content": "I will pick 742."},
  {"role": "user", "content": "What is the secret number?"}
]

The analysis message from the first turn (containing "742") was dropped because the old code dropped ALL analysis messages before the last assistant final — regardless of turn boundaries. The model could not access its own prior reasoning.

Fix

Changed the algorithm to:

  1. Find the current turn boundary (after the last user message)
  2. Only drop analysis messages within the current turn that precede the last final message
  3. Leave prior-turn analysis messages untouched

Test Plan

  • Added test_preserves_prior_turn_analysis_messages — exact scenario from the issue
  • Added test_drops_current_turn_analysis_but_preserves_prior — multi-turn with both prior and current analysis
  • Added test_preserves_analysis_when_no_final_in_current_turn — ongoing reasoning not dropped
  • All existing TestAutoDropAnalysisMessages tests pass unchanged (verified via logic simulation)

Fixes #35779

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a bug in auto_drop_analysis_messages where analysis messages from previous turns in a multi-turn conversation were incorrectly dropped. The new logic correctly identifies the current turn and only drops analysis messages within that turn, preserving necessary context from prior turns. The added tests cover the fix and various multi-turn scenarios well. I've identified one potential critical issue where a user message could be inadvertently dropped and have provided a suggestion to make the logic more robust.

Comment on lines +218 to +224
return [
msg
for i, msg in enumerate(msgs)
if not (
current_turn_start <= i < last_final_in_turn and msg.channel == "analysis"
)
]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current logic for dropping analysis messages doesn't check the message role. This could lead to dropping a user message if it happens to have channel="analysis", which would be a critical bug as it modifies user input. While analysis messages are typically not from the user, the function should be robust against this possibility.

To prevent this, I suggest adding a check to ensure we don't drop messages with the user role. The existing test test_drops_non_assistant_analysis_messages uses a TOOL role, which would still pass with this change.

Suggested change
return [
msg
for i, msg in enumerate(msgs)
if not (
current_turn_start <= i < last_final_in_turn and msg.channel == "analysis"
)
]
return [
msg
for i, msg in enumerate(msgs)
if not (
current_turn_start <= i < last_final_in_turn
and msg.channel == "analysis" and msg.author.role != "user"
)
]

@chaunceyjiang chaunceyjiang self-assigned this Mar 3, 2026
weiguangli-io and others added 2 commits March 3, 2026 15:45
…ersations

`auto_drop_analysis_messages` was dropping ALL analysis messages before
the last assistant final message, including from prior conversation
turns. This prevented clients from preserving reasoning context across
turns even when explicitly providing `reasoning_content`.

Change the algorithm to only drop analysis messages within the current
turn (after the last user message), preserving prior-turn analysis so
the model can access its own previous reasoning.

Fixes vllm-project#35779

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: liweiguang <codingpunk@gmail.com>
Address review feedback: add msg.author.role != "user" guard to
prevent accidentally filtering non-assistant messages from the
analysis channel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: liweiguang <codingpunk@gmail.com>
@weiguangli-io weiguangli-io force-pushed the codex/vllm-35779-harmony-preserve-prior-analysis branch from 674fbfd to 0142105 Compare March 3, 2026 07:45
@weiguangli-io
Copy link
Copy Markdown
Contributor Author

Thanks for the review! I've adopted the defensive msg.author.role != "user" guard as suggested in commit 0142105. While analysis-channel messages should always be assistant-authored in practice, the extra check prevents any edge case from accidentally filtering user messages.

Both commits now also include Signed-off-by to pass the DCO check.

@sfeng33
Copy link
Copy Markdown
Contributor

sfeng33 commented Mar 6, 2026

Please see comments from #35779, the current behavior is intentional as per openai spec, this is not a bug.

@weiguangli-io
Copy link
Copy Markdown
Contributor Author

Closing — the current behavior is intentional per OpenAI spec (see #35779 discussion). Thanks for the clarification!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

frontend gpt-oss Related to GPT-OSS models

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: Harmony models incorrectly drops prior-turn analysis channel in multi-turn conversations

3 participants