fix: preserve prior-turn analysis messages in Harmony multi-turn conversations by weiguangli-io · Pull Request #35826 · vllm-project/vllm

weiguangli-io · 2026-03-03T04:19:39Z

Summary

Fix auto_drop_analysis_messages to only drop analysis messages within the current turn (after the last user message), preserving prior-turn reasoning context
The old implementation dropped ALL analysis messages before the last assistant final message, which incorrectly discarded reasoning_content from prior conversation turns

Root Cause

When a client sends a multi-turn conversation with reasoning_content on prior assistant messages:

[
  {"role": "user", "content": "Pick a secret number..."},
  {"role": "assistant", "content": "OK, I picked.", "reasoning_content": "I will pick 742."},
  {"role": "user", "content": "What is the secret number?"}
]

The analysis message from the first turn (containing "742") was dropped because the old code dropped ALL analysis messages before the last assistant final — regardless of turn boundaries. The model could not access its own prior reasoning.
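The behavior described above can be reproduced with a small sketch. The `Msg` dataclass and `drop_before_last_final` are hypothetical stand-ins written for illustration; they are not the vLLM types or the actual body of `auto_drop_analysis_messages`. The sketch assumes that `reasoning_content` on an assistant message becomes an analysis-channel message placed before the matching final message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Msg:
    """Hypothetical stand-in for a Harmony message (not the real type)."""
    role: str
    channel: Optional[str]
    content: str


def drop_before_last_final(msgs: List[Msg]) -> List[Msg]:
    """Sketch of the old rule: drop every analysis message that precedes
    the last assistant final message, ignoring turn boundaries."""
    last_final = -1
    for i, m in enumerate(msgs):
        if m.role == "assistant" and m.channel == "final":
            last_final = i
    return [
        m
        for i, m in enumerate(msgs)
        if not (i < last_final and m.channel == "analysis")
    ]


conversation = [
    Msg("user", None, "Pick a secret number..."),
    Msg("assistant", "analysis", "I will pick 742."),
    Msg("assistant", "final", "OK, I picked."),
    Msg("user", None, "What is the secret number?"),
]

kept = drop_before_last_final(conversation)
# The first-turn analysis message is gone, so "742" no longer appears.
print([m.content for m in kept])
```

Under these assumptions the analysis message sits before the only assistant final message, so it is removed even though it belongs to an earlier turn.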

Fix

Changed the algorithm to:

Find the current turn boundary (after the last user message)
Only drop analysis messages within the current turn that precede the last final message
Leave prior-turn analysis messages untouched
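The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `Msg` and `drop_current_turn_analysis` are hypothetical names, and only the variable names `current_turn_start` and `last_final_in_turn` are taken from the diff shown in the review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Msg:
    """Hypothetical stand-in for a Harmony message (not the real type)."""
    role: str
    channel: Optional[str]
    content: str


def drop_current_turn_analysis(msgs: List[Msg]) -> List[Msg]:
    # Step 1: the current turn starts right after the last user message.
    last_user = max(
        (i for i, m in enumerate(msgs) if m.role == "user"), default=-1
    )
    current_turn_start = last_user + 1

    # Step 2: find the last assistant final message inside the current turn.
    last_final_in_turn = -1
    for i in range(current_turn_start, len(msgs)):
        if msgs[i].role == "assistant" and msgs[i].channel == "final":
            last_final_in_turn = i

    # Step 3: drop only analysis messages in [current_turn_start, last_final_in_turn).
    # With no final in the current turn the range is empty and nothing is dropped.
    return [
        m
        for i, m in enumerate(msgs)
        if not (
            current_turn_start <= i < last_final_in_turn
            and m.channel == "analysis"
        )
    ]


conversation = [
    Msg("user", None, "Pick a secret number..."),
    Msg("assistant", "analysis", "I will pick 742."),
    Msg("assistant", "final", "OK, I picked."),
    Msg("user", None, "What is the secret number?"),
    Msg("assistant", "analysis", "Recall the earlier choice."),
    Msg("assistant", "final", "It is 742."),
]

kept = drop_current_turn_analysis(conversation)
print([m.content for m in kept])
```

In this sketch the prior-turn analysis ("I will pick 742.") survives, while the current-turn analysis that precedes the current final message is removed. Truncating the conversation before the last final message leaves every message in place.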

Test Plan

Added test_preserves_prior_turn_analysis_messages — exact scenario from the issue
Added test_drops_current_turn_analysis_but_preserves_prior — multi-turn with both prior and current analysis
Added test_preserves_analysis_when_no_final_in_current_turn — ongoing reasoning not dropped
All existing TestAutoDropAnalysisMessages tests pass unchanged (verified via logic simulation)

Fixes #35779

gemini-code-assist

Code Review

This pull request fixes a bug in auto_drop_analysis_messages where analysis messages from previous turns in a multi-turn conversation were incorrectly dropped. The new logic correctly identifies the current turn and only drops analysis messages within that turn, preserving necessary context from prior turns. The added tests cover the fix and various multi-turn scenarios well. I've identified one potential critical issue where a user message could be inadvertently dropped and have provided a suggestion to make the logic more robust.

gemini-code-assist · 2026-03-03T04:21:18Z

vllm/entrypoints/openai/parser/harmony_utils.py

+    return [
+        msg
+        for i, msg in enumerate(msgs)
+        if not (
+            current_turn_start <= i < last_final_in_turn and msg.channel == "analysis"
+        )
+    ]


The current logic for dropping analysis messages doesn't check the message role. This could lead to dropping a user message if it happens to have channel="analysis", which would be a critical bug as it modifies user input. While analysis messages are typically not from the user, the function should be robust against this possibility.

To prevent this, I suggest adding a check to ensure we don't drop messages with the user role. The existing test test_drops_non_assistant_analysis_messages uses a TOOL role, which would still pass with this change.

Suggested change

-    return [
-        msg
-        for i, msg in enumerate(msgs)
-        if not (
-            current_turn_start <= i < last_final_in_turn and msg.channel == "analysis"
-        )
-    ]
+    return [
+        msg
+        for i, msg in enumerate(msgs)
+        if not (
+            current_turn_start <= i < last_final_in_turn
+            and msg.channel == "analysis" and msg.author.role != "user"
+        )
+    ]
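The effect of the suggested guard can be checked in isolation with a short sketch. `Author`, `Msg`, and `filter_with_role_guard` are hypothetical stand-ins that only mirror the `msg.author.role` and `msg.channel` attribute access seen in the diff; the two indices are passed in directly as an assumption so that the predicate can be exercised without the turn-detection logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Author:
    """Hypothetical stand-in mirroring `msg.author.role`."""
    role: str


@dataclass
class Msg:
    """Hypothetical stand-in for a Harmony message (not the real type)."""
    author: Author
    channel: Optional[str]
    content: str


def filter_with_role_guard(
    msgs: List[Msg], current_turn_start: int, last_final_in_turn: int
) -> List[Msg]:
    """Apply the suggested predicate: drop in-range analysis messages
    unless they were authored by the user."""
    return [
        msg
        for i, msg in enumerate(msgs)
        if not (
            current_turn_start <= i < last_final_in_turn
            and msg.channel == "analysis"
            and msg.author.role != "user"
        )
    ]


msgs = [
    Msg(Author("user"), "analysis", "user text on the analysis channel"),
    Msg(Author("tool"), "analysis", "tool scratch output"),
    Msg(Author("assistant"), "analysis", "assistant reasoning"),
    Msg(Author("assistant"), "final", "answer"),
]

# Indices chosen by hand for illustration: the whole list is treated as in range.
kept = filter_with_role_guard(msgs, current_turn_start=0, last_final_in_turn=3)
print([m.author.role for m in kept])
```

Here the user-authored message is kept even though it carries `channel="analysis"`, while the tool and assistant analysis messages are still dropped, which matches the reviewer's note that a TOOL-role test would be unaffected.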

…ersations

`auto_drop_analysis_messages` was dropping ALL analysis messages before the last assistant final message, including from prior conversation turns. This prevented clients from preserving reasoning context across turns even when explicitly providing `reasoning_content`.

Change the algorithm to only drop analysis messages within the current turn (after the last user message), preserving prior-turn analysis so the model can access its own previous reasoning.

Fixes vllm-project#35779

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: liweiguang <codingpunk@gmail.com>

Address review feedback: add msg.author.role != "user" guard to prevent accidentally filtering non-assistant messages from the analysis channel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: liweiguang <codingpunk@gmail.com>

weiguangli-io · 2026-03-03T07:45:55Z

Thanks for the review! I've adopted the defensive msg.author.role != "user" guard as suggested in commit 0142105. While analysis-channel messages should always be assistant-authored in practice, the extra check prevents any edge case from accidentally filtering user messages.

Both commits now also include Signed-off-by to pass the DCO check.

sfeng33 · 2026-03-06T20:35:56Z

Please see the comments in #35779. The current behavior is intentional per the OpenAI spec; this is not a bug.

weiguangli-io · 2026-03-07T04:38:40Z

Closing — the current behavior is intentional per OpenAI spec (see #35779 discussion). Thanks for the clarification!

weiguangli-io requested review from DarkLight1337, NickLucche, aarnphm, chaunceyjiang, robertgshaw2-redhat and russellb as code owners March 3, 2026 04:19

weiguangli-io mentioned this pull request Mar 3, 2026

[Bug]: Harmony models incorrectly drops prior-turn analysis channel in multi-turn conversations #35779

Open

1 task

mergify bot added the frontend and gpt-oss labels Mar 3, 2026

github-project-automation bot added this to gpt-oss Issues & Enhancements Mar 3, 2026

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Mar 3, 2026

gemini-code-assist bot reviewed Mar 3, 2026

chaunceyjiang self-assigned this Mar 3, 2026

weiguangli-io and others added 2 commits March 3, 2026 15:45

weiguangli-io force-pushed the codex/vllm-35779-harmony-preserve-prior-analysis branch from 674fbfd to 0142105 Compare March 3, 2026 07:45

weiguangli-io closed this Mar 7, 2026

github-project-automation bot moved this from To Triage to Done in gpt-oss Issues & Enhancements Mar 7, 2026

weiguangli-io wants to merge 2 commits into vllm-project:main from weiguangli-io:codex/vllm-35779-harmony-preserve-prior-analysis
