[Bugfix] Actually enable serialize_messages for harmony Responses (related to #26185) #27377
Open
jacobthebanana wants to merge 3 commits into vllm-project:main
Contributor
Code Review
This pull request correctly enables message serialization for Harmony Responses by calling model_dump(mode="json"). The change is a necessary fix for an upstream issue in openai/harmony, and the problem is well-described in the pull request. The code modification is simple, targeted, and correctly applied in both the create_responses and retrieve_responses functions. The inclusion of a TODO comment with a link to the upstream issue is good practice for maintainability. The change appears correct and complete, and I have no further suggestions.
jacobthebanana referenced this pull request Oct 23, 2025
Signed-off-by: Andrew Xia <axia@meta.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
Force-pushed from 89af97b to 21d9a79
…_response_messages is set. Signed-off-by: Jacob-Junqi Tian <jacob@banana.abay.cf>
Signed-off-by: Jacob-Junqi Tian <jacob@banana.abay.cf>
Force-pushed from 21d9a79 to 26d9bdc
Contributor, Author
(force-pushing to add sign-off)
This was referenced Dec 2, 2025
This pull request has merge conflicts that must be resolved before it can be merged.
Purpose
For the OpenAI-compatible v1/responses route, enable raw messages to be sent when enable_response_messages is set to True in extra_body.
Previously, the message contents in the response were empty because of an issue in openai/harmony (openai/harmony#78).
#26185 implements most of the fix, but the serializers it adds aren't actually invoked, at least not when serving the model through vllm serve. The reason is that the said PR specifies when_used="json". Thus, the serialization method is ignored because of the use of model_dump() in vllm/entrypoints/openai/api_server.py#L527-L529, which defaults to Python mode rather than JSON mode.
The fix is to trigger the serializers by setting mode="json" when invoking model_dump.
Test Plan
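Before testing against a live server, the serializer behavior described under Purpose can be reproduced in isolation. The sketch below is an illustration only: OpaqueMessage and DemoResponse are hypothetical stand-ins, not vLLM or harmony classes, and it assumes Pydantic v2. It shows a field serializer declared with when_used="json" being skipped by a plain model_dump() and applied by model_dump(mode="json"). It does not reproduce the exact empty {} output shown in the test result.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, field_serializer


class OpaqueMessage:
    """Hypothetical message type that knows how to convert itself to a dict."""

    def __init__(self, role: str, text: str) -> None:
        self.role = role
        self.text = text

    def to_dict(self) -> dict[str, Any]:
        return {
            "role": self.role,
            "content": [{"type": "text", "text": self.text}],
        }


class DemoResponse(BaseModel):
    """Hypothetical response model; not the actual vLLM response class."""

    # Needed because OpaqueMessage is not a Pydantic model.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    output_messages: list[OpaqueMessage]

    # when_used="json" means this only runs for JSON-mode serialization.
    @field_serializer("output_messages", when_used="json")
    def serialize_messages(
        self, messages: list[OpaqueMessage]
    ) -> list[dict[str, Any]]:
        return [m.to_dict() for m in messages]


resp = DemoResponse(output_messages=[OpaqueMessage("assistant", "hi")])

# Default mode is "python": the serializer above is skipped.
python_dump = resp.model_dump()

# JSON mode: the serializer runs and yields plain dicts.
json_dump = resp.model_dump(mode="json")
```

In this sketch, python_dump still holds the original OpaqueMessage objects, while json_dump holds the dictionaries produced by to_dict(), which mirrors why adding mode="json" is needed for the serializers to take effect.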
Start vLLM server:
vllm serve openai/gpt-oss-20b
Send a Response request with enable_response_messages set to True in extra_body.
Repeat the above for the streaming case.
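The request step above might look like the following sketch, which uses only the Python standard library. The base URL (localhost on port 8000) is an assumption about a default local vllm serve setup, and build_payload and send are hypothetical helper names. Fields passed through extra_body in the OpenAI Python client are merged into the JSON request body, so the raw payload carries enable_response_messages at the top level.

```python
import json
import urllib.request

# Assumption: the server started by `vllm serve` listens here.
BASE_URL = "http://localhost:8000"


def build_payload(prompt: str, stream: bool = False) -> dict:
    """Build the JSON body for a v1/responses request with raw messages enabled."""
    return {
        "model": "openai/gpt-oss-20b",
        "input": prompt,
        "stream": stream,
        # Equivalent to extra_body={"enable_response_messages": True}
        # when using the OpenAI Python client.
        "enable_response_messages": True,
    }


def send(payload: dict) -> dict:
    """POST a non-streaming request and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example usage against a running server (not executed here):
#   body = send(build_payload("Write a haiku about autumn leaves."))
#   print(json.dumps(body.get("input_messages"), indent=2))
#   print(json.dumps(body.get("output_messages"), indent=2))
```

The send helper only handles the non-streaming case; for the streaming case, build_payload(prompt, stream=True) produces the body, but the response arrives as a stream of events and needs a different reader.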
Test Result
Original:
```
  "input_messages": [
    {
      "author": {
        "role": "system",
        "name": null
      },
      "content": [
        {}
      ],
      "channel": null,
      "recipient": null,
      "content_type": null
    },
    ...
  ],
  "output_messages": [
    ...
    {
      "author": {
        "role": "assistant",
        "name": null
      },
      "content": [
        {}
      ],
      "channel": "final",
      "recipient": null,
      "content_type": null
    }
  ]
}
```
After adding mode="json":
```
  "input_messages": [
    {
      "role": "system",
      "name": null,
      "content": [
        {
          "model_identity": "You are ChatGPT, a large language model trained by OpenAI.",
          "reasoning_effort": "Medium",
          "conversation_start_date": "2025-10-22",
          "knowledge_cutoff": "2024-06",
          "channel_config": {
            "valid_channels": [
              "analysis",
              "final"
            ],
            "channel_required": true
          },
          "type": "system_content"
        }
      ]
    },
    {
      "role": "user",
      "name": null,
      "content": [
        {
          "type": "text",
          "text": "Write a haiku about autumn leaves."
        }
      ]
    }
  ],
  "output_messages": [
    {
      "role": "assistant",
      "name": null,
      "content": [
        {
          "type": "text",
          "text": "User wants a haiku about autumn leaves. Simple. Use 5-7-5 syllable structure. Let's produce one. Ensure it's about autumn leaves. Provide in one paragraph."
        }
      ],
      "channel": "analysis"
    },
    {
      "role": "assistant",
      "name": null,
      "content": [
        {
          "type": "text",
          "text": "Leaves whisper, fall— \ncrimson and amber drift down, \nautumn sighs in wind."
        }
      ],
      "channel": "final"
    }
  ]
}
```