
[Frontend] Enable generic structured_outputs for responses API #33709

Merged
zhuohan123 merged 8 commits into vllm-project:main from alecsolder:alecs/responses_grammar
Feb 13, 2026

Contributor

@alecsolder alecsolder commented Feb 3, 2026

Purpose

The current Responses API implementation only supports setting an output text format via json_schema. More complicated use cases, such as grammars, regexes, and choices, require passing in the full structured_outputs object.
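As a rough illustration of what "the full structured_outputs object" could look like in a request body, here is a minimal sketch. Only the `choice` form is confirmed by this PR's test plan; the `regex` and `grammar` keys are assumed to mirror the field names of vLLM's StructuredOutputsParams, and `build_body` is a hypothetical helper, not part of vLLM.

```python
import json

MODEL = "openai/gpt-oss-20b"  # model used in the test plan


def build_body(prompt: str, structured_outputs: dict) -> dict:
    """Hypothetical helper: attach a constraint to a /v1/responses request body."""
    return {"model": MODEL, "input": prompt, "structured_outputs": structured_outputs}


# Confirmed by the test plan: restrict the answer to a fixed set of strings.
choice_body = build_body("Pick a color", {"choice": ["red", "green", "blue"]})

# Assumed key name: restrict the answer with a regular expression.
regex_body = build_body("Pick a hex color", {"regex": r"#[0-9a-f]{6}"})

# Assumed key name: restrict the answer with a grammar (a tiny EBNF-style rule).
grammar_body = build_body("Answer yes or no", {"grammar": 'root ::= "yes" | "no"'})

if __name__ == "__main__":
    for body in (choice_body, regex_body, grammar_body):
        print(json.dumps(body))
```

Each body differs only in the single key inside `structured_outputs`, which is the point of passing the whole object rather than one fixed format.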

Test Plan

vllm serve openai/gpt-oss-20b --enforce-eager --max-model-len=65536 \
--tool-call-parser=openai --enable-auto-tool-choice --reasoning-parser=openai_gptoss
curl -X POST http://localhost:8000/v1/responses \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-oss-20b",
      "input": "Pick a color",
      "structured_outputs": {
        "choice": ["red", "green", "blue"]
      }
    }'
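For anyone who prefers to script the check, the curl call above can be reproduced with the Python standard library. This is a sketch under the same assumptions as the test plan (server on localhost:8000, same model name); `make_request` and `send` are illustrative helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address used in the curl command


def make_request(prompt: str, choices: list[str]) -> urllib.request.Request:
    """Build (but do not send) the POST request from the test plan."""
    body = {
        "model": "openai/gpt-oss-20b",
        "input": prompt,
        "structured_outputs": {"choice": choices},
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request to a running server and decode the JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


req = make_request("Pick a color", ["red", "green", "blue"])
# Calling send(req) requires the `vllm serve` process to be running.
```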

Test Result

Final output message:

{"id":"msg_946aeff87d4ed2e2","content":[{"annotations":[],"text":"green","type":"output_text","logprobs":null}],"

Full response, showing that the constraint is still enabled only after reasoning:

{"id":"resp_be3abebb226eafdf","created_at":1770136083,"incomplete_details":null,"instructions":null,"metadata":null,"model":"openai/gpt-oss-20b","object":"response","output":[{"id":"rs_9664daf6b147c565","summary":[],"type":"reasoning","content":[{"text":"User says: \"Pick a color\". They want a color. Probably answer with a color name. We can also give maybe a suggestion like \"sky blue\" or just pick a random color; maybe include an RGB hex code. Probably pick one: like \"emerald green\". Let's pick \"emerald green (#50C878)\".","type":"reasoning_text"}],"encrypted_content":null,"status":null},{"id":"msg_946aeff87d4ed2e2","content":[{"annotations":[],"text":"green","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":true,"temperature":1.0,"tool_choice":"auto","tools":[],"top_p":1.0,"background":false,"max_output_tokens":65468,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"reasoning":null,"service_tier":"auto","status":"completed","text":null,"top_logprobs":null,"truncation":"disabled","usage":{"input_tokens":68,"input_tokens_details":{"cached_tokens":64,"input_tokens_per_turn":[68],"cached_tokens_per_turn":[64]},"output_tokens":79,"output_tokens_details":{"reasoning_tokens":69,"tool_output_tokens":0,"output_tokens_per_turn":[79],"tool_output_tokens_per_turn":[0]},"total_tokens":147},"user":null,"input_messages":null,"output_messages":null}
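To check this property programmatically, one could separate the reasoning items from the final message and confirm that only the latter is held to the choice list. The sketch below works on a trimmed copy of the response above (reasoning text shortened); `split_output` is a hypothetical helper name.

```python
def split_output(response: dict) -> tuple[list[str], list[str]]:
    """Return (reasoning texts, final message texts) from a Responses API body."""
    reasoning, final = [], []
    for item in response.get("output", []):
        for part in item.get("content") or []:
            if item.get("type") == "reasoning" and part.get("type") == "reasoning_text":
                reasoning.append(part["text"])
            elif item.get("type") == "message" and part.get("type") == "output_text":
                final.append(part["text"])
    return reasoning, final


# Trimmed copy of the full response (reasoning text shortened for brevity).
sample = {
    "output": [
        {
            "type": "reasoning",
            "content": [{"text": 'User says: "Pick a color". ...', "type": "reasoning_text"}],
        },
        {
            "type": "message",
            "role": "assistant",
            "content": [{"text": "green", "type": "output_text"}],
        },
    ]
}

reasoning, final = split_output(sample)
allowed = ["red", "green", "blue"]
# The reasoning is free-form; only the final message must be one of the choices.
final_is_constrained = all(text in allowed for text in final)
```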

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request enables generic structured_outputs for the responses API, which is a great enhancement. The implementation is straightforward and includes relevant tests. I've found one area for improvement regarding the conflict detection logic to make it more consistent and robust. My feedback is detailed in the review comment.

# this cannot be used in conjunction with previous_response_id
# TODO: consider supporting non harmony messages as well
previous_input_messages: list[OpenAIHarmonyMessage | dict] | None = None
structured_outputs: StructuredOutputsParams | None = Field(
Collaborator

How do users access these from HTTP endpoints? The OpenAI Responses API supports passing structured output using text.format.

Collaborator

+1

Contributor Author

You can set it directly over HTTP; there is an example in the PR description:

curl -X POST http://localhost:8000/v1/responses \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-oss-20b",
      "input": "Pick a color",
      "structured_outputs": {
        "choice": ["red", "green", "blue"]
      }
    }'

If we wanted to put it on text.format, we would have to implement our own class that can differentiate the OpenAI ResponseFormatTextConfig type from the structured output type, which has already had annoying changes in the past.

IMO, I prefer keeping the two fields separate, because it lets us more clearly differentiate "the code needed to provide a complete Responses API implementation" from "extra features on top of the Responses API for vLLM specifically". Keeping it as the StructuredOutputsParams type also means it is reusable across the different provider APIs in the longer term. It would be nice to be able to set the same thing for Anthropic APIs and OpenAI APIs to guide model behavior in a way that isn't explicitly tied to API functionality.
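A simplified sketch of the "two separate fields" design, for illustration only. The class below is a hypothetical stand-in written with dataclasses, not the real request model. Both the choice to reject requests that set both fields and the mapping of a JSON schema to a `json` constraint are assumptions; the actual conflict handling in the PR is not shown in this thread.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ResponsesRequestSketch:
    """Hypothetical, heavily simplified stand-in for the real request model."""

    model: str
    input: str
    # OpenAI-compatible field, e.g. {"format": {"type": "json_schema", ...}}
    text: Optional[dict[str, Any]] = None
    # vLLM-specific extension, e.g. {"choice": ["red", "green", "blue"]}
    structured_outputs: Optional[dict[str, Any]] = None

    def resolve_constraint(self) -> Optional[dict[str, Any]]:
        """Return the single constraint to apply, or raise if both fields set one."""
        fmt = (self.text or {}).get("format") or {}
        has_schema = fmt.get("type") == "json_schema"
        if has_schema and self.structured_outputs is not None:
            # Assumed behavior: treat the combination as a client error.
            raise ValueError("Set either text.format or structured_outputs, not both")
        if has_schema:
            # Assumed mapping: a JSON schema becomes a `json` constraint.
            return {"json": fmt.get("schema")}
        return self.structured_outputs
```

Keeping `structured_outputs` as its own field means the OpenAI-shaped `text` field can be validated against the upstream type unchanged, while the extension is resolved in one place.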

@daniel-salib
Contributor

LGTM!

@zhuohan123 zhuohan123 enabled auto-merge (squash) February 9, 2026 18:07
@github-actions github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Feb 9, 2026
Alec Solder added 3 commits February 10, 2026 21:55
Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Alec Solder <alecs@fb.com>
auto-merge was automatically disabled February 11, 2026 05:59

Head branch was pushed to by a user without write access

@alecsolder alecsolder force-pushed the alecs/responses_grammar branch from f590726 to d5d7cd6 February 11, 2026 05:59
@zhuohan123 zhuohan123 merged commit be7370d into vllm-project:main Feb 13, 2026
5 of 6 checks passed
eldarkurtic pushed a commit to eldarkurtic/vllm that referenced this pull request Feb 19, 2026
…project#33709)

Signed-off-by: Alec Solder <alecs@fb.com>
Co-authored-by: Alec Solder <alecs@fb.com>
Signed-off-by: Eldar Kurtic <research@neuralmagic.com>
llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026
…project#33709)

Signed-off-by: Alec Solder <alecs@fb.com>
Co-authored-by: Alec Solder <alecs@fb.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026
…project#33709)

Signed-off-by: Alec Solder <alecs@fb.com>
Co-authored-by: Alec Solder <alecs@fb.com>

Labels

frontend, ready (ONLY add when PR is ready to merge/full CI is needed)

5 participants