[Frontend] Enable generic structured_outputs for responses API #33709
zhuohan123 merged 8 commits into vllm-project:main
Conversation
Code Review
This pull request enables generic structured_outputs for the responses API, which is a great enhancement. The implementation is straightforward and includes relevant tests. I've found one area for improvement regarding the conflict detection logic to make it more consistent and robust. My feedback is detailed in the review comment.
# this cannot be used in conjunction with previous_response_id
# TODO: consider supporting non harmony messages as well
previous_input_messages: list[OpenAIHarmonyMessage | dict] | None = None
structured_outputs: StructuredOutputsParams | None = Field(
How do users access these from HTTP endpoints? OpenAI responses support passing structured output using text.format.
You can just set it directly for HTTP; there is an example in the PR description:
```shell
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "input": "Pick a color",
    "structured_outputs": {
      "choice": ["red", "green", "blue"]
    }
  }'
```
If we wanted to put it on text.format, we would have to implement our own new class that can differentiate the OpenAI ResponseFormatTextConfig type from the structured output type, and that type has already had annoying upstream changes in the past.
IMO I prefer keeping the two fields separate because it more clearly differentiates "the code needed to provide a complete Responses API implementation" from "extra features on top of the Responses API for vLLM specifically". Keeping it as the StructuredOutputsParams type also means it is reusable across the different provider APIs longer term; it would be nice to be able to set the same thing for Anthropic APIs and OpenAI APIs to guide model behavior in a way that isn't explicitly tied to API functionality.
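To illustrate the point above, here is a rough sketch (all names and type-discrimination logic hypothetical, not vLLM's actual code) of the ambiguity a combined text.format field would introduce, versus the separate field which needs no discrimination:

```python
# With a combined text.format field, the server would have to discriminate
# OpenAI's ResponseFormatTextConfig shapes from vLLM's structured-output
# shapes, which is fragile to upstream type changes:
def parse_combined(fmt: dict) -> tuple[str, dict]:
    if fmt.get("type") in ("text", "json_schema", "json_object"):
        return ("openai_format", fmt)  # OpenAI ResponseFormatTextConfig shape
    return ("vllm_structured_outputs", fmt)  # everything else

# With separate fields, each payload key has exactly one meaning:
request = {
    "text": {"format": {"type": "text"}},  # standard Responses API field
    "structured_outputs": {"choice": ["red", "green", "blue"]},  # vLLM extension
}
```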
LGTM!
Signed-off-by: Alec Solder <alecs@fb.com>
…project#33709) Signed-off-by: Alec Solder <alecs@fb.com> Co-authored-by: Alec Solder <alecs@fb.com> Signed-off-by: Eldar Kurtic <research@neuralmagic.com>
Purpose
The current Responses API implementation only supports setting an output text format using json_schema. However, for more complicated use cases like grammars, regexes, choices, etc., you need to be able to pass in the full structured_outputs object.
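As a sketch of what passing the full structured_outputs object enables beyond json_schema, the request payloads might look like the following. The `choice` key mirrors the PR's curl example; the `regex` key is an assumption for illustration:

```python
import json

# Choice constraint, as shown in the PR's curl example:
choice_payload = {
    "model": "openai/gpt-oss-20b",
    "input": "Pick a color",
    "structured_outputs": {"choice": ["red", "green", "blue"]},
}

# Hypothetical regex constraint (assumed key name):
regex_payload = {
    "model": "openai/gpt-oss-20b",
    "input": "Give me an IPv4 address",
    "structured_outputs": {"regex": r"\d{1,3}(\.\d{1,3}){3}"},
}

# The payload serializes to the JSON body sent to /v1/responses:
body = json.dumps(choice_payload)
```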
Test Plan
Test Result
Final output message:
Full response, showing that it still respects enabling structured outputs only after reasoning