
[bugfix] Fix online serving crash when text type response_format is received#26822

Merged
chaunceyjiang merged 10 commits into vllm-project:main from cjackal:validate-structured-output
Jan 16, 2026

Conversation

@cjackal
Contributor

@cjackal cjackal commented Oct 14, 2025

Purpose

Fix #26639.

Specifically, this PR adds proper input validation to StructuredOutputsParams so that unschedulable structured outputs params can no longer be created, and adjusts the sampling parameter generation logic in ChatCompletionRequest.to_sampling_params() and OpenAIServingResponses.create_responses() to reflect the change and be more robust.
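The validation idea described above can be illustrated with a minimal sketch. `StructuredOutputsSketch` is a hypothetical stand-in, not vLLM's actual class; its field list and the "exactly one constraint" rule are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StructuredOutputsSketch:
    """Hypothetical stand-in for StructuredOutputsParams (not the vLLM class)."""

    json: Optional[str] = None
    regex: Optional[str] = None
    choice: Optional[list] = None
    grammar: Optional[str] = None
    json_object: Optional[bool] = None
    # Not a constraint by itself; it only tunes how a constraint is applied.
    whitespace_pattern: Optional[str] = None

    def __post_init__(self) -> None:
        constraints = (
            self.json, self.regex, self.choice, self.grammar, self.json_object
        )
        # Count how many constraint fields were actually provided.
        count = sum(value is not None for value in constraints)
        if count < 1:
            raise ValueError("no structured outputs constraint specified")
        if count > 1:
            raise ValueError("multiple structured outputs constraints specified")
```

With a check of this kind, an object carrying only `whitespace_pattern` (or nothing at all) is rejected at construction time instead of failing later in the request lifecycle.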

Co-authored by @j0shuajun, who first reported the issue and provided minimal reproducible examples on our side.

Test Plan

Pass a chat completion request with "response_format": {"type": "text"}:

curl -XPOST http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"openai/gpt-oss-120b","messages":[{"role":"user","content":"hello"}],"max_tokens":2048,"stream":false,"response_format":{"type":"text"}}'

Additionally, pass a chat completion request with "response_format": {"type": "json_object"} to check for regressions:

curl -XPOST http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"openai/gpt-oss-120b","messages":[{"role":"user","content":"hello"}],"max_tokens":2048,"stream":false,"response_format":{"type":"json_object"}}'
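For scripted checks, the two curl requests above could be reproduced with the Python standard library. This is a sketch only: `build_request` and `send` are hypothetical helper names, and the host, port, and model are simply copied from the curl commands.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host/port as the curl commands above


def build_request(response_format_type: str) -> urllib.request.Request:
    """Build the chat completion request used in the test plan."""
    payload = {
        "model": "openai/gpt-oss-120b",
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 2048,
        "stream": False,
        "response_format": {"type": response_format_type},
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request; this needs a running server at BASE_URL."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_request("text"))` and `send(build_request("json_object"))` against a running server would cover both cases of the test plan.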

Test Result

Return a valid response in both cases.

# response_format text - respond with normal text
{"id":"chatcmpl-b6a22dfca265460ab316d7ea490d974b","object":"chat.completion","created":1760457543,"model":"openai/gpt-oss-120b","choices":[{"index":0,"message":{"role":"assistant","content":"Hello! How can I assist you today?","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":"We need to respond: greeting. No instructions."},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":70,"total_tokens":99,"completion_tokens":29,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

# response_format json_object - respond with JSON object
{"id":"chatcmpl-9643189ba2674cc2b471bd9a63472e69","object":"chat.completion","created":1760457566,"model":"openai/gpt-oss-120b","choices":[{"index":0,"message":{"role":"assistant","content":"{\"response\":\"Hello! How can I assist you today?\"}","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":"The user says \"hello\". We just respond politely."},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":70,"total_tokens":104,"completion_tokens":34,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}
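The difference between the two results can be checked mechanically. The sketch below assumes the OpenAI-style `choices[0].message.content` layout; `content_is_json_object` is a hypothetical helper, not part of vLLM.

```python
import json


def content_is_json_object(response: dict) -> bool:
    """Return True if the first choice's content parses as a JSON object."""
    content = response["choices"][0]["message"]["content"]
    try:
        return isinstance(json.loads(content), dict)
    except (TypeError, json.JSONDecodeError):
        # Plain text (or missing content) is not a JSON object.
        return False
```

Applied to the two responses above, the helper would be expected to return False for the text case and True for the json_object case.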

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results

@mergify mergify bot added the frontend label Oct 14, 2025
@cjackal cjackal changed the title Fix online serving shutdown when chat completion with text type response_format is received [bugfix] Fix online serving shutdown when chat completion with text type response_format is received Oct 14, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a crash in the online serving endpoint when a chat completion request includes response_format: {"type": "text"}. The fix correctly prevents the creation of an invalid, empty StructuredOutputsParams object by adding stricter validation and refining the parameter handling logic. The changes are well-implemented. However, I've identified a critical regression where the new validation can cause crashes for requests using certain deprecated parameters. A fix is suggested in the detailed comment.

Comment on lines +69 to +73
if count < 1:
    raise ValueError(
        "You must use one kind of structured outputs constraint "
        f"but none are specified: {self.__dict__}"
    )
Contributor


critical

This new validation is a great improvement for ensuring StructuredOutputsParams is always in a valid state. However, it introduces a potential regression for deprecated parameters.

Specifically, the logic for handling deprecated guided_* parameters in vllm/entrypoints/openai/protocol.py (lines 794-806) can now raise this ValueError. If a user provides only guided_whitespace_pattern (which maps to whitespace_pattern), the code will attempt to create StructuredOutputsParams with only a non-constraint parameter. This will cause count to be 0 here, triggering this error.

While the problematic code is not in this diff, this change makes it faulty. To prevent this regression, the logic for handling deprecated parameters should be updated to only construct StructuredOutputsParams if at least one constraint parameter (e.g., guided_json, guided_regex) is provided.

For example, in ChatCompletionRequest.to_sampling_params in vllm/entrypoints/openai/protocol.py, the logic could be adjusted:

# ... inside to_sampling_params, after collecting kwargs from deprecated params
            kwargs = {k: v for k, v in kwargs.items() if v is not None}
            constraint_keys = {'json', 'regex', 'choice', 'grammar', 'structural_tag'}
            if any(k in constraint_keys for k in kwargs):
                self.structured_outputs = StructuredOutputsParams(**kwargs)

This would ensure backward compatibility for the deprecated parameters while upholding the new, stricter validation.
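As a standalone illustration of the suggestion above, the filtering step can be isolated into a small function. `structured_outputs_kwargs` is a hypothetical name, and the constraint key set is copied from the suggested snippet rather than from vLLM itself.

```python
from typing import Any, Optional

# Keys copied from the suggestion above; assumed here to be the constraint fields.
CONSTRAINT_KEYS = {"json", "regex", "choice", "grammar", "structural_tag"}


def structured_outputs_kwargs(deprecated: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return kwargs for the params object, or None if no constraint is set."""
    # Drop unset values first, as in the suggested snippet.
    kwargs = {k: v for k, v in deprecated.items() if v is not None}
    if not any(k in CONSTRAINT_KEYS for k in kwargs):
        # Only non-constraint options (e.g. whitespace_pattern) were given.
        return None
    return kwargs
```

A caller would construct the params object only when the function returns a dict, so a request carrying just `whitespace_pattern` never reaches the stricter validation.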

Contributor Author


AFAIK guided_whitespace_pattern must be given together with one of the structural constraints; otherwise the same "no structured parameter" error is raised (so this is not a newly introduced regression but a pre-existing bug).

That said, I'd welcome a better way to validate the structured outputs params.

@cjackal
Contributor Author

cjackal commented Oct 16, 2025

I think we can do much better response_format validation after #26519; I will rework this PR if it does not get merged before #26519.

@cjackal cjackal changed the title [bugfix] Fix online serving shutdown when chat completion with text type response_format is received [bugfix] Fix online serving crash when chat completion with text type response_format is received Oct 16, 2025
@cjackal cjackal force-pushed the validate-structured-output branch from 247ab09 to 5fc84eb Compare October 22, 2025 15:07
@cjackal cjackal changed the title [bugfix] Fix online serving crash when chat completion with text type response_format is received [bugfix] Fix online serving crash when text type response_format is received Oct 22, 2025
@cjackal
Contributor Author

cjackal commented Oct 22, 2025

@chaunceyjiang Would you mind having a look at this PR? As more clients are using response_format, this bug is increasingly disruptive in terms of server stability. I think StructuredOutputsParams without a compilable grammar, which crashes the server at the grammar compilation stage, should not be allowed to be created in the first place.

# we must enable it for these features to work
if self.structured_outputs is None:
    self.structured_outputs = StructuredOutputsParams()
kwargs_changes = dict[str, Any]()

# Set structured output params for response format
if response_format is not None:
Contributor


This if clause is redundant

Contributor Author


Indeed, a recent upstream code change makes this if clause look funny 😄. Thank you for pointing it out.


# Set structured output params for response format
if response_format is not None:
    if response_format.type == "json_object":
        self.structured_outputs.json_object = True
        kwargs_changes["json_object"] = True
Contributor


Is there any need to introduce kwargs_changes?

How about just replacing self.structured_outputs.json_object = True with self.structured_outputs = StructuredOutputsParams(json_object=True)?

The same applies to all the other cases.

Contributor Author


We'd like to inherit from the original StructuredOutputsParams because of other options such as whitespace_pattern, so I update the existing parameters rather than creating a new object.

Or would we prefer to ignore these options in the response_format code path?

Contributor Author


I'd like to mention that the whole logic around kwargs_changes and dataclasses.replace is there just to validate StructuredOutputsParams. We can achieve the same effect by validating on assignment; Pydantic dataclasses already support this by adding ConfigDict(validate_assignment=True) and replacing __post_init__ with a method decorated with @pydantic.model_validator.

cc @hmellor Is this the way to go from the point of view of the pydantic validation refactoring? Most models in vllm.sampling_params use msgspec, and currently StructuredOutputsParams is the only exception, with a small comment saying "maybe make msgspec". If we can keep StructuredOutputsParams as a pydantic dataclass, all the mess caused by StructuredOutputsParams not being validated on creation/modification can be cleaned up nicely.
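The reason dataclasses.replace gives validation "for free" can be shown with a small standard-library sketch. `ParamsSketch` and `with_changes` are hypothetical names, and the two-constraint check is a simplification of whatever the real class enforces.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ParamsSketch:
    """Hypothetical stand-in for StructuredOutputsParams (not the vLLM class)."""

    json_object: Optional[bool] = None
    regex: Optional[str] = None
    whitespace_pattern: Optional[str] = None  # an option, not a constraint

    def __post_init__(self) -> None:
        if self.json_object is None and self.regex is None:
            raise ValueError("no structured outputs constraint specified")


def with_changes(params: ParamsSketch, **kwargs_changes: Any) -> ParamsSketch:
    """Return a validated copy of params with kwargs_changes applied."""
    # replace() goes through __init__, so __post_init__ validates the result;
    # untouched fields such as whitespace_pattern are carried over.
    return dataclasses.replace(params, **kwargs_changes)
```

By contrast, a plain attribute assignment such as `params.regex = None` on a standard dataclass does not run `__post_init__`, which is the gap that validate-on-assignment would close.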

@chaunceyjiang
Collaborator

@cjackal I’m really sorry—I missed this PR. Could you rebase main onto your branch so we can move forward?

@mergify

mergify bot commented Jan 13, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @cjackal.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 13, 2026
@cjackal
Contributor Author

cjackal commented Jan 13, 2026

@cjackal I’m really sorry—I missed this PR. Could you rebase main onto your branch so we can move forward?

No worries, I will rebase tonight. Alternatively, I have granted push permission to maintainers, so feel free to rebase it yourself if it is urgent.

@cjackal cjackal force-pushed the validate-structured-output branch from 7e58459 to 7f389dd Compare January 13, 2026 14:26
@mergify mergify bot removed the needs-rebase label Jan 13, 2026
cjackal and others added 6 commits January 15, 2026 09:55
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
Signed-off-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
Co-authored-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
@cjackal cjackal force-pushed the validate-structured-output branch from 17505a7 to d96e330 Compare January 15, 2026 09:59
@mergify mergify bot removed the needs-rebase label Jan 15, 2026
@chaunceyjiang chaunceyjiang added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jan 15, 2026
@chaunceyjiang
Collaborator

After #32127 is merged, vLLM will no longer crash. However, I still think this PR is a nice improvement.

@cjackal
Contributor Author

cjackal commented Jan 15, 2026

After #32127 is merged, vLLM will no longer crash. However, I still think this PR is a nice improvement.

Indeed, this PR looks more like a general code quality improvement + unit test addition for now 😄 Thanks for the review!

@chaunceyjiang chaunceyjiang enabled auto-merge (squash) January 16, 2026 04:23
Collaborator

@chaunceyjiang chaunceyjiang left a comment


thanks~

@chaunceyjiang chaunceyjiang merged commit 35bf5d0 into vllm-project:main Jan 16, 2026
50 checks passed
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
…eceived (vllm-project#26822)

Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
Signed-off-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
Co-authored-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
@cjackal cjackal deleted the validate-structured-output branch January 18, 2026 02:14
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…eceived (vllm-project#26822)

Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
Signed-off-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
Co-authored-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
…eceived (vllm-project#26822)

Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
Signed-off-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
Co-authored-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>

Labels

bug (Something isn't working), frontend, ready (ONLY add when PR is ready to merge/full CI is needed), tool-calling

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: ValueError: No valid structured output parameter found

3 participants