Description
Currently, to generate a response that conforms to a specific JSON schema using TensorRT-LLM's OpenAI-compatible API, I have to set the type to structural_tag in response_format, as handled here:
TensorRT-LLM/tensorrt_llm/serve/openai_protocol.py
Lines 138 to 152 in b01d1c2
def _response_format_to_guided_decoding_params(
        response_format: Optional[ResponseFormat]
) -> Optional[GuidedDecodingParams]:
    if response_format is None:
        return None
    elif response_format.type == "text":
        return None
    elif response_format.type == "json_object":
        return GuidedDecodingParams(json_object=True)
    elif response_format.type == "structural_tag":
        return GuidedDecodingParams(
            structural_tag=response_format.model_dump_json(by_alias=True,
                                                           exclude_none=True))
    else:
        raise ValueError(f"Unsupported response format: {response_format.type}")
While this method works as intended, it has several inconveniences, such as having to instruct the model in the prompt to emit the begin and end tags itself.
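For illustration, a request using the current structural_tag approach might look like the sketch below. The field names (structures, begin, schema, end, triggers) follow xgrammar's structural-tag convention and are assumptions here; the exact shape accepted by openai_protocol.py may differ.

```python
# Hypothetical request body using the current structural_tag response format.
# Note the extra burden: the prompt must tell the model to wrap its answer
# in the begin/end tags, since only the tagged region is schema-constrained.
request_body = {
    "model": "your_model",
    "messages": [
        {
            "role": "user",
            "content": "Describe a person. Wrap the JSON in <person>...</person>.",
        }
    ],
    "response_format": {
        "type": "structural_tag",
        "structures": [
            {
                "begin": "<person>",
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                    },
                    "required": ["name", "age"],
                },
                "end": "</person>",
            }
        ],
        "triggers": ["<person>"],
    },
}
```

The response then contains the tags themselves, so the caller also has to strip them before parsing the JSON.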
Maybe it would be better to expose the existing JSON schema decoding capabilities of TensorRT-LLM's backend in the OpenAI-compatible API layer.
The core of this request revolves around a discrepancy between the backend's capabilities and the API's interface. The internal GuidedDecodingParams class already supports a json field, which allows for direct and powerful enforcement of a specific JSON schema.
TensorRT-LLM/tensorrt_llm/sampling_params.py
Lines 14 to 36 in 37293e4
class GuidedDecodingParams:
    """Guided decoding parameters for text generation. Only one of the fields could be effective.

    Args:
        json (str, pydantic.main.BaseModel, dict, optional): The generated text is amenable to json format with additional user-specified restrictions, namely schema. Defaults to None.
        regex (str, optional): The generated text is amenable to the user-specified regular expression. Defaults to None.
        grammar (str, optional): The generated text is amenable to the user-specified extended Backus-Naur form (EBNF) grammar. Defaults to None.
        json_object (bool): If True, the generated text is amenable to json format. Defaults to False.
        structural_tag (str, optional): The generated text is amenable to the user-specified structural tag. Structural tag is supported by xgrammar in PyTorch backend only. Defaults to None.
    """  # noqa: E501

    json: Optional[Union[str, BaseModel, dict]] = None
    regex: Optional[str] = None
    grammar: Optional[str] = None
    json_object: bool = False
    structural_tag: Optional[str] = None

    def _validate(self):
        num_guides = 0
        for _field in fields(self):
            num_guides += bool(getattr(self, _field.name))
        if num_guides > 1:
            raise ValueError(
                f"Only one guide can be used for a request, but got {num_guides}.")
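The one-guide-per-request rule above can be demonstrated with a simplified, self-contained stand-in for the real tensorrt_llm.sampling_params.GuidedDecodingParams (this sketch omits the BaseModel variant and exists only to illustrate the validation behavior):

```python
from dataclasses import dataclass, fields
from typing import Optional, Union


@dataclass
class GuidedDecodingParamsSketch:
    """Simplified stand-in mirroring the mutual-exclusion rule quoted above."""
    json: Optional[Union[str, dict]] = None
    regex: Optional[str] = None
    grammar: Optional[str] = None
    json_object: bool = False
    structural_tag: Optional[str] = None

    def _validate(self):
        # Count truthy guide fields; at most one may be active per request.
        num_guides = sum(bool(getattr(self, f.name)) for f in fields(self))
        if num_guides > 1:
            raise ValueError(
                f"Only one guide can be used for a request, but got {num_guides}.")
```

So a params object carrying only a json schema validates cleanly, while setting json and regex together raises ValueError — which is exactly why a response_format "json" type can map one-to-one onto the existing json field without interfering with the other guides.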
Proposed Solution
To bridge the gap between the backend's capabilities and the API's interface, I propose enhancing the response_format object to directly leverage GuidedDecodingParams.
Ideal API Request:
{
"model": "your_model",
"messages": [...],
"response_format": {
"type": "json",
"schema": {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"}
},
"required": ["name", "age"]
}
}
}

This would provide a clean, intuitive API that directly maps to the GuidedDecodingParams functionality. It would allow users to get a pure, schema-compliant JSON response without the overhead of wrapper tags, extra prompt engineering, or post-processing.
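A minimal sketch of how the mapping in openai_protocol.py could be extended to support the proposed type (the function name, the dict-based response_format, and the dataclass below are illustrative stand-ins, not the actual TensorRT-LLM code):

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class GuidedDecodingParamsSketch:
    # Simplified stand-in for tensorrt_llm.sampling_params.GuidedDecodingParams.
    json: Optional[Union[str, dict]] = None
    json_object: bool = False
    structural_tag: Optional[str] = None


def response_format_to_params(
        response_format: Optional[dict]) -> Optional[GuidedDecodingParamsSketch]:
    """Hypothetical extension: map a 'json' response_format to schema-guided decoding."""
    if response_format is None or response_format.get("type") == "text":
        return None
    rf_type = response_format["type"]
    if rf_type == "json_object":
        return GuidedDecodingParamsSketch(json_object=True)
    if rf_type == "json":
        # New branch: pass the user-supplied schema straight through to the
        # backend's existing json field -- no wrapper tags, no post-processing.
        return GuidedDecodingParamsSketch(json=response_format["schema"])
    raise ValueError(f"Unsupported response format: {rf_type}")
```

With this mapping, the "Ideal API Request" above would resolve to params whose json field holds the schema, letting the backend enforce it directly.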
Thank you for considering this improvement.