Pydantic conversion logic for structured outputs is broken for models containing dictionaries #2004
Comments
Tagging @RobertCraigie for visibility, just in case (saw that you've been active on recent issues) :)
I'm having the same issue and can confirm that models with dictionaries are the root problem. But I checked the documentation again, and it does say that only additionalProperties=false is allowed.
@RobertCraigie Any updates or additional thoughts here?
I have also encountered the same issue. After some tinkering, I found more types that result in errors. The only built-in collection type that doesn't seem to be affected is the
My code (Python 3.13.1):
import json

from pydantic import BaseModel
from openai.lib._pydantic import to_strict_json_schema
from openai import OpenAI


class Schema(BaseModel):
    # Python built-in collections
    # `range` and `bytearray` are not supported types, so I didn't include them
    # tuple_field: tuple[int, int, int]
    list_field: list[int]
    # dict_field: dict[int, int]
    # set_field: set[int]
    # frozenset_field: frozenset[int]
    # bytes_field: bytes


# Show the strict JSON schema the client derives from the model
print(json.dumps(to_strict_json_schema(Schema), indent=4))

api_key = ...
with OpenAI(api_key=api_key) as client:
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Fill the schema with random values"}],
        response_format=Schema,
    )

print(json.dumps(response.choices[0].message.model_dump()["content"], indent=4))
@RobertCraigie Any updates or additional thoughts here?
Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
There's a bug in the OpenAI Python client's logic for translating Pydantic models with dictionaries into structured-output JSON schema definitions: dictionaries are always required to be empty in the resulting JSON schema. This makes dictionary outputs significantly less useful, since the LLM is never allowed to populate them.
I've filed a small PR to fix this and introduce test coverage: #2003
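To make the description above concrete, here is a minimal, standard-library-only sketch of why `additionalProperties: False` on an object with no declared properties leaves only the empty dictionary as a valid value. `object_accepts` is a hypothetical helper written for this illustration; it is not part of the OpenAI client or of any JSON Schema library, and it only looks at `properties` and `additionalProperties`. The "open" schema is the shape I'd expect Pydantic to emit for a `dict[str, int]` field (minus the title).

```python
def object_accepts(schema: dict, value: dict) -> bool:
    """Check `value` against the `properties`/`additionalProperties` part of an object schema.

    Hypothetical helper for illustration only; every other JSON Schema keyword is ignored.
    """
    declared = schema.get("properties", {})
    extra = schema.get("additionalProperties", True)
    for key, item in value.items():
        if key in declared:
            continue  # declared keys are not checked any further in this sketch
        if extra is False:
            return False  # undeclared keys are forbidden outright
        if isinstance(extra, dict) and extra.get("type") == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly
            if not isinstance(item, int) or isinstance(item, bool):
                return False
    return True


# Assumed shape for a `dict[str, int]` field: any key, integer values.
open_dict = {"type": "object", "additionalProperties": {"type": "integer"}}

# Shape reported in this issue: no declared properties, no extra keys allowed.
closed_dict = {"type": "object", "additionalProperties": False}

sample = {"apples": 3, "pears": 5}
print(object_accepts(open_dict, sample))    # True
print(object_accepts(closed_dict, sample))  # False
print(object_accepts(closed_dict, {}))      # True
```

With no `properties` declared, every key in a dictionary counts as an "additional" one, so the closed schema rejects any non-empty value.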
To Reproduce
Observe that the output inserts
additionalProperties: False
into the resulting JSON schema definition, meaning that the dictionary must always be empty.
Code snippets
No response
OS
macOS
Python version
Python v3.10.12
Library version
1.59.6