Inconsistency between Responses API and Chat Completions API on rate limit errors #2699

@aloumakos

Description

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

I noticed that rate limit errors surface differently in the two APIs when streaming. With Chat Completions, the error is raised on the initial call, before the stream is read, so the client's built-in retry logic handles it. With the Responses API, the initial call to responses.create(...) succeeds and returns a stream object; an APIError indicating a rate limit is raised only once the stream is read. As a result, the client's retry logic is skipped entirely and users have to implement their own. It is also unclear whether an error can be raised mid-stream, which complicates any fix.

After inspecting the client's code, I initially thought that this was not a library issue but a mismatch with the service hosting the model. I contacted Microsoft support, since they host our models, but they insisted I raise an issue here.
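
For reference, the kind of retry logic a user currently has to write could look roughly like the sketch below. It is only an illustration, not the library's behavior or a proposed fix: `stream_with_retry`, `make_stream`, `retry_on` and the backoff values are hypothetical names and defaults chosen here, and the helper depends only on the standard library. It retries with exponential backoff when the error occurs before any event has been received, and re-raises when the error occurs mid-stream, because by then the caller has already consumed partial output.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, Type, TypeVar

T = TypeVar("T")


def stream_with_retry(
    make_stream: Callable[[], Iterable[T]],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield events from make_stream(), retrying errors raised before the first event.

    Hypothetical helper: make_stream must create a fresh stream on every call.
    """
    attempt = 0
    while True:
        yielded_any = False
        try:
            for event in make_stream():
                yielded_any = True
                yield event
            return  # stream finished normally
        except retry_on:
            # Mid-stream failures are not retried: the caller has already
            # seen part of the output. Also give up once retries run out.
            if yielded_any or attempt >= max_retries:
                raise
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
            attempt += 1
```

With the snippet from this issue, this would presumably be used as `for event in stream_with_retry(lambda: client.responses.create(..., stream=True), retry_on=(openai.APIError,))`. Catching `openai.APIError` is an assumption based on the error type I observed; it is broad, so a real implementation would likely want to inspect the error and retry only on rate limits.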

To Reproduce

To reproduce this, you need an Azure-hosted OpenAI model (I have tried this with gpt-4o and gpt-5). Lower your token rate limit to a low enough value and run the snippet below.

Code snippets

from openai import OpenAI

client = OpenAI(
    base_url="your-endpoint-v1",
    api_key="your-api-key",
)

# With stream=True, this call returns a stream object without raising.
response = client.responses.create(
    model="your-model",
    store=False,
    stream=True,
    input="a long enough prompt" * 10000,  # large enough to hit the rate limit
)

for event in response:
    pass  # an APIError is raised here, while iterating the stream

OS

Linux, Windows

Python version

Python >=3.11.9

Library version

openai v1.109.1
