server: add OpenAI-compatible /v1/responses endpoint by krystophny · Pull Request #214 · waybarrios/vllm-mlx

krystophny · 2026-03-24T12:15:50Z

Summary

Add an OpenAI-compatible /v1/responses endpoint for local coding-agent workflows.

Scope

text messages, function tools, function call outputs
streaming and non-streaming Responses output
previous_response_id replay for persisted replayable input items
developer/instructions normalization onto one leading system prompt
request-level chat_template_kwargs forwarding
LRU-bounded response store (max 1000 entries, oldest evicted)
reasoning input items converted to assistant messages for model context
reasoning configuration gracefully ignored (not supported, no crash)

What changed

server.py: full /v1/responses endpoint with streaming SSE
api/responses_models.py: Pydantic models for Responses API
_responses_store capped with OrderedDict LRU eviction (max 1000)
ResponseReasoningItem input converted to assistant messages (Codex sends these in multi-turn)
Reasoning config (request.reasoning) logged and ignored instead of crashing mid-stream

Files

vllm_mlx/server.py
vllm_mlx/api/responses_models.py
tests/test_responses_api.py

Validation

$ python -m pytest tests/test_responses_api.py -v
33 passed

…onse_object Replace unbounded dict with OrderedDict (max 1000 entries) to prevent memory leaks from accumulated stored responses. Evict oldest entries on insert when the cap is exceeded. Remove the _stream_response_object function (190 lines) which was never called anywhere in the codebase.

… API Codex sends ResponseReasoningItem in the Responses API input array during multi-turn conversations. Convert reasoning content to assistant messages so the model sees its prior chain-of-thought. Previously this raised an HTTPException, but since the streaming response had already started, this caused a RuntimeError that broke the SSE stream mid-flight.

The reasoning input item rejection (HTTPException 400) was added in PR #28 but conflicts with the earlier fix that converts reasoning items to assistant messages in _responses_input_to_chat_messages. The rejection ran first, crashing the SSE stream mid-flight. Also downgrade reasoning config rejection to a debug log since raising inside the streaming generator causes "response already started" crashes.

krystophny mentioned this pull request Mar 24, 2026

responses: normalize developer and instructions for Codex #219

Closed

krystophny changed the title ~~Add OpenAI Responses API core~~ server: add OpenAI-compatible /v1/responses endpoint Mar 24, 2026

krystophny force-pushed the feature/openai-responses-api branch from c7f7364 to ad483cc Compare March 24, 2026 12:26

krystophny changed the title ~~server: add OpenAI-compatible /v1/responses endpoint~~ server: add non-streaming OpenAI-compatible /v1/responses endpoint Mar 24, 2026

krystophny changed the title ~~server: add non-streaming OpenAI-compatible /v1/responses endpoint~~ server: add OpenAI-compatible /v1/responses endpoint Mar 24, 2026

This was referenced Mar 25, 2026

[Tracking] Upstream backlog and merge plan computor-org/vllm-mlx#12

Open

server: close out the upstream /v1/responses merge plan computor-org/vllm-mlx#21

Closed

server: add OpenAI-compatible /v1/responses endpoint

05838da

krystophny force-pushed the feature/openai-responses-api branch from df4f9af to 05838da Compare March 25, 2026 22:52

krystophny added 3 commits March 26, 2026 01:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

server: add OpenAI-compatible /v1/responses endpoint#214

server: add OpenAI-compatible /v1/responses endpoint#214
krystophny wants to merge 4 commits intowaybarrios:mainfrom
computor-org:feature/openai-responses-api

krystophny commented Mar 24, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

krystophny commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Scope

What changed

Files

Validation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

krystophny commented Mar 24, 2026 •

edited

Loading