
Conversation


@Lagyu Lagyu commented Aug 23, 2025

Related GitHub Issue
Closes: #6862

Roo Code Task Context (Optional)
N/A

Description
This PR adds full Responses API support to the OpenAI Compatible provider while preserving all existing Chat Completions behavior. It addresses the “400 Unsupported parameter: 'messages'” error when users target the Responses endpoints (Azure and OpenAI) by building the correct payload and selecting the appropriate endpoint automatically.

Note: a manual toggle for the API style (auto, or an override to Responses or Chat Completions) in settings was discussed in the original issue #6862, and I implemented it at one point, but I do not think that feature is required, so I removed it to keep the settings UI clean.

Key technical points (illustrative sketches follow this list):

  • API flavor selection (Auto/Responses/Chat)
    • Auto-detect from base URL path:
      • Path includes or ends with “/responses” → Responses API
      • Path includes “/chat/completions” → Chat Completions
  • Responses API payload builder
    • Convert input to a single string transcript ("Developer: …" / "User: …" prefixes) per the OpenAI Responses format (matches the existing OpenAI Native handler)
    • Azure naming: send max_output_tokens for Responses flavor when includeMaxTokens is enabled
  • Azure Responses (v1 preview)
    • For Azure base URLs pointing to Responses, normalize SDK configuration to:
      • baseURL: https://{resource}.openai.azure.com/openai/v1
      • apiVersion: preview (Azure only accepts this for the Responses API flavor, as far as I tested)
    • Portal-style URL is automatically converted into the documented v1 style:
      • Portal: https://{res}.openai.azure.com/openai/responses?api-version=2025-04-01-preview
        • I do not know why, but Azure Portal provides this style.
      • v1 style: https://{res}.openai.azure.com/openai/v1/responses?api-version=preview
    • Model parameter is the Azure deployment name; stream usage preserved where applicable
  • Backwards compatibility
    • Chat Completions unchanged and remains the default for non-Responses URLs
    • Existing settings (rate limit, temperature, includeMaxTokens, etc.) remain intact
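
For illustration, the auto-detection amounts to simple path matching on the configured base URL. A minimal sketch of the _resolveApiFlavor logic as a standalone function (the actual method in openai.ts may differ in detail):

```typescript
// Sketch: pick the API flavor from the base URL path.
type ApiFlavor = "responses" | "chat"

function resolveApiFlavor(baseUrl: string): ApiFlavor {
	let path: string
	try {
		path = new URL(baseUrl).pathname
	} catch {
		path = baseUrl // fall back to raw string matching if the URL does not parse
	}
	if (path.includes("/responses")) return "responses"
	if (path.includes("/chat/completions")) return "chat"
	return "chat" // default: Chat Completions, preserving existing behavior
}
```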
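
The string-transcript conversion follows the same idea as the OpenAI Native handler: flatten the system prompt and message history into one labeled string. A rough sketch with simplified message types (the real code converts Anthropic-style message params; the Assistant: prefix here is an assumption for completeness):

```typescript
// Sketch: build the single-string Responses input with role prefixes.
interface SimpleMessage {
	role: "user" | "assistant"
	text: string
}

function buildResponsesTranscript(systemPrompt: string, messages: SimpleMessage[]): string {
	const lines = [`Developer: ${systemPrompt}`]
	for (const m of messages) {
		lines.push(m.role === "user" ? `User: ${m.text}` : `Assistant: ${m.text}`)
	}
	return lines.join("\n\n")
}
```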
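
The Azure URL normalization can likewise be pictured as a small rewrite step. A sketch of the _normalizeAzureResponsesBaseUrlAndVersion idea, under the assumption that only the documented v1 form with api-version=preview is accepted:

```typescript
// Sketch: rewrite a portal-style Azure Responses URL to the documented v1 style.
function normalizeAzureResponsesUrl(rawUrl: string): { baseURL: string; apiVersion: string } {
	const url = new URL(rawUrl)
	// Portal: https://{res}.openai.azure.com/openai/responses?api-version=2025-04-01-preview
	// v1:     https://{res}.openai.azure.com/openai/v1/responses?api-version=preview
	if (url.pathname.startsWith("/openai/") && !url.pathname.startsWith("/openai/v1/")) {
		url.pathname = url.pathname.replace("/openai/", "/openai/v1/")
	}
	// Point the SDK at .../openai/v1; Azure currently only accepts "preview" here.
	return { baseURL: `${url.origin}/openai/v1`, apiVersion: "preview" }
}
```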

Files Modified

Test Procedure
Unit Tests (Vitest)

  • From workspace root, run:
    • cd src; npx vitest run api/providers/__tests__/openai.spec.ts
  • Coverage:
    • Auto-detect flavor
    • Responses payload shape (string input transcript)
    • Reasoning effort mapping (Responses vs Chat)
    • Verbosity handling (include text.verbosity; retry without it on a 400 unknown-parameter error)
    • Azure Responses normalization for both portal and v1 URLs
    • Azure naming (max_output_tokens for Responses; max_completion_tokens for Chat)

Pre-Submission Checklist

Screenshots / Videos
N/A (provider change; unit tests cover logic)

Documentation Updates
N/A

Additional Notes

  • Using the Responses API flavor for GPT-5 is preferable because it enables reuse of internal reasoning context and features like minimal reasoning effort
  • Chat Completions paths remain untouched to serve existing deployments seamlessly

Get in Touch
Discord: lagyu


Important

Adds Responses API support to OpenAI Compatible Provider with automatic endpoint selection and Azure-specific configurations.

  • Behavior:
    • Adds support for Responses API in OpenAiHandler, auto-selecting endpoint based on URL path.
    • Formats payload for Responses API, converting input to string transcript.
    • Handles Azure-specific configurations, normalizing URLs and setting apiVersion to preview.
  • Functions:
    • _resolveApiFlavor(), _formatResponsesInput(), _normalizeAzureResponsesBaseUrlAndVersion() added to openai.ts for API selection and payload formatting.
    • _yieldResponsesResult() and _extractResponsesText() handle Responses API output.
  • Tests:
    • Updated openai.spec.ts to test Responses API support, including URL normalization and payload verification.
    • Tests for auto-detection of API flavor and Azure-specific behavior.

This description was created by Ellipsis for f05544b.

@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the changes and this is a solid implementation of Responses API support. The auto-detection logic is clever and the backward compatibility is well maintained. I have some suggestions for improvement, particularly around streaming support and type safety.

payload.max_output_tokens = this.options.modelMaxTokens || modelInfo.maxTokens
}

// NOTE: Streaming for Responses API isn't covered by current tests.

roomote bot:

The comment indicates streaming isn't covered by tests and defaults to non-streaming. Since streaming is a critical feature for user experience, could we consider implementing streaming support for the Responses API? If not feasible now, should we at least document this limitation more prominently?

@Lagyu (Author):

I implemented the streaming based on the openai-native provider code.
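
For context, consuming a Responses stream with the openai SDK looks roughly like this (event names are from the Responses streaming protocol; the handler shape is simplified and not the exact PR code):

```typescript
// Sketch: stream text deltas and final usage from the Responses API.
import OpenAI from "openai"

async function* streamResponsesText(client: OpenAI, model: string, input: string) {
	const stream = await client.responses.create({ model, input, stream: true })
	for await (const event of stream) {
		if (event.type === "response.output_text.delta") {
			yield { type: "text" as const, text: event.delta }
		} else if (event.type === "response.completed") {
			// Token usage arrives with the final event.
			const usage = event.response.usage
			yield {
				type: "usage" as const,
				inputTokens: usage?.input_tokens ?? 0,
				outputTokens: usage?.output_tokens ?? 0,
			}
		}
	}
}
```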

// NOTE: Streaming for Responses API isn't covered by current tests.
// We call non-streaming for now to preserve stable behavior.
try {
const response: any = await (this.client as any).responses.create(payload)

roomote bot:

Is it intentional that we're casting to any here? The OpenAI client might not have a responses property. Should we add validation or a more graceful fallback?

@Lagyu (Author):

I removed the casts to any as much as I could.

}
}

private _toResponsesInput(anthropicMessages: Anthropic.Messages.MessageParam[]): Array<{

roomote bot:

This helper method _toResponsesInput appears to be unused. Is this intentional or leftover from development? If it's for future use, could we add a comment explaining its purpose?

@Lagyu (Author):

I fixed the code to use it, enabling image input with the Responses API.
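
For reference, the structured Responses input for multimodal content looks roughly like this (shape per the Responses API; values are illustrative):

```typescript
// Sketch: structured input items used instead of the single-string transcript
// when the request contains images.
const input = [
	{
		role: "user" as const,
		content: [
			{ type: "input_text" as const, text: "What is in this screenshot?" },
			{
				type: "input_image" as const,
				image_url: "data:image/png;base64,iVBORw0...", // truncated for brevity
				detail: "auto" as const,
			},
		],
	},
]
```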

this._isAzureOpenAiResponses(baseURL)

// Always use 'preview' for Azure Responses API calls (per user requirement)
const azureVersion = isResponsesFlavor

roomote bot:

The API version is hardcoded to "preview" for Azure Responses. Could this break when Azure releases a stable version? Should we make this configurable or at least add a comment explaining why "preview" is always used?

@Lagyu (Author):

As of now, only the preview version is available, so there is nothing I can do about that.


// Verbosity: include via text.verbosity (Responses API expectation per openai-native handler)
if (this.options.verbosity || verbosity) {
;(payload as any).text = { verbosity: this.options.verbosity || verbosity }

roomote bot:

Multiple uses of as any bypass TypeScript's type checking. Could we define proper types for the Responses API to improve type safety?

@Lagyu (Author):

I removed the casts to any as much as I could.


// --- Responses helpers ---

private _resolveApiFlavor(baseUrl: string): "responses" | "chat" {

roomote bot:

The auto-detection logic is clever but not immediately obvious. Could we add JSDoc comments explaining the URL pattern matching?


// -- Added Responses API tests (TDD) --

describe("OpenAI Compatible - Responses API", () => {

roomote bot:

Great test coverage for the Responses API! Consider adding a test for streaming support once it's implemented. Also, it might be helpful to test the error case when the OpenAI client doesn't support the Responses API.

@Lagyu (Author):

Added tests for streaming support.
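
For example, a streaming test can stub responses.create with an async generator of Responses events. A self-contained Vitest sketch (not the literal test from openai.spec.ts):

```typescript
// Sketch: stub a Responses event stream and assert the collected text.
import { expect, it, vi } from "vitest"

it("streams text deltas from a stubbed Responses API", async () => {
	async function* fakeStream() {
		yield { type: "response.output_text.delta", delta: "Hel" }
		yield { type: "response.output_text.delta", delta: "lo" }
		yield { type: "response.completed", response: { usage: { input_tokens: 3, output_tokens: 2 } } }
	}
	const create = vi.fn().mockResolvedValue(fakeStream())

	// Consume the stream the way the handler does and collect the text.
	const stream = await create({ model: "gpt-5", input: "hi", stream: true })
	let text = ""
	for await (const event of stream) {
		if (event.type === "response.output_text.delta") text += event.delta
	}

	expect(text).toBe("Hello")
	expect(create).toHaveBeenCalledWith(expect.objectContaining({ stream: true }))
})
```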

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Aug 23, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Aug 23, 2025
@hannesrudolph hannesrudolph added PR - Needs Preliminary Review and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Aug 23, 2025
@Lagyu Lagyu commented Aug 25, 2025

I found that my implementation does not work with requests that include images.
I am working on a fix for that, along with the review comments from roomote.

… handling; tests

- Add previous_response_id retry path on 400 “Previous response … not found”
  - Non-streaming and streaming: drop previous_response_id and retry once; clear continuity state
  - Code: src/api/providers/openai.ts:238, src/api/providers/openai.ts:291, guard OpenAiHandler._isPreviousResponseNotFoundError() (src/api/providers/openai.ts:934)

- Support GPT‑5-style reasoning summary and minimal effort on Responses API
  - Default enable summary: "auto" unless explicitly disabled in settings
  - Include reasoning: { effort: "minimal" | "low" | "medium" | "high", summary?: "auto" }
  - Code: constructor default OpenAiHandler (src/api/providers/openai.ts:38), payload assembly createMessage (src/api/providers/openai.ts:193)

- Improve Responses streaming event coverage
  - Handle response.content_part.added (emit text)
  - Handle response.audio_transcript.delta (emit text as transcript)
  - Preserve response.id via stream callback for continuity
  - Code: handleResponsesStream (src/api/transform/responses-stream.ts:91), src/api/transform/responses-stream.ts:47, responseId callback (src/api/transform/responses-stream.ts:19), and usage in src/api/providers/openai.ts:283

- Maintain conversation continuity for Responses API
  - Store lastResponseId on both streaming and non-streaming paths; pass previous_response_id unless suppressed
  - Code: stream wiring src/api/providers/openai.ts:283, non-streaming capture src/api/providers/openai.ts:889

- Update and extend tests
  - Add tests for 400 previous_response_id retry (streaming and non-streaming)
  - Add tests for content_part and audio_transcript events
  - Add tests for reasoning minimal + summary auto, and summary disabling
  - Adjust expectation to allow summary in reasoning payload
  - Tests: src/api/providers/__tests__/openai.spec.ts:1663, src/api/providers/__tests__/openai.spec.ts:1170

- Minor: default enableGpt5ReasoningSummary to true in compatible provider for Responses flows
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Aug 25, 2025
@Lagyu Lagyu commented Aug 25, 2025

Now I have fixed the problems.
I manually tested on Azure OpenAI GPT-5 + the Responses API and it works fine.

I think we need to refactor to unify the implementations of the OpenAI Native provider and the OpenAI Compatible provider, but that should be handled in a separate request; for now I mostly duplicated the code from the OpenAI Native provider.

src/package.json Outdated
"node-ipc": "^12.0.0",
"ollama": "^0.5.17",
"openai": "^5.0.0",
"openai": "^5.15.0",

@Lagyu (Author):

Updated to "openai": "^5.15.0" because the verbosity parameter was added for Responses API requests.

@Lagyu Lagyu force-pushed the feature/openai-compatible-responses-api branch from c2aebf9 to cd51254 on August 27, 2025 03:13
@daniel-lxs (Member):

Thanks for the implementation! The Responses API support is working well, but before merging I think there are a few areas that could use some cleanup.

One thing I noticed is the error retry logic – the checks for _isPreviousResponseNotFoundError, _isVerbosityUnsupportedError, and _isInputTextInvalidError are repeated in multiple places (both in streaming and non-streaming paths). That duplication makes the code harder to maintain and could introduce bugs down the line.

Another area is type safety. There’s a lot of as unknown as casting, especially around _toResponsesInput and the client responses object. It feels like the type definitions could be tightened up so those casts aren’t necessary.

Lastly, createMessage has gotten pretty large (around 474 lines). It might make sense to pull the Responses API logic into separate methods to keep things easier to follow.

The functionality itself looks solid, I just think a bit of refactoring here would make the implementation easier to maintain in the long run.

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Aug 27, 2025
Lagyu added 5 commits August 29, 2025 18:03
…reateWithRetries; dedupe checks for previous_response_id, verbosity, and Azure input_text invalid in streaming and non-streaming paths
…atible-responses-api

# Conflicts:
#	pnpm-lock.yaml
#	src/api/providers/openai.ts
#	src/package.json
…gate from createMessage

- Move Responses API logic to private _handleResponsesFlavor
- Preserve streaming, retries, conversation continuity, reasoning/verbosity, and usage
- All existing tests pass
@Lagyu Lagyu commented Sep 5, 2025

@daniel-lxs Thank you for your feedback, and sorry for the late reply.
I’ve addressed all three areas:

  • De-duplicated error/retry logic. Centralized the Responses error handling into a single _responsesCreateWithRetries path and removed the repeated checks for previous_response_id not found, verbosity unsupported, and Azure input_text invalid across both streaming and non-streaming flows. (A rough sketch follows this list.)

  • Improved type safety. Reduced unnecessary as unknown as casts (especially around _toResponsesInput) and tightened the types where we interact with the client.responses object.

  • Split out createMessage. Extracted the Responses-specific logic into a private helper (_handleResponsesFlavor), so createMessage now delegates and is easier to follow. All existing tests pass.
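
A rough sketch of that shape (predicates and payload edits simplified; names follow the PR, but this is illustrative, not the exact implementation):

```typescript
// Sketch: centralized Responses create-with-retries used by both the
// streaming and non-streaming paths.
type ResponsesPayload = Record<string, unknown>

async function responsesCreateWithRetries<T>(
	create: (payload: ResponsesPayload) => Promise<T>,
	payload: ResponsesPayload,
): Promise<T> {
	try {
		return await create(payload)
	} catch (err) {
		if (isPreviousResponseNotFound(err)) {
			// Drop continuity state and retry once without previous_response_id.
			const { previous_response_id: _dropped, ...rest } = payload
			return await create(rest)
		}
		if (isVerbosityUnsupported(err)) {
			// Retry once without text.verbosity for servers that reject it.
			const { text: _text, ...rest } = payload
			return await create(rest)
		}
		throw err
	}
}

// Hypothetical predicates: inspect a 400 error message for the known cases.
function isPreviousResponseNotFound(err: unknown): boolean {
	return err instanceof Error && /previous response .* not found/i.test(err.message)
}
function isVerbosityUnsupported(err: unknown): boolean {
	return err instanceof Error && /unknown parameter.*verbosity/i.test(err.message)
}
```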

Please take another look and let me know if you want any further changes.

Lagyu and others added 4 commits September 5, 2025 18:28
…nuity (previous_response_id/store), temp/verbosity gating, and image support (input_image/output_text)
…nscript for text-only, array for multimodal; retry-on-verbosity; continuity handling
@Lagyu Lagyu commented Sep 5, 2025

I found that the conversation history in the session was lost after retrying, so I integrated the upstream (openai-native.ts) changes from #7067.
Retry behavior is better now, and image handling should also be improved.

@Tamrac-web Tamrac-web commented Sep 16, 2025

My friend, you are the true hero.

@thomasmhofmann

Is there any chance to get this merged soonish? I have been using my own build based on the PR and it works fine for me.

@daniel-lxs daniel-lxs (Member) commented Sep 17, 2025

I'm not too sure about this PR. It would be best to have two clear paths in the code for when the Responses API is enabled or disabled. Maybe we can base it on the OpenAI Native provider, which was migrated to the Responses API recently.

The idea would be to have a checkbox to enable the Responses API in the provider, in which case the path with the Response API logic would be used.

I'm closing this PR for now. I'll try to scope the issue. Feel free to continue the discussion.

@daniel-lxs daniel-lxs closed this Sep 17, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 17, 2025
@github-project-automation github-project-automation bot moved this from PR [Changes Requested] to Done in Roo Code Roadmap Sep 17, 2025
@Lagyu Lagyu commented Sep 18, 2025

Thank you for the review and discussion.

I also felt this implementation became more complex than necessary.

I think we should:

  • Unify shared logic with openai-native.ts, so that common pieces can live in one place.
  • Extract a provider-agnostic Responses API layer (shared with openai-native.ts), to clearly separate Responses API-specific logic from provider wiring.

This should reduce duplication and make the codebase more maintainable in the future.

Note: for those who urgently need the Responses API with Azure OpenAI now, you can clone, install, and build this branch with the pnpm vsix command.


Labels

enhancement (New feature or request) · PR - Changes Requested · size:XXL (This PR changes 1000+ lines, ignoring generated files)
