Managed OpenAI Inference + Native Web Search by noanflaherty · Pull Request #26251 · vellum-ai/vellum-assistant

noanflaherty · 2026-04-17T22:54:07Z

Summary

Enable OpenAI as a first-class managed inference provider and route inference-provider-native web search through OpenAI's native Responses web search tool. Remove Anthropic-only assumptions in macOS settings so users can choose OpenAI in managed mode.

Self-review result

PASS after 3 rounds of review+remediation (4 fix PRs addressing integration gaps)

PRs merged into feature branch

#26211: [managed] Enable OpenAI managed-proxy fallback routing in provider bootstrap
#26212: [openai] Route inference-provider-native web search to Responses native web_search tool
#26218: [macOS] Allow managed inference provider selection beyond Anthropic
#26230: [macOS] Allow provider-native web search when managed inference provider supports it

Fix PRs

#26240: fix: emit server_tool_start/complete events for OpenAI native web search
#26243: fix: render web_search_call output items in LLM context diagnostics
#26246: fix: add server_tool_use content blocks and tests for OpenAI native web search
#26250: fix: diagnostics polish — web_search_call tests, tool count fix, loop consistency

Part of plan: managed-openai-native-web-search.md

…26218)
* [macOS] Allow managed inference provider selection beyond Anthropic
* fix: capture draftProvider before async Task to prevent race condition

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…der supports it (#26230)
* [macOS] Allow provider-native web search when managed inference provider supports it
* fix: gate mode-specific web search invalidation on modeChanging to prevent false-positive alerts
* fix: generalize alert message to cover both mode and provider changes
* fix: scope provider-native invalidation to your-own web search mode

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


noanflaherty · 2026-04-17T23:04:43Z

Review (Worktree + Targeted Validation)

Pulled this PR into a dedicated worktree and reviewed all touched files end-to-end (backend + macOS settings + tests).

Findings

1) [P1] Persisting unpaired `server_tool_use` from OpenAI native search can inject synthetic failures after provider switch

In assistant/src/providers/openai/responses-provider.ts (content assembly around lines 301-312), we now persist server_tool_use blocks for OpenAI web search calls.
Those blocks are persisted without a paired web_search_tool_result.
Anthropic’s request repair path explicitly treats this as orphaned tool state and injects synthetic web_search_tool_result_error entries (assistant/src/providers/anthropic/client.ts around lines 346-399).

Why this matters:
A conversation that used OpenAI native web search can be silently rewritten to include synthetic “search unavailable” failures when the user later switches provider to Anthropic in the same thread.
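To make the failure mode concrete, here is a minimal sketch of a repair pass in the spirit of the one described above. It is not the code in assistant/src/providers/anthropic/client.ts: the `Block` shapes, the `repairOrphanedSearches` name, and the `"unavailable"` error code are simplified assumptions for illustration.

```typescript
// Hypothetical, simplified content-block shapes; the real types may differ.
type SearchError = { type: "web_search_tool_result_error"; error_code: string };
type Block =
  | { type: "text"; text: string }
  | { type: "server_tool_use"; id: string; name: string }
  | { type: "web_search_tool_result"; tool_use_id: string; content: unknown[] | SearchError };

// Assumed repair behavior: any server_tool_use without a matching result
// is treated as an interrupted search and gets a synthetic error result.
function repairOrphanedSearches(blocks: Block[]): Block[] {
  const answered = new Set<string>();
  for (const b of blocks) {
    if (b.type === "web_search_tool_result") answered.add(b.tool_use_id);
  }
  const repaired: Block[] = [];
  for (const b of blocks) {
    repaired.push(b);
    if (b.type === "server_tool_use" && !answered.has(b.id)) {
      repaired.push({
        type: "web_search_tool_result",
        tool_use_id: b.id,
        content: { type: "web_search_tool_result_error", error_code: "unavailable" },
      });
    }
  }
  return repaired;
}

// History as persisted after an OpenAI native search: no paired result block.
const persistedByOpenAI: Block[] = [
  { type: "server_tool_use", id: "ws_1", name: "web_search" },
  { type: "text", text: "Answer with search results woven in." },
];

// After a switch to Anthropic, the successful search now looks like a failure.
const sentToAnthropic = repairOrphanedSearches(persistedByOpenAI);
console.log(sentToAnthropic.length); // 3: one synthetic error block was added
```

Under these assumptions, the search itself succeeded, yet the replayed history carries an error entry for it.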

Suggested fix:

Either avoid persisting OpenAI server_tool_use blocks in history (keep event-stream + raw diagnostics only),
or persist a paired result block so that cross-provider handoff does not synthesize failure artifacts.

2) [P2] `server_tool_complete` is always emitted as success for OpenAI web search

In assistant/src/providers/openai/responses-provider.ts (lines ~285-291), server_tool_complete is emitted with isError: false for every tracked web-search call on response.completed.
This happens regardless of response/call status.

Why this matters:
Tool telemetry/UI can report successful completion even when the search call is interrupted/failed/incomplete.

Suggested fix:

Derive completion state from web-search call status (stream status events or final response output item status),
Emit isError: true when status is non-success,
Add a regression test for non-completed web search status.
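The suggested fix could be sketched roughly as follows. The `TrackedSearchCall` and `ServerToolCompleteEvent` shapes and the `toolUseId` field are hypothetical; only `server_tool_complete` and `isError` come from the finding. Treating every status other than `"completed"` as an error is an assumption, not confirmed provider behavior.

```typescript
// Hypothetical shape for a web search call tracked during streaming.
interface TrackedSearchCall {
  id: string;
  status: string; // e.g. "in_progress", "searching", "completed", "failed"
}

// Hypothetical event shape; only the type name and isError are from the review.
interface ServerToolCompleteEvent {
  type: "server_tool_complete";
  toolUseId: string;
  isError: boolean;
}

// Derive the completion flag from the call's own status instead of
// hard-coding isError: false when the response completes.
function toCompleteEvent(call: TrackedSearchCall): ServerToolCompleteEvent {
  return {
    type: "server_tool_complete",
    toolUseId: call.id,
    isError: call.status !== "completed",
  };
}

const events = [
  { id: "ws_1", status: "completed" },
  { id: "ws_2", status: "failed" },
  { id: "ws_3", status: "in_progress" },
].map(toCompleteEvent);

console.log(events.map((e) => e.isError)); // [ false, true, true ]
```

A regression test for the non-completed case can then assert that a `failed` or still-`in_progress` call yields `isError: true`.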

If helpful, I can also add a concrete regression test scenario for the OpenAI->Anthropic provider-switch path described in Finding #1.

…ry corruption

The OpenAI Responses provider emitted server_tool_use content blocks for web_search_call items but did not emit matching web_search_tool_result blocks. repairHistory() treats any unpaired server_tool_use as an interrupted search and injects a synthetic web_search_tool_result_error, which corrupts conversation history by making successful searches appear as failures.

After each server_tool_use block, also emit a paired web_search_tool_result with empty content (since OpenAI weaves search results into the text output).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
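A minimal sketch of the pairing described in this commit message, assuming simplified block shapes. `WebSearchCallItem`, `blocksForWebSearchCall`, and `findUnpaired` are hypothetical names, not the provider's actual code.

```typescript
// Hypothetical, simplified shapes for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "server_tool_use"; id: string; name: string }
  | { type: "web_search_tool_result"; tool_use_id: string; content: unknown[] };

interface WebSearchCallItem {
  type: "web_search_call";
  id: string;
}

// Emit the tool-use block and, immediately after it, a paired result with
// empty content, since the search results are woven into the text output.
function blocksForWebSearchCall(item: WebSearchCallItem): Block[] {
  return [
    { type: "server_tool_use", id: item.id, name: "web_search" },
    { type: "web_search_tool_result", tool_use_id: item.id, content: [] },
  ];
}

// Helper for tests: ids of server_tool_use blocks that have no result.
function findUnpaired(blocks: Block[]): string[] {
  const answered = new Set<string>();
  for (const b of blocks) {
    if (b.type === "web_search_tool_result") answered.add(b.tool_use_id);
  }
  const unpaired: string[] = [];
  for (const b of blocks) {
    if (b.type === "server_tool_use" && !answered.has(b.id)) unpaired.push(b.id);
  }
  return unpaired;
}

const history: Block[] = [
  ...blocksForWebSearchCall({ type: "web_search_call", id: "ws_1" }),
  { type: "text", text: "Answer with search results woven in." },
];

console.log(findUnpaired(history)); // []
```

With every tool-use block paired at write time, a later repair pass has nothing to treat as an interrupted search.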

devin-ai-integration

Devin Review found 1 new potential issue.


devin-ai-integration · 2026-04-17T23:16:57Z

+    managed: true,
+    proxyPath: "/v1/runtime-proxy/openai",


🚩 OpenAI managed proxy requires companion platform-side route support

The new proxy path /v1/runtime-proxy/openai at assistant/src/providers/managed-proxy/constants.ts:32 requires the companion vellum-assistant-platform repo to have a corresponding proxy route that forwards OpenAI Responses API requests. The AGENTS.md mentions checking the sibling repo for compatibility when making HTTP API or container changes. The OpenAI SDK will construct URLs like {baseURL}/responses for the Responses API stream endpoint, so the platform proxy needs to handle this path correctly.
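As a rough illustration of the path the platform proxy would need to serve, the sketch below joins a base URL with the `/responses` suffix mentioned above. The `responsesUrl` helper and the `https://platform.example` origin are hypothetical; only the proxy path and the `{baseURL}/responses` pattern come from the comment.

```typescript
// Hypothetical helper: builds the URL a client would request when its
// baseURL is the platform origin plus the managed proxy path.
function responsesUrl(platformOrigin: string, proxyPath: string): string {
  // Strip trailing slashes so the join never produces "//".
  const origin = platformOrigin.replace(/\/+$/, "");
  const path = proxyPath.replace(/\/+$/, "");
  return `${origin}${path}/responses`;
}

// Assumed origin, for illustration only.
const url = responsesUrl("https://platform.example/", "/v1/runtime-proxy/openai");
console.log(url); // https://platform.example/v1/runtime-proxy/openai/responses
```

Under that assumption, the companion route has to accept requests under `/v1/runtime-proxy/openai/responses`, not only the bare proxy path.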



… llm.default API (#26287)

The tests were added in #26251 against the old `setInferenceProvider` / `services.inference.provider` API. #26159, which merged afterward, renamed that API to `setLLMDefaultProvider` and moved the config path to `llm.default.provider`, leaving the tests unable to compile. Rename the calls and update the patch assertions to match the new shape.

noanflaherty and others added 8 commits April 17, 2026 17:04

9c3f06f [managed] Enable OpenAI managed-proxy fallback routing in provider bootstrap (#26211)
6cd9b54 [openai] Route inference-provider-native web search to Responses native web_search tool (#26212)
1a42a48 fix: emit server_tool_start/complete events for OpenAI native web search (#26240)
3b08296 fix: render web_search_call output items in LLM context diagnostics (#26243)
cdc330a fix: add server_tool_use content blocks and tests for OpenAI native web search (#26246)
c95d7a0 fix: diagnostics polish — web_search_call tests, tool count fix, loop consistency (#26250)

noanflaherty self-assigned this Apr 17, 2026


noanflaherty merged commit f4a3ca9 into main Apr 17, 2026
12 checks passed

noanflaherty deleted the noanflaherty/managed-openai-native-web-search branch April 17, 2026 23:13

devin-ai-integration Bot reviewed Apr 17, 2026

noanflaherty merged 9 commits into main from noanflaherty/managed-openai-native-web-search
