Studio: add remote MCP server support#5750
Conversation
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef9d341a1b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| display = server.get("display_name") or server["id"] | ||
| specs: list[dict] = [] | ||
| for tool in mcp_tools: | ||
| name = f"{MCP_TOOL_PREFIX}{server['id']}__{tool.get('name') or ''}" |
There was a problem hiding this comment.
Validate MCP tool names before forwarding to OpenAI
The composed function name uses the remote MCP tool name verbatim, but no character validation is applied. If any enabled server exposes a name containing characters outside the OpenAI function-name charset (e.g. dots, spaces, colons), the whole chat completion request can fail with a 400 before streaming starts instead of skipping/normalizing that tool; this turns one incompatible remote tool into a hard failure for all tool-enabled chats.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request introduces support for the Model Context Protocol (MCP), enabling the integration of external tools into the chat interface. Key changes include a new backend client for MCP server interaction, SQLite storage for server configurations, and FastAPI routes for managing these servers. The frontend has been updated with a management dialog and a toggle in the chat settings to enable MCP tools. Feedback focuses on optimizing performance by moving the filtering of enabled servers from Python logic into a dedicated database query method.
| conn.close() | ||
|
|
||
|
|
||
| def list_servers() -> list[dict]: |
There was a problem hiding this comment.
Consider adding a list_enabled_servers() method to the database module. This would allow fetching only the active servers directly from the database, avoiding the overhead of fetching all servers and filtering them in Python during every chat request.
| def list_servers() -> list[dict]: | |
| def list_enabled_servers() -> list[dict]: | |
| conn = get_connection() | |
| try: | |
| rows = conn.execute( | |
| "SELECT * FROM mcp_servers WHERE is_enabled = 1 ORDER BY created_at" | |
| ).fetchall() | |
| return [dict(row) for row in rows] | |
| finally: | |
| conn.close() |
References
- To improve efficiency, avoid redundant data iterations. Combine checks and transformations into a single loop or query and return computed values for callers to reuse.
| or {"type": "object", "properties": {}}, | ||
| }, | ||
| } | ||
| ) |
There was a problem hiding this comment.
Instead of fetching all servers and filtering in Python, use a dedicated database method to fetch only enabled servers. This improves efficiency, especially as the number of registered servers grows.
| ) | |
| servers = mcp_servers_db.list_enabled_servers() |
References
- To improve efficiency, avoid redundant data iterations. Combine checks and transformations into a single loop and return computed values for callers to reuse.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 49e10421cc
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| return call_tool_sync( | ||
| url = server["url"], | ||
| headers = parse_server_headers(server), | ||
| name = tool_name, | ||
| args = arguments, | ||
| timeout = effective_timeout, | ||
| use_oauth = bool(server.get("use_oauth")), |
There was a problem hiding this comment.
Propagate cancellation into MCP tool execution
This new MCP branch ignores cancel_event, so an in-flight remote tool call cannot be interrupted when the user cancels/disconnects; the worker thread stays blocked until timeout (default up to 300s). In the tool-streaming paths, cancellation is polled between next() calls, so this blocking call delays teardown and can tie up worker capacity under slow/hung MCP servers. Please thread cancellation through the MCP call path (or use shorter cancellable waits) to match existing tool behavior.
Useful? React with 👍 / 👎.
| checked={mcpEnabledForChat} | ||
| onCheckedChange={setMcpEnabledForChat} | ||
| disabled={enabledServerCount === 0} |
There was a problem hiding this comment.
Allow disabling MCP toggle even when no servers are enabled
Disabling the switch whenever enabledServerCount === 0 makes it impossible to turn MCP off if it was previously enabled and the user later disables/deletes all servers. In that state, mcpEnabledForChat can remain stuck true, and the request builder still emits enable_tools: true + mcp_enabled: true, pushing chats through the tool path with no available MCP tools until the user re-enables a server just to turn the toggle off.
Useful? React with 👍 / 👎.
…nslothai#5750 OpenAI requires function.name to match ^[a-zA-Z0-9_-]{1,64}$ before streaming starts. The existing 64-char length check is necessary but not sufficient: MCP servers can return tool names containing '.', '/', spaces, etc. that would 400 the whole chat request. Validate the composed mcp__<server_id>__<tool> name against the regex, skip + warn on miss, and drop duplicate tool names from the same server (which would also 400 the request as "duplicates"). Also propagate the agentic-loop cancel_event into MCP tool execution so a /cancel POST during a long-running MCP call (e.g. GitHub MCP search across a large repo) actually interrupts the in-flight HTTP call instead of waiting out the 300 s timeout. The watcher polls the threading.Event at 50 ms cadence inside the asyncio loop (matches routes/inference.py's existing cancel-watcher cadence) and races against the call task with asyncio.wait FIRST_COMPLETED. Tests added: - test_mcp_specs_skip_invalid_openai_function_names: drops bad chars - test_mcp_specs_skip_empty_tool_name - test_mcp_specs_drops_duplicate_names - test_call_tool_sync_respects_pre_set_cancel_event Also fix test_desktop_auth.py's router stub that listed every existing router but missed mcp_servers_router, so importing main.py fails after this PR adds it to routes/__init__.py.
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
…nabled standalone Round 2 of cross-platform validation surfaced two more P1 findings: 1. OAuth tokens never get cleared. fastmcp keys tokens by MCP URL, not by server row, and delete / URL change / use_oauth toggle only updated the SQLite row. Re-registering the same URL would silently reuse the old account's credentials. Adds clear_oauth_tokens_async() in mcp_client.py and calls it from the delete + put route handlers when the row had use_oauth=True and either the URL changes or OAuth is turned off. 2. mcp_enabled=true was ignored unless the caller also sent enable_tools=true. The frontend always sends both together so the UI path was fine, but a direct API caller sending only mcp_enabled would silently get no MCP tools, which contradicts the field's documented "append tools from every enabled MCP server" behavior. Loosens the use_tools gate in both the GGUF and safetensors paths so mcp_enabled opens the tool loop on its own; when the caller did not also opt into built-ins, the built-in list starts empty. Tests added: - test_clear_oauth_tokens_async_no_op_safe - test_delete_server_calls_oauth_cleanup_when_oauth_was_on - test_delete_server_skips_oauth_cleanup_when_oauth_off - test_update_server_clears_oauth_on_url_change - test_update_server_clears_oauth_when_oauth_disabled 26 backend MCP tests pass; full studio/backend suite 1710 passed locally. Cross-platform CI (Linux, macOS, Windows) green on staging fork.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
Round 3 of cross-platform validation:
1. PUT /api/mcp/servers/<id> would 500 with TypeError when the body
explicitly set is_enabled or use_oauth to null. Pydantic accepts
None for an Optional[bool] and _changes_from_payload then passed
None into mcp_servers_db.update_server, which int(None)d. Reject
explicit null at the validation layer with 400 instead.
2. POST /api/mcp/servers/test caught HTTPException under
"except Exception", so an invalid URL came back as HTTP 200 with
{"ok": false, "error": "400: ..."} instead of a real 400. The
create + update paths return 400 for the same input. Move
validation outside the transport try/except so it surfaces 400.
Tests added:
- test_changes_from_payload_rejects_null_is_enabled
- test_changes_from_payload_rejects_null_use_oauth
- test_test_endpoint_surfaces_url_validation_as_400
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Reviewed end to end. Before this PR Studio's chat tool surface was fixed ( Validation done
What I changed on this branch I pushed three commits onto your branch covering the P1 findings I and the parallel reviewers landed on:
11 new tests cover the above. Pre-commit-ci already re-ran on the merge. Findings I considered but did not change
|
…t gate
Round 4 surfaces two more interaction bugs between the new MCP path
and existing safetensors tool plumbing:
1. OpenAI accepts ^[a-zA-Z0-9_-]{1,64}$ for function.name, and round 1
widened the MCP regex to that set, so MCP tools can now be advertised
as `mcp__srv__list-issues`. But the XML tool-call parser in
tool_call_parser.py used `\w+` (no hyphen), so the model could call
the tool but Studio could not parse the call. Same in
routes/inference.py's `_TOOL_XML_RE` stripper, which would leave
hyphenated tool-call XML in the visible content. Both regexes now
use `[\w-]+`.
2. safetensors_agentic treats `tools=[]` as "allow all" (documented
contract, exercised by test_empty_tools_list_does_not_enforce_allowlist).
When a caller sends `enable_tools=true` + `enabled_tools=[]` +
`mcp_enabled=true` and MCP discovery returns 0, the resolved tool
list is genuinely empty and built-in tools (web_search / python /
terminal) could execute via the model's emitted call. Fix at the
route gate instead of breaking the documented contract: set
`use_tools=False` when the resolved list is empty, in both GGUF and
safetensors paths. Existing callers who omit `enabled_tools` still
get ALL_TOOLS and are unaffected.
Tests added (32 total):
- test_tool_xml_parser_handles_hyphenated_function_names
- test_tool_xml_strip_handles_hyphenated_function_names
- test_safetensors_agentic_empty_allowlist_still_means_allow_all
(documents the contract round 4 preserved)
1716 passed locally; cross-platform CI on staging fork still green.
|
Round 4 pushed. Two more interaction bugs between the new MCP path and existing safetensors tool plumbing:
Tests added (32 total):
I covered findings 1, 2, 3, 4, 8, 9, 11 from the parallel reviewer aggregation. Remaining open findings I considered and left in place: loopback / private URL acceptance (registering a local MCP server is a primary use case), |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 63e4444a67
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| (_tools_on or payload.mcp_enabled) | ||
| and llama_backend.supports_tools | ||
| and not has_gguf_image |
There was a problem hiding this comment.
Honor disable-tools override in MCP gate
The new tool-loop gate uses (_tools_on or payload.mcp_enabled), which lets mcp_enabled=true bypass the process-level --disable-tools policy (_tools_on == False). In that configuration, requests still enter the server-side tool loop and can execute MCP tools, contradicting the documented tool_policy=False behavior (“forced tools off for every request”). This regression appears in both GGUF and safetensors gating logic, so deployments relying on CLI tool disablement are no longer protected from remote tool execution.
Useful? React with 👍 / 👎.
…params + cancel race
Round 5 of parallel-reviewer aggregation surfaced six additional
findings; five are real and fixed here:
1. Hyphenated MCP parameter names (`<parameter=issue-number>`) were
dropped by the XML parser's `\w+` regex. Extended to `[\w-]+` in
both core/inference/tool_call_parser.py and core/tool_healing.py.
The latter is GGUF's own copy of the parser/strip patterns and was
missed by round 4.
2. core/tool_healing.py's `strip_tool_call_markup` still used
`<function=\w+>` so hyphenated MCP tool-call XML leaked into the
GGUF visible content even after round 4 fixed the shared parser.
3+4. `mcp_enabled` re-opened the tool loop even when the operator
passed `unsloth run --disable-tools` (CLI policy False). Round 2's
`(_tools_on or payload.mcp_enabled)` gate ignored the raw process
policy. Now reads `state.tool_policy.get_tool_policy()` and gates
mcp_enabled on `_cli_policy is not False`. Applied to both GGUF
and safetensors paths.
5. GGUF's agentic loop called `execute_tool(tool_name, ...)` without
checking the model-emitted name against the per-request tool list,
while the safetensors loop already enforces this. Added the same
allow-list check so a model that hallucinates a filtered MCP name
or a built-in the caller opted out of returns "not enabled" instead
of executing.
Bonus P2 fixes:
- `call_tool_sync` now checks `cancel_event.is_set()` BEFORE
creating the call task, so a pre-set cancellation does not open
the HTTP transport.
- `clear_oauth_tokens_async` moved the OAuth import + construction
inside the protected try block; a fastmcp.client.auth load error
used to escape and 500 the delete / update route.
NOT fixed (verified false or out of scope):
- finding unslothai#10 "structured_content vs structuredContent": fastmcp's
CallToolResult dataclass uses snake_case (verified live against
structured-only tool result; fields are
`dict_keys(['content', 'structured_content', 'meta', 'data', 'is_error'])`).
- finding unslothai#11 "asyncio.run from running loop": call_tool_sync is
invoked from `asyncio.to_thread` worker threads which have no
event loop; asyncio.run() is safe there.
Tests added (37 total): hyphenated param names, tool_healing strip,
GGUF allow-list gate, cancel pre-set short-circuit, OAuth cleanup
constructor-error swallowing. 1721 passed locally, no regressions.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d537a6dee7
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| toolsEnabled: loadBool(CHAT_TOOLS_ENABLED_KEY, false), | ||
| codeToolsEnabled: loadBool(CHAT_CODE_TOOLS_ENABLED_KEY, false), | ||
| imageToolsEnabled: loadBool(CHAT_IMAGE_TOOLS_ENABLED_KEY, false), | ||
| mcpEnabledForChat: loadBool(CHAT_MCP_ENABLED_KEY, false), |
There was a problem hiding this comment.
Reset MCP toggle when clearing the active checkpoint
This new persisted flag is loaded into runtime state, but clearCheckpoint() still only resets toolsEnabled, codeToolsEnabled, and imageToolsEnabled. As a result, after a user unloads/clears a model, mcpEnabledForChat can stay true and the next local chat request will continue sending mcp_enabled: true, causing unintended MCP discovery/tool behavior even though other tool toggles were cleared. Please clear mcpEnabledForChat alongside the other tool toggles in the checkpoint-reset path.
Useful? React with 👍 / 👎.
|
Round 5 pushed. Five more interaction bugs caught by another reviewer aggregation pass, plus one piece of end-to-end browser evidence.
Tests added (5 new, 37 total in
Cross-platform CI on staging fork (Linux + macOS + Windows): green on round 5 in 1m43s. Run: https://github.com/danielhanchen/unsloth-staging-2/actions/runs/26399698777. The probe now exercises the round-5 fixes too: hyphenated parameter parsing, End-to-end browser walkthrough on an AWS B200 host: installed the PR with |
|
Earlier rounds drove the new UI against a local FastMCP server, which proves the wiring but does not prove the "remote" part of the title. Re-verified end to end against four real public MCP servers picked from the public no-auth lists (none of them owned or hosted by me):
For each, I exercised every step end to end:
Screenshots and a 7-frame walkthrough GIF of the live add-public-server flow against The earlier two probes I ran out of caution but did not change:
All four registered public servers also survive a refresh, which is what proved persistence in the first round of probing. PR head is still |
…onflicts in chat-adapter + chat-runtime-store
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Round 6 pushed ( Conflict resolution. Main moved past the PR base (24 commits since round 5). Merge gave clean conflicts only in two frontend files, both around adjacent fields the PR adds for MCP next to fields main added for the Anthropic web_fetch pill:
After the merge: 108 MCP-relevant tests pass locally on the merged head ( End-to-end MCP dispatch through a real model. Round 1-2 evidence stopped at "tools dispatch through Registered 8 servers via the Studio UI (4 new public no-auth ones added this round). Drove one prompt per server through
Two notable things from the table:
Also drove the same flow through the actual UI: the chat thread renders the "Used tool: PR is now |
|
Animated walkthroughs of the PR in action, in case the screenshots above are easier to read as motion. 1. Add a real public MCP server end to end through the new dialog. Live probe against 2. Loaded GGUF model dispatches the new MCP tool through chat. Qwen3-4B-Instruct-2507 GGUF Q4_K_M emits a 3. Single-image six-panel summary of the manual add-server UX, in case the GIF is too quick. Each step captioned with what to click and what to expect. Assets are pinned to a commit on a staging fork orphan branch so the links don't drift if the branch is later force-pushed. |
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
unslothai#5750 added remote MCP server support, which conflicted with our import block in chat-settings-sheet.tsx. Kept both branches' imports (MCP dialog + servers API from main, ServiceTier + Input from this PR). 393/393 backend tests pass; frontend type-check + vite build clean.



Summary
/api/mcp/serversCRUD for remote MCP server configs (display name, URL, optional headers, OAuth flag);rows persist in
studio.dbmcp_enabledflag, fetch tools from every enabled server in parallel and expose themas OpenAI function tools (
mcp__<server_id>__<tool>); calls are routed back through fastmcpconnection, refresh tools, custom headers, OAuth switch)
<studio_root>/mcp-oauth-tokens/so the browser sign-in survives Studio restartsWhy
Studio's chat tool surface was fixed (
web_search,python,terminal) no way to bring in capabilities fromremote MCP servers (GitHub, Linear, Vercel, etc.) without forking the codebase.
Testing
pytest -q studio/backend/tests/test_mcp_servers.py
Manual registered a no-auth MCP server in chat settings, "Test connection" returned the tool count;
toggled "Use MCP Servers" for the chat, model invoked a server tool, response rendered with the
server · toollabel
Manual OAuth registered an OAuth-required MCP server (e.g. GitHub MCP) with
use_oauth=on, first call openedthe browser flow; killed and restarted Studio, second call did not re-prompt (token reloaded from
mcp-oauth-tokens/)API —
curl -H "Authorization: Bearer <token>" http://localhost:8888/api/mcp/serversreturns saved configs;POST /api/mcp/servers/{id}/refreshreturns{"ok": true, "tool_count": N}