Studio: add remote MCP server support by NilayYadav · Pull Request #5750 · unslothai/unsloth

NilayYadav · 2026-05-24T11:50:09Z

Summary

Add /api/mcp/servers CRUD for remote MCP server configs (display name, URL, optional headers, OAuth flag);
rows persist in studio.db
On chat send with the new mcp_enabled flag, fetch tools from every enabled server in parallel and expose them
as OpenAI function tools (mcp__<server_id>__<tool>); calls are routed back through fastmcp
New "MCP Servers" section in chat settings: per-chat enable toggle + manage dialog (add/edit/delete, test
connection, refresh tools, custom headers, OAuth switch)
OAuth tokens persisted per server under <studio_root>/mcp-oauth-tokens/ so the browser sign-in survives Studio restarts

Why

Studio's chat tool surface was fixed (web_search, python, terminal) no way to bring in capabilities from
remote MCP servers (GitHub, Linear, Vercel, etc.) without forking the codebase.

Testing

pytest -q studio/backend/tests/test_mcp_servers.py
Manual registered a no-auth MCP server in chat settings, "Test connection" returned the tool count;
toggled "Use MCP Servers" for the chat, model invoked a server tool, response rendered with the server · tool
label
Manual OAuth registered an OAuth-required MCP server (e.g. GitHub MCP) with use_oauth=on, first call opened
the browser flow; killed and restarted Studio, second call did not re-prompt (token reloaded from
mcp-oauth-tokens/)
API — curl -H "Authorization: Bearer <token>" http://localhost:8888/api/mcp/servers returns saved configs; POST /api/mcp/servers/{id}/refresh returns {"ok": true, "tool_count": N}

for more information, see https://pre-commit.ci

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef9d341a1b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-24T11:56:08Z

+    display = server.get("display_name") or server["id"]
+    specs: list[dict] = []
+    for tool in mcp_tools:
+        name = f"{MCP_TOOL_PREFIX}{server['id']}__{tool.get('name') or ''}"


Validate MCP tool names before forwarding to OpenAI

The composed function name uses the remote MCP tool name verbatim, but no character validation is applied. If any enabled server exposes a name containing characters outside the OpenAI function-name charset (e.g. dots, spaces, colons), the whole chat completion request can fail with a 400 before streaming starts instead of skipping/normalizing that tool; this turns one incompatible remote tool into a hard failure for all tool-enabled chats.

Useful? React with 👍 / 👎.

gemini-code-assist

Code Review

This pull request introduces support for the Model Context Protocol (MCP), enabling the integration of external tools into the chat interface. Key changes include a new backend client for MCP server interaction, SQLite storage for server configurations, and FastAPI routes for managing these servers. The frontend has been updated with a management dialog and a toggle in the chat settings to enable MCP tools. Feedback focuses on optimizing performance by moving the filtering of enabled servers from Python logic into a dedicated database query method.

gemini-code-assist · 2026-05-24T11:56:18Z

+        conn.close()
+
+
+def list_servers() -> list[dict]:


Consider adding a list_enabled_servers() method to the database module. This would allow fetching only the active servers directly from the database, avoiding the overhead of fetching all servers and filtering them in Python during every chat request.

Suggested change

def list_servers() -> list[dict]:

def list_enabled_servers() -> list[dict]:

conn = get_connection()

try:

rows = conn.execute(

"SELECT * FROM mcp_servers WHERE is_enabled = 1 ORDER BY created_at"

).fetchall()

return [dict(row) for row in rows]

finally:

conn.close()

References

To improve efficiency, avoid redundant data iterations. Combine checks and transformations into a single loop or query and return computed values for callers to reuse.

gemini-code-assist · 2026-05-24T11:56:19Z

+                    or {"type": "object", "properties": {}},
+                },
+            }
+        )


Instead of fetching all servers and filtering in Python, use a dedicated database method to fetch only enabled servers. This improves efficiency, especially as the number of registered servers grows.

Suggested change

)

servers = mcp_servers_db.list_enabled_servers()

References

To improve efficiency, avoid redundant data iterations. Combine checks and transformations into a single loop and return computed values for callers to reuse.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 49e10421cc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-24T12:11:25Z

+        return call_tool_sync(
+            url = server["url"],
+            headers = parse_server_headers(server),
+            name = tool_name,
+            args = arguments,
+            timeout = effective_timeout,
+            use_oauth = bool(server.get("use_oauth")),


Propagate cancellation into MCP tool execution

This new MCP branch ignores cancel_event, so an in-flight remote tool call cannot be interrupted when the user cancels/disconnects; the worker thread stays blocked until timeout (default up to 300s). In the tool-streaming paths, cancellation is polled between next() calls, so this blocking call delays teardown and can tie up worker capacity under slow/hung MCP servers. Please thread cancellation through the MCP call path (or use shorter cancellable waits) to match existing tool behavior.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-24T12:11:25Z

+          checked={mcpEnabledForChat}
+          onCheckedChange={setMcpEnabledForChat}
+          disabled={enabledServerCount === 0}


Allow disabling MCP toggle even when no servers are enabled

Disabling the switch whenever enabledServerCount === 0 makes it impossible to turn MCP off if it was previously enabled and the user later disables/deletes all servers. In that state, mcpEnabledForChat can remain stuck true, and the request builder still emits enable_tools: true + mcp_enabled: true, pushing chats through the tool path with no available MCP tools until the user re-enables a server just to turn the toggle off.

Useful? React with 👍 / 👎.

…test fixes

…nslothai#5750 OpenAI requires function.name to match ^[a-zA-Z0-9_-]{1,64}$ before streaming starts. The existing 64-char length check is necessary but not sufficient: MCP servers can return tool names containing '.', '/', spaces, etc. that would 400 the whole chat request. Validate the composed mcp__<server_id>__<tool> name against the regex, skip + warn on miss, and drop duplicate tool names from the same server (which would also 400 the request as "duplicates"). Also propagate the agentic-loop cancel_event into MCP tool execution so a /cancel POST during a long-running MCP call (e.g. GitHub MCP search across a large repo) actually interrupts the in-flight HTTP call instead of waiting out the 300 s timeout. The watcher polls the threading.Event at 50 ms cadence inside the asyncio loop (matches routes/inference.py's existing cancel-watcher cadence) and races against the call task with asyncio.wait FIRST_COMPLETED. Tests added: - test_mcp_specs_skip_invalid_openai_function_names: drops bad chars - test_mcp_specs_skip_empty_tool_name - test_mcp_specs_drops_duplicate_names - test_call_tool_sync_respects_pre_set_cancel_event Also fix test_desktop_auth.py's router stub that listed every existing router but missed mcp_servers_router, so importing main.py fails after this PR adds it to routes/__init__.py.

for more information, see https://pre-commit.ci

chatgpt-codex-connector · 2026-05-24T14:25:35Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

…nabled standalone Round 2 of cross-platform validation surfaced two more P1 findings: 1. OAuth tokens never get cleared. fastmcp keys tokens by MCP URL, not by server row, and delete / URL change / use_oauth toggle only updated the SQLite row. Re-registering the same URL would silently reuse the old account's credentials. Adds clear_oauth_tokens_async() in mcp_client.py and calls it from the delete + put route handlers when the row had use_oauth=True and either the URL changes or OAuth is turned off. 2. mcp_enabled=true was ignored unless the caller also sent enable_tools=true. The frontend always sends both together so the UI path was fine, but a direct API caller sending only mcp_enabled would silently get no MCP tools, which contradicts the field's documented "append tools from every enabled MCP server" behavior. Loosens the use_tools gate in both the GGUF and safetensors paths so mcp_enabled opens the tool loop on its own; when the caller did not also opt into built-ins, the built-in list starts empty. Tests added: - test_clear_oauth_tokens_async_no_op_safe - test_delete_server_calls_oauth_cleanup_when_oauth_was_on - test_delete_server_skips_oauth_cleanup_when_oauth_off - test_update_server_clears_oauth_on_url_change - test_update_server_clears_oauth_when_oauth_disabled 26 backend MCP tests pass; full studio/backend suite 1710 passed locally. Cross-platform CI (Linux, macOS, Windows) green on staging fork.

chatgpt-codex-connector · 2026-05-24T15:03:00Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

for more information, see https://pre-commit.ci

chatgpt-codex-connector · 2026-05-24T15:03:33Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

Round 3 of cross-platform validation: 1. PUT /api/mcp/servers/<id> would 500 with TypeError when the body explicitly set is_enabled or use_oauth to null. Pydantic accepts None for an Optional[bool] and _changes_from_payload then passed None into mcp_servers_db.update_server, which int(None)d. Reject explicit null at the validation layer with 400 instead. 2. POST /api/mcp/servers/test caught HTTPException under "except Exception", so an invalid URL came back as HTTP 200 with {"ok": false, "error": "400: ..."} instead of a real 400. The create + update paths return 400 for the same input. Move validation outside the transport try/except so it surfaces 400. Tests added: - test_changes_from_payload_rejects_null_is_enabled - test_changes_from_payload_rejects_null_use_oauth - test_test_endpoint_surfaces_url_validation_as_400

chatgpt-codex-connector · 2026-05-24T15:08:10Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

danielhanchen · 2026-05-24T15:14:22Z

Reviewed end to end. Before this PR Studio's chat tool surface was fixed (web_search, python, terminal); after this PR users can register remote MCP servers under chat settings and the model can call their tools as mcp__<server_id>__<tool>. This is a real feature, the diff is complete, and the existing tools/training paths are not touched.

Validation done

Full studio/backend pytest suite locally: 1710 passed, 46 skipped. No new regressions vs main; the 14 pre-existing test_training_worker_flash_attn failures reproduce on origin/main and are unrelated.
Spun up a real fastmcp.FastMCP HTTP server, registered it through the live /api/mcp/servers routes, and round-tripped list, create, test, refresh, and a tool call via execute_tool. All return the expected payloads.
Drove the chat UI in headless Chromium against the backend with the PR changes installed. Captured 14 screenshots through the full add/test/save flow at https://huggingface.co/datasets/danielhanchen/minimax-2.7-analysis/tree/main/pr5750_mcp_screenshots.
Cross-platform GitHub Actions on ubuntu-latest + macos-14 + windows-latest (Python 3.11) running the MCP unit tests + an end-to-end probe (FastMCP server, regex skip, cancel propagation, OAuth cleanup, null body, /test 400): green on all 3 OSes across 3 rounds.

What I changed on this branch

I pushed three commits onto your branch covering the P1 findings I and the parallel reviewers landed on:

mcp_specs_for_server now validates the composed mcp__<id>__<name> against OpenAI's ^[a-zA-Z0-9_-]{1,64}$ regex and skips offenders with a warning. Same path also drops empty names and duplicates. Without this an MCP server returning repo.search or read/file would 400 the entire chat request before streaming.
call_tool_sync now takes an optional cancel_event and races it against the network call so the existing /cancel POST actually interrupts a long-running MCP tool instead of waiting out the 300s timeout. execute_tool forwards the agentic-loop event.
routes/mcp_servers.delete_mcp_server and update_mcp_server now call clear_oauth_tokens_async(old_url) when the old row had OAuth on and the URL is changing or OAuth is being disabled. fastmcp keys tokens by MCP URL so without this, re-registering the same URL silently reused the old account's credentials.
mcp_enabled=true now opens the tool loop on its own (both GGUF and safetensors paths). The frontend always sends enable_tools=true alongside, but direct API callers sending only mcp_enabled previously got nothing despite the field's documented "append tools" behaviour.
PUT /api/mcp/servers/{id} with is_enabled=null or use_oauth=null was hitting int(None) -> TypeError 500. Reject explicit null at the validation layer with 400.
POST /api/mcp/servers/test was catching HTTPException under except Exception and returning 200 with {ok: false} for an invalid URL. Moved the URL/header validation outside the transport try/except so it now 400s like create + update.
The PR was branched off before Studio: strip orphan tool_call XML leaking into visible content #5735 merged, so studio/backend/tests/test_tool_xml_strip.py was missing. Merged main into the branch so it ships again. Also added mcp_servers_router to test_desktop_auth.py's router stub, which otherwise raises ImportError when studio.backend.main is imported as a package.

11 new tests cover the above. Pre-commit-ci already re-ran on the merge.

Findings I considered but did not change

_validate_url accepts loopback / RFC1918 / link-local: by design, since registering a local MCP server (e.g. http://127.0.0.1:9810/mcp/) is a primary use case. Same trust boundary as the user's other Studio actions.
McpServerResponse returns stored headers in cleartext: required so the edit dialog can re-populate the form. Only authenticated Studio users see it.
_flatten_result reads structured_content not structuredContent: verified fastmcp's CallToolResult dataclass uses snake_case, so the existing code is correct. Tool results with structured-only output also have content populated by fastmcp.
/v1/messages Anthropic Messages path does not honour mcp_enabled: AnthropicMessagesRequest has no mcp_enabled field; surfacing MCP there is a separate enhancement and out of scope for this PR.

…t gate Round 4 surfaces two more interaction bugs between the new MCP path and existing safetensors tool plumbing: 1. OpenAI accepts ^[a-zA-Z0-9_-]{1,64}$ for function.name, and round 1 widened the MCP regex to that set, so MCP tools can now be advertised as `mcp__srv__list-issues`. But the XML tool-call parser in tool_call_parser.py used `\w+` (no hyphen), so the model could call the tool but Studio could not parse the call. Same in routes/inference.py's `_TOOL_XML_RE` stripper, which would leave hyphenated tool-call XML in the visible content. Both regexes now use `[\w-]+`. 2. safetensors_agentic treats `tools=[]` as "allow all" (documented contract, exercised by test_empty_tools_list_does_not_enforce_allowlist). When a caller sends `enable_tools=true` + `enabled_tools=[]` + `mcp_enabled=true` and MCP discovery returns 0, the resolved tool list is genuinely empty and built-in tools (web_search / python / terminal) could execute via the model's emitted call. Fix at the route gate instead of breaking the documented contract: set `use_tools=False` when the resolved list is empty, in both GGUF and safetensors paths. Existing callers who omit `enabled_tools` still get ALL_TOOLS and are unaffected. Tests added (32 total): - test_tool_xml_parser_handles_hyphenated_function_names - test_tool_xml_strip_handles_hyphenated_function_names - test_safetensors_agentic_empty_allowlist_still_means_allow_all (documents the contract round 4 preserved) 1716 passed locally; cross-platform CI on staging fork still green.

danielhanchen · 2026-05-25T07:55:24Z

Round 4 pushed. Two more interaction bugs between the new MCP path and existing safetensors tool plumbing:

tool_call_parser.py and _TOOL_XML_RE in routes/inference.py used \w+ for the function-name capture. Round 1 widened the MCP regex to OpenAI's ^[a-zA-Z0-9_-]{1,64}$ so MCP tools can be advertised as mcp__srv__list-issues, but the model's emitted <function=mcp__srv__list-issues> would then fail to parse on the safetensors XML path, and the XML stripper would leave the tool call in chat history. Both regexes updated to [\w-]+.
safetensors_agentic's contract is tools=[] means "no constraint" (exercised by test_empty_tools_list_does_not_enforce_allowlist). If a caller sends enable_tools=true with enabled_tools=[] and mcp_enabled=true and MCP discovery returns 0 tools, the resolved list is genuinely empty and the model can call web_search/python/terminal even though the caller opted out of them. Fixed at the route gate (both GGUF and safetensors paths): set use_tools=False when the resolved list is empty, instead of changing the agentic-loop contract. Existing callers who omit enabled_tools still get ALL_TOOLS and are unaffected.

Tests added (32 total): test_tool_xml_parser_handles_hyphenated_function_names, test_tool_xml_strip_handles_hyphenated_function_names, test_safetensors_agentic_empty_allowlist_still_means_allow_all (documents the preserved contract).

studio/backend suite locally: 1716 passed. Cross-platform CI on staging fork (Linux + macOS + Windows) green on round 4. Updated probe at https://github.com/danielhanchen/unsloth-staging-2/actions/runs/26389767937.

I covered findings 1, 2, 3, 4, 8, 9, 11 from the parallel reviewer aggregation. Remaining open findings I considered and left in place: loopback / private URL acceptance (registering a local MCP server is a primary use case), McpServerResponse returning stored headers (needed for the edit dialog to repopulate), structured_content vs structuredContent (verified fastmcp's CallToolResult dataclass uses snake_case so the existing code is correct), /v1/messages Anthropic Messages path ignoring mcp_enabled (AnthropicMessagesRequest has no such field; out of scope), and OAuth token storage keyed by URL not server row (fastmcp's storage contract; the round-2 cleanup handler covers the practical delete / URL-change confusion).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 63e4444a67

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-25T07:56:52Z

+            (_tools_on or payload.mcp_enabled)
            and llama_backend.supports_tools
            and not has_gguf_image


Honor disable-tools override in MCP gate

The new tool-loop gate uses (_tools_on or payload.mcp_enabled), which lets mcp_enabled=true bypass the process-level --disable-tools policy (_tools_on == False). In that configuration, requests still enter the server-side tool loop and can execute MCP tools, contradicting the documented tool_policy=False behavior (“forced tools off for every request”). This regression appears in both GGUF and safetensors gating logic, so deployments relying on CLI tool disablement are no longer protected from remote tool execution.

Useful? React with 👍 / 👎.

…params + cancel race Round 5 of parallel-reviewer aggregation surfaced six additional findings; five are real and fixed here: 1. Hyphenated MCP parameter names (`<parameter=issue-number>`) were dropped by the XML parser's `\w+` regex. Extended to `[\w-]+` in both core/inference/tool_call_parser.py and core/tool_healing.py. The latter is GGUF's own copy of the parser/strip patterns and was missed by round 4. 2. core/tool_healing.py's `strip_tool_call_markup` still used `<function=\w+>` so hyphenated MCP tool-call XML leaked into the GGUF visible content even after round 4 fixed the shared parser. 3+4. `mcp_enabled` re-opened the tool loop even when the operator passed `unsloth run --disable-tools` (CLI policy False). Round 2's `(_tools_on or payload.mcp_enabled)` gate ignored the raw process policy. Now reads `state.tool_policy.get_tool_policy()` and gates mcp_enabled on `_cli_policy is not False`. Applied to both GGUF and safetensors paths. 5. GGUF's agentic loop called `execute_tool(tool_name, ...)` without checking the model-emitted name against the per-request tool list, while the safetensors loop already enforces this. Added the same allow-list check so a model that hallucinates a filtered MCP name or a built-in the caller opted out of returns "not enabled" instead of executing. Bonus P2 fixes: - `call_tool_sync` now checks `cancel_event.is_set()` BEFORE creating the call task, so a pre-set cancellation does not open the HTTP transport. - `clear_oauth_tokens_async` moved the OAuth import + construction inside the protected try block; a fastmcp.client.auth load error used to escape and 500 the delete / update route. NOT fixed (verified false or out of scope): - finding unslothai#10 "structured_content vs structuredContent": fastmcp's CallToolResult dataclass uses snake_case (verified live against structured-only tool result; fields are `dict_keys(['content', 'structured_content', 'meta', 'data', 'is_error'])`). - finding unslothai#11 "asyncio.run from running loop": call_tool_sync is invoked from `asyncio.to_thread` worker threads which have no event loop; asyncio.run() is safe there. Tests added (37 total): hyphenated param names, tool_healing strip, GGUF allow-list gate, cancel pre-set short-circuit, OAuth cleanup constructor-error swallowing. 1721 passed locally, no regressions.

for more information, see https://pre-commit.ci

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d537a6dee7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-25T11:52:31Z

  toolsEnabled: loadBool(CHAT_TOOLS_ENABLED_KEY, false),
  codeToolsEnabled: loadBool(CHAT_CODE_TOOLS_ENABLED_KEY, false),
  imageToolsEnabled: loadBool(CHAT_IMAGE_TOOLS_ENABLED_KEY, false),
+  mcpEnabledForChat: loadBool(CHAT_MCP_ENABLED_KEY, false),


Reset MCP toggle when clearing the active checkpoint

This new persisted flag is loaded into runtime state, but clearCheckpoint() still only resets toolsEnabled, codeToolsEnabled, and imageToolsEnabled. As a result, after a user unloads/clears a model, mcpEnabledForChat can stay true and the next local chat request will continue sending mcp_enabled: true, causing unintended MCP discovery/tool behavior even though other tool toggles were cleared. Please clear mcpEnabledForChat alongside the other tool toggles in the checkpoint-reset path.

Useful? React with 👍 / 👎.

danielhanchen · 2026-05-25T12:21:38Z

Round 5 pushed. Five more interaction bugs caught by another reviewer aggregation pass, plus one piece of end-to-end browser evidence.

Hyphenated MCP parameter names dropped silently. _TC_PARAM_START_RE in tool_call_parser.py still used \w+ for the parameter-name capture. MCP schemas with kebab-case keys (GitHub MCP list-issues takes repo-name, issue-number, etc.) emitted <parameter=repo-name> blocks that the parser saw as empty, so the model's tool call landed with arguments={} and the call failed at the server. Regex widened to [\w-]+ to match the function-name fix.
GGUF carries its own parser copy. core/tool_healing.py is a near-duplicate of tool_call_parser.py reused by the llama.cpp / llama-server agentic loop. Round 4 only fixed the safetensors path; the GGUF strip/parse pair still rejected hyphenated function and parameter names. Updated tool_healing.py to match: _TC_FUNC_START_RE, _TC_PARAM_START_RE, and the closing-pair stripping regex all use [\w-]+. Now both unsloth studio --frontend transformers and --frontend llama-cpp produce the same MCP behaviour.
unsloth studio --disable-tools was bypassed by mcp_enabled. _effective_enable_tools(payload) checks the CLI tool-policy override, but the MCP branch went straight to payload.mcp_enabled. An operator running unsloth studio run ... --disable-tools could still have arbitrary MCP calls fire if the chat client set mcp_enabled: true. Both route paths (routes/inference.py GGUF around line 2380 and safetensors around line 2890) now compute _mcp_allowed = bool(payload.mcp_enabled) and _cli_policy is not False before unioning into use_tools.
GGUF agentic loop ignored the per-request allow-list. When enabled_tools is set and the model emits a tool call for a tool that was filtered out (typically a stale name from system-prompt history or an MCP server that was disabled after the message was queued), the safetensors loop already short-circuits with an "Error: tool not enabled" string. The GGUF loop in llama_cpp.py (~line 5077) dispatched it directly to execute_tool, allowing built-ins like python or terminal to run even when the caller had opted out. Added the same allow-list check before invoking the tool.
call_tool_sync cancel race + OAuth import safety. Two small mcp_client.py cleanups: (a) check cancel_event.is_set() before creating the call task so a /cancel POST that landed while we were still in the asyncio.to_thread queue does not open a fresh HTTP connection; (b) move the optional fastmcp.client.auth.OAuth import inside the same try/except as the call so a missing optional dependency surfaces as a clean "Error:" string in chat instead of a 500.

Tests added (5 new, 37 total in test_mcp_servers.py): test_tool_xml_param_parser_handles_hyphens, test_tool_healing_handles_hyphenated_xml, test_mcp_enabled_respects_cli_disable_policy, test_gguf_agentic_blocks_disabled_tool, test_call_tool_sync_short_circuits_on_pre_set_cancel, plus the test_clear_oauth_tokens_swallows_constructor_errors regression.

studio/backend MCP suites locally: 108 passed (test_mcp_servers.py 72, test_tool_xml_strip.py 22, test_safetensors_tool_loop.py 14). Full backend suite: 1746 passed outside the three test files that depend on host terminal width and Windows-specific GPU resolution (test_studio_api.py::test_help_output, test_training_worker_flash_attn.py, test_windows_gpu_detection_mock.py) which fail the same way on main.

Cross-platform CI on staging fork (Linux + macOS + Windows): green on round 5 in 1m43s. Run: https://github.com/danielhanchen/unsloth-staging-2/actions/runs/26399698777. The probe now exercises the round-5 fixes too: hyphenated parameter parsing, tool_healing strip+parse parity, tool_policy CLI override, and the cancel pre-set short-circuit against a live FastMCP server.

End-to-end browser walkthrough on an AWS B200 host: installed the PR with UNSLOTH_STUDIO_HOME and a baseline at main in parallel, drove both through the chat settings flow with Playwright, and captured before/after screenshots and a 7-frame walkthrough GIF of the add-MCP-server journey (open settings, expand MCP section, open dialog, fill display name + URL pointing at a local FastMCP server, "Test connection" returns "Connected (3 tools)", save, server persists after refresh). Side-by-side comparisons and the GIF are in pr5750_before_after_comparison/ of the dataset alongside the raw before/after PNGs.

danielhanchen · 2026-05-25T13:26:06Z

Earlier rounds drove the new UI against a local FastMCP server, which proves the wiring but does not prove the "remote" part of the title. Re-verified end to end against four real public MCP servers picked from the public no-auth lists (none of them owned or hosted by me):

Server	URL	Tools	Notes
DeepWiki	`https://mcp.deepwiki.com/mcp`	3	`read_wiki_structure`, `read_wiki_contents`, `ask_question`
Context7	`https://mcp.context7.com/mcp`	2	`resolve-library-id`, `query-docs` (hyphenated, exercises round 4 + 5 regex fixes)
Roundtable	`https://mcp.roundtable.now/mcp`	13	All hyphenated (`list-models`, `consult-council`, `design-architecture`, etc.)
GitMCP (unslothai/unsloth)	`https://gitmcp.io/unslothai/unsloth`	4	`fetch_unsloth_documentation`, `search_unsloth_code`, etc.

For each, I exercised every step end to end:

POST /api/mcp/servers/test -- all four returned {"ok": true, "tool_count": N} with the expected counts (3 / 2 / 13 / 4).
POST /api/mcp/servers/ -- persisted all four to studio.db with stable IDs.
Manage MCP Servers dialog -- UI rendered all four entries with enable toggles, refresh, edit, delete buttons; the chat-settings panel showed "4 servers enabled -- Manage..." underneath the "Use MCP Servers" master toggle.
Added a fifth (https://gitmcp.io/unslothai/unsloth_zoo) through the dialog; "Test connection" returned the "Connected (4 tools)" toast against the live server within ~2 s; "Add server" persisted it; "MCP server added" toast appeared; dialog refreshed to five rows.
execute_tool round-trip through Studio's dispatcher for one tool on each server, including the hyphenated mcp__context7__resolve-library-id and mcp__roundtable__list-models -- all returned actual content from the upstream MCP servers (DeepWiki gave a coherent answer about Unsloth, GitMCP returned the README, Context7 resolved the fastapi library ID, Roundtable listed its models). This is the part round 4 + 5 fixed for hyphenated tool names.

Screenshots and a 7-frame walkthrough GIF of the live add-public-server flow against gitmcp.io are in pr5750_public_mcp_servers/ of the same dataset. Public-server URLs are nothing exotic; they were picked from publicly maintained "no-auth remote MCP" lists (sylviangth/awesome-remote-mcp-servers, mcpservers.org) so anyone wanting to reproduce the verification can hit the same endpoints.

The earlier two probes I ran out of caution but did not change:

Semgrep MCP (https://mcp.semgrep.ai/mcp) -- returns 401 without an auth token, exercising the round 3 /test 400 surface for an unreachable target. Did not register.
402.bot (https://api.402.bot/mcp) -- TLS handshake fails from this host with TLSV1_ALERT_INTERNAL_ERROR; their endpoint is currently down for an unrelated reason. Did not register.

All four registered public servers also survive a refresh, which is what proved persistence in the first round of probing. PR head is still d537a6dee7; the verification was against the same backend that staging CI confirmed green for Linux + macOS + Windows.

…onflicts in chat-adapter + chat-runtime-store

chatgpt-codex-connector · 2026-05-27T07:24:38Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

danielhanchen · 2026-05-27T07:38:10Z

Round 6 pushed (3726238). Two things this round.

Conflict resolution. Main moved past the PR base (24 commits since round 5). Merge gave clean conflicts only in two frontend files, both around adjacent fields the PR adds for MCP next to fields main added for the Anthropic web_fetch pill:

studio/frontend/src/features/chat/api/chat-adapter.ts -- destructured mcpEnabledForChat from this PR collides with webFetchToolsEnabled from main; kept both.
studio/frontend/src/features/chat/stores/chat-runtime-store.ts -- five collision blocks for the localStorage key constant, the type field, the action signature, the initial state, and the action implementation; kept the PR side and the main side side-by-side in each block (the two features are independent).

After the merge: 108 MCP-relevant tests pass locally on the merged head (test_mcp_servers.py 72, test_tool_xml_strip.py 22, test_safetensors_tool_loop.py 14). Cross-platform CI on staging fork re-ran green on ubuntu-latest + macos-14 + windows-latest in 2m15s on the merged tree.

End-to-end MCP dispatch through a real model. Round 1-2 evidence stopped at "tools dispatch through execute_tool". This round drove the whole loop through a real GGUF model (Qwen3-4B-Instruct-2507-GGUF Q4_K_M, served by llama-server via the GGUF agentic loop in llama_cpp.py) and confirmed every step.

Registered 8 servers via the Studio UI (4 new public no-auth ones added this round). Drove one prompt per server through POST /api/inference/chat/completions with mcp_enabled: true, tool_choice: "required". For each, the SSE stream produced a tool_end event with the upstream content, and the model's follow-up turn quoted the real response:

Prompt	Tool fired	Result bytes	Elapsed
Fetch Unsloth docs (GitMCP)	`mcp__<id>__fetch_unsloth_documentation`	18,495	5.4 s
Ask DeepWiki about Studio's GGUF backend	`mcp__<id>__ask_question`	2,085	19.8 s
Context7 resolve fastapi	`mcp__<id>__resolve-library-id`	1,616	4.8 s
MS Learn Azure AI Foundry search	`mcp__<id>__microsoft_docs_search`	19,678	5.4 s
Cloudflare docs `wrangler deploy` search	`mcp__<id>__search_cloudflare_documentation`	12,879	4.8 s
Hugging Face `qwen3` model search	`mcp__<id>__hub_repo_search`	11,554	3.6 s

Two notable things from the table:

mcp__<id>__resolve-library-id is the literal hyphenated-name regex case that round 4 + 5 fixed. The model emitted <tool_call>{"name": "mcp__<id>__resolve-library-id", ...}</tool_call>, the parser accepted it, the dispatcher routed it, the upstream returned, and the model quoted /fastapi/fastapi back. Without the regex widening these would either lose all parameters or skip the call entirely.
The dispatch logs include tool_status heartbeats during the upstream call -- exactly what the cancel-watcher path consumes, so the round 1 / round 5 cancel-propagation work has live exercise too.

Also drove the same flow through the actual UI: the chat thread renders the "Used tool: <server_id> . fetch_unsloth_documentation" card, then the model's follow-up answer is composed from the real upstream content (Unsloth README rendered as a clean one-sentence summary, including the "70% less VRAM and 2x faster training" claim that only appears in the upstream README, not in the model's training data). 220 tok/s on a B200, context bar shows 16.2k / 32.8k after the MCP fold-in.

PR is now MERGEABLE again at head 3726238.

danielhanchen · 2026-05-27T11:33:41Z

Animated walkthroughs of the PR in action, in case the screenshots above are easier to read as motion.

1. Add a real public MCP server end to end through the new dialog. Live probe against https://gitmcp.io/unslothai/unsloth_zoo; the "Connected (4 tools)" toast is the real upstream list_tools response, not a stub.

2. Loaded GGUF model dispatches the new MCP tool through chat. Qwen3-4B-Instruct-2507 GGUF Q4_K_M emits a <tool_call> for mcp__<server_id>__fetch_unsloth_documentation; Studio renders the "Used tool" pill and folds the upstream response into the model's follow-up turn. The one-sentence summary at the end is composed from the actual GitMCP response (the "70% less VRAM and 2x faster training" phrasing is from the live Unsloth README, not in the model's training corpus).

3. Single-image six-panel summary of the manual add-server UX, in case the GIF is too quick. Each step captioned with what to click and what to expect.

Assets are pinned to a commit on a staging fork orphan branch so the links don't drift if the branch is later force-pushed.

chatgpt-codex-connector · 2026-05-27T11:39:38Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

unslothai#5750 added remote MCP server support, which conflicted with our import block in chat-settings-sheet.tsx. Kept both branches' imports (MCP dialog + servers API from main, ServiceTier + Input from this PR). 393/393 backend tests pass; frontend type-check + vite build clean.

NilayYadav added 3 commits May 24, 2026 15:45

added remote MCP server support

d7d6f41

trim

9174da0

added tests

ef9d341

NilayYadav requested review from danielhanchen and rolandtannous as code owners May 24, 2026 11:50

[pre-commit.ci] auto fixes from pre-commit.com hooks

662dc8a

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed May 24, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 24, 2026

View reviewed changes

increased timeout

49e1042

chatgpt-codex-connector Bot reviewed May 24, 2026

View reviewed changes

NilayYadav and others added 4 commits May 24, 2026 17:55

disabling MCP chat toggle

3029469

Merge main into pr-5750: pull unslothai#5735 test_tool_xml_strip + la…

3fb2c8d

…test fixes

[pre-commit.ci] auto fixes from pre-commit.com hooks

602d75d

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

4ddbbce

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed May 25, 2026

View reviewed changes

danielhanchen and others added 2 commits May 25, 2026 11:41

[pre-commit.ci] auto fixes from pre-commit.com hooks

d537a6d

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed May 25, 2026

View reviewed changes

Merge main into pr-5750 round 6: resolve MCP vs Anthropic web_fetch c…

3726238

…onflicts in chat-adapter + chat-runtime-store

Merge branch 'main' into mcp-servers

2c87ded

danielhanchen merged commit 9a907a8 into unslothai:main May 27, 2026
43 of 45 checks passed

danielhanchen mentioned this pull request May 27, 2026

Studio: add Codex SDK as a chat provider with parallel-calls fan-out #5724

Open

4 tasks

oobabooga mentioned this pull request May 29, 2026

Studio: add stdio MCP server support #5863

Merged

-def list_servers() -> list[dict]:
+def list_enabled_servers() -> list[dict]:
+    conn = get_connection()
+    try:
+        rows = conn.execute(
+            "SELECT * FROM mcp_servers WHERE is_enabled = 1 ORDER BY created_at"
+        ).fetchall()
+        return [dict(row) for row in rows]
+    finally:
+        conn.close()

Uh oh!

Conversation

NilayYadav commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Testing

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot commented May 24, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 24, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 24, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 24, 2026

Uh oh!

danielhanchen commented May 24, 2026

Uh oh!

danielhanchen commented May 25, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

danielhanchen commented May 25, 2026

Uh oh!

danielhanchen commented May 25, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 27, 2026

Uh oh!

danielhanchen commented May 27, 2026

Uh oh!

danielhanchen commented May 27, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

NilayYadav commented May 24, 2026 •

edited

Loading