Studio: harden stdio MCP gating and fix transport edge cases#5892
Conversation
- Gate the Data Recipe stdio path behind UNSLOTH_STUDIO_ALLOW_STDIO_MCP so a hosted deployment cannot spawn local processes through recipes - Enforce the gate inside _client() so the transport sink cannot spawn when disabled - keep_alive=False so stdio probes/calls do not leave orphan subprocesses - Force OAuth off for stdio servers on create and update - Drop stored headers when a server switches transport type - Reject a command whose first token is a URL scheme - Add MCP gate and improvement tests
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Code Review
This pull request introduces security gates and configuration improvements for stdio Model Context Protocol (MCP) servers, ensuring local subprocesses are only spawned when explicitly allowed on the host, and that subprocesses are torn down on exit. It also adds validation to reject command strings starting with URL schemes and normalizes OAuth and header configurations during server creation and updates. Feedback on the changes highlights an edge case in update_mcp_server where updating use_oauth to True on an existing stdio server without changing its URL could bypass the OAuth restriction, and suggests resolving the URL dynamically to enforce use_oauth = False.
| changes = _changes_from_payload(payload) | ||
| if not changes: | ||
| raise HTTPException(status_code = 400, detail = "No fields to update") | ||
| # headers == HTTP headers (remote) or env vars (stdio). On a transport-type | ||
| # switch with no new headers, drop the old ones so env secrets are not | ||
| # re-sent as HTTP headers (or vice versa). | ||
| if ( | ||
| "url" in changes | ||
| and is_stdio(changes["url"]) != is_stdio(old["url"]) |
There was a problem hiding this comment.
There is an edge case where a user can update use_oauth to True on an existing stdio server without changing the URL. Since _changes_from_payload only checks changes["url"], updating only use_oauth on a stdio server would bypass this check and persist use_oauth = True in the database. We should resolve the resulting URL using changes.get("url", old["url"]) and force use_oauth = False if it is a stdio server.
| changes = _changes_from_payload(payload) | |
| if not changes: | |
| raise HTTPException(status_code = 400, detail = "No fields to update") | |
| # headers == HTTP headers (remote) or env vars (stdio). On a transport-type | |
| # switch with no new headers, drop the old ones so env secrets are not | |
| # re-sent as HTTP headers (or vice versa). | |
| if ( | |
| "url" in changes | |
| and is_stdio(changes["url"]) != is_stdio(old["url"]) | |
| changes = _changes_from_payload(payload) | |
| if not changes: | |
| raise HTTPException(status_code = 400, detail = "No fields to update") | |
| if is_stdio(changes.get("url", old["url"])): | |
| changes["use_oauth"] = False | |
| # headers == HTTP headers (remote) or env vars (stdio). On a transport-type | |
| # switch with no new headers, drop the old ones so env secrets are not | |
| # re-sent as HTTP headers (or vice versa). | |
| if "url" in changes and is_stdio(changes["url"]) != is_stdio(old["url"]) \ | |
| and "headers_json" not in changes: | |
| changes["headers_json"] = None |
The data_designer plugin is only installed in the Studio test job, so guard the two build_mcp_providers tests with importorskip so the core matrix skips them instead of failing on ModuleNotFoundError.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45a7342e8c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if "url" in changes and is_stdio(changes["url"]): | ||
| changes["use_oauth"] = False |
There was a problem hiding this comment.
Keep OAuth disabled for existing stdio updates
When the stored row is already a stdio command, a client can still PUT only {"use_oauth": true}; this guard only runs when the request also includes url, so changes preserves use_oauth=True and persists it. Subsequent refresh/discovery for that server pass the stored flag into probe_timeout(..., use_oauth), so the stdio probe takes the 305s OAuth timeout path instead of the intended 60s stdio path. Normalize against the effective URL (old URL unless a new one was supplied) before saving.
Useful? React with 👍 / 👎.
Follow up to #5863 (stdio MCP server support). That PR added the
UNSLOTH_STUDIO_ALLOW_STDIO_MCPgate for the chat MCP path. This tightens a few edge cases and closes one gap so the "no local stdio MCP on a hosted host" guarantee holds across all of Studio, not just the chat path.Security
build_mcp_providersand the/data-recipe/mcp/toolsroute construct aLocalStdioMCPProvider, which spawns a local subprocess, with no host gate. A recipe carried onto a hosted, Colab or network deployment could therefore still spawn local processes even when the chat path was locked down. Both now honorstdio_mcp_enabled(), and the route returns a clear "disabled on this host" result._client(). The five call sites already gate, but checking again at the single transport sink makes "disabled means cannot spawn" true by construction for any future caller.Correctness and robustness
keep_alive=FalseonStdioTransportso a one-shot probe or tool call tears the subprocess down on exit instead of leaving an orphan.use_oauth=truepushed the probe onto the 305s OAuth timeout instead of the 60s stdio timeout.ftp://,ws://,file://, a typo) with a clear 400 instead of trying to exec it. A://inside an argument is still allowed.Tests
tests/test_mcp_stdio_improvements.pycovers the new behavior:_client()self-gate, OAuth normalization, header drop on switch, URL-scheme rejection, and the Data Recipe gate.tests/test_mcp_stdio_pr5863.pyadds gate coverage for the create, test, refresh, discovery and execute paths from Studio: add stdio MCP server support #5863.