Studio: harden stdio MCP gating and fix transport edge cases#5892

Merged
danielhanchen merged 3 commits into main from studio-stdio-mcp-hardening on May 31, 2026

@danielhanchen
Member

Follow up to #5863 (stdio MCP server support). That PR added the UNSLOTH_STUDIO_ALLOW_STDIO_MCP gate for the chat MCP path. This tightens a few edge cases and closes one gap so the "no local stdio MCP on a hosted host" guarantee holds across all of Studio, not just the chat path.

Security

  • Gate the Data Recipe stdio path. build_mcp_providers and the /data-recipe/mcp/tools route construct a LocalStdioMCPProvider, which spawns a local subprocess, with no host gate. A recipe carried onto a hosted, Colab or network deployment could therefore still spawn local processes even when the chat path was locked down. Both now honor stdio_mcp_enabled(), and the route returns a clear "disabled on this host" result.
  • Enforce the gate inside _client(). The five call sites already gate, but checking again at the single transport sink makes "disabled means cannot spawn" true by construction for any future caller.
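A minimal sketch of the "gate at the sink" idea from the two bullets above. The names stdio_mcp_enabled and UNSLOTH_STUDIO_ALLOW_STDIO_MCP come from this PR; the accepted truthy spellings, the StdioMCPDisabledError exception and the open_stdio_client function are assumptions made for illustration and are not the Studio implementation.

```python
import os

ENV_FLAG = "UNSLOTH_STUDIO_ALLOW_STDIO_MCP"


class StdioMCPDisabledError(RuntimeError):
    """Hypothetical error raised when stdio MCP is disabled on this host."""


def stdio_mcp_enabled(environ=None) -> bool:
    # Assumption: the flag is opt-in and only a few truthy spellings enable it.
    env = os.environ if environ is None else environ
    return env.get(ENV_FLAG, "").strip().lower() in {"1", "true", "yes", "on"}


def open_stdio_client(command: str, environ=None) -> dict:
    # Hypothetical stand-in for the transport sink (_client() in the PR).
    # Checking here gates every caller, including one that forgot to check.
    if not stdio_mcp_enabled(environ):
        raise StdioMCPDisabledError("stdio MCP servers are disabled on this host")
    # A real sink would spawn the subprocess here; the sketch only describes it.
    return {"transport": "stdio", "command": command}
```

Call sites can still check the gate early to return a friendlier message, but with the check in the sink a missing call-site check can no longer result in a spawned process.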

Correctness and robustness

  • keep_alive=False on StdioTransport so a one-shot probe or tool call tears the subprocess down on exit instead of leaving an orphan.
  • Force OAuth off for stdio servers on create and update. OAuth is HTTP only and the stdio transport ignores it, but a stale use_oauth=true pushed the probe onto the 305s OAuth timeout instead of the 60s stdio timeout.
  • Drop stored headers on a transport-type switch (stdio to http or back) when the edit does not supply new ones, so stdio env vars are never re-sent as HTTP headers to a remote endpoint, and vice versa.
  • Reject a command whose first token starts with a URL scheme (for example ftp://, ws://, file://, or a mistyped scheme) with a clear 400 instead of trying to exec it. A :// inside a later argument is still allowed.
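The URL-scheme rejection in the last bullet could look roughly like the sketch below. The function name validate_stdio_command, the regular expression and the use of ValueError (which a route would translate into a 400) are assumptions for illustration. The sketch also assumes that http:// and https:// values have already been routed to the HTTP transport before this check runs.

```python
import re
import shlex

# RFC 3986 scheme syntax followed by "://", anchored at the start of a token.
_URL_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def validate_stdio_command(command: str) -> list[str]:
    """Hypothetical validator: return argv, or raise ValueError."""
    argv = shlex.split(command)
    if not argv:
        raise ValueError("command is empty")
    # Only the first token (the executable) is checked, so a URL passed as
    # an argument to the server is still accepted.
    if _URL_SCHEME.match(argv[0]):
        raise ValueError(f"command starts with a URL scheme: {argv[0]!r}")
    return argv
```

Checking only the executable token keeps legitimate invocations such as a server that takes a ws:// endpoint as an argument working.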

Tests

  • tests/test_mcp_stdio_improvements.py covers the new behavior: _client() self-gate, OAuth normalization, header drop on switch, URL-scheme rejection, and the Data Recipe gate.
  • tests/test_mcp_stdio_pr5863.py adds gate coverage for the create, test, refresh, discovery and execute paths from #5863 (Studio: add stdio MCP server support).
  • Full MCP suite green:
python -m pytest tests/test_mcp_stdio_improvements.py tests/test_mcp_stdio_pr5863.py tests/test_mcp_servers.py
82 passed

- Gate the Data Recipe stdio path behind UNSLOTH_STUDIO_ALLOW_STDIO_MCP so a hosted deployment cannot spawn local processes through recipes
- Enforce the gate inside _client() so the transport sink cannot spawn when disabled
- keep_alive=False so stdio probes/calls do not leave orphan subprocesses
- Force OAuth off for stdio servers on create and update
- Drop stored headers when a server switches transport type
- Reject a command whose first token is a URL scheme
- Add MCP gate and improvement tests
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces security gates and configuration improvements for stdio Model Context Protocol (MCP) servers, ensuring local subprocesses are only spawned when explicitly allowed on the host, and that subprocesses are torn down on exit. It also adds validation to reject command strings starting with URL schemes and normalizes OAuth and header configurations during server creation and updates. Feedback on the changes highlights an edge case in update_mcp_server where updating use_oauth to True on an existing stdio server without changing its URL could bypass the OAuth restriction, and suggests resolving the URL dynamically to enforce use_oauth = False.

Comment on lines 171 to +179
changes = _changes_from_payload(payload)
if not changes:
    raise HTTPException(status_code = 400, detail = "No fields to update")
# headers == HTTP headers (remote) or env vars (stdio). On a transport-type
# switch with no new headers, drop the old ones so env secrets are not
# re-sent as HTTP headers (or vice versa).
if (
    "url" in changes
    and is_stdio(changes["url"]) != is_stdio(old["url"])

medium

There is an edge case where a user can update use_oauth to True on an existing stdio server without changing the URL. Because the OAuth guard in update_mcp_server only runs when "url" is present in changes, updating only use_oauth on a stdio server would bypass the check and persist use_oauth = True in the database. We should resolve the resulting URL using changes.get("url", old["url"]) and force use_oauth = False if it is a stdio server.

Suggested change
changes = _changes_from_payload(payload)
if not changes:
    raise HTTPException(status_code = 400, detail = "No fields to update")
if is_stdio(changes.get("url", old["url"])):
    changes["use_oauth"] = False
# headers == HTTP headers (remote) or env vars (stdio). On a transport-type
# switch with no new headers, drop the old ones so env secrets are not
# re-sent as HTTP headers (or vice versa).
if "url" in changes and is_stdio(changes["url"]) != is_stdio(old["url"]) \
        and "headers_json" not in changes:
    changes["headers_json"] = None
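For readers who want to try the suggested logic in isolation, here is a self-contained sketch. The helper normalize_update is hypothetical, and the is_stdio shown here is a stand-in that treats anything other than an http(s) URL as a stdio command; the real is_stdio() in Studio may use a different rule.

```python
def is_stdio(url: str) -> bool:
    # Assumption for this sketch only: anything that is not http(s) is a
    # stdio command. The real is_stdio() in Studio may differ.
    return not url.lower().startswith(("http://", "https://"))


def normalize_update(changes: dict, old: dict) -> dict:
    """Hypothetical helper applying both rules to an update payload."""
    out = dict(changes)
    # Resolve against the effective URL: the new one if supplied, else the
    # stored one. This covers a payload that only sets use_oauth.
    effective_url = out.get("url", old["url"])
    if is_stdio(effective_url):
        out["use_oauth"] = False
    # On a transport-type switch with no new headers, drop the stored ones
    # so env vars are not re-sent as HTTP headers (or vice versa).
    if (
        "url" in out
        and is_stdio(out["url"]) != is_stdio(old["url"])
        and "headers_json" not in out
    ):
        out["headers_json"] = None
    return out
```

With a stored stdio row, a payload of {"use_oauth": True} alone comes back as {"use_oauth": False}, which is the case the review comment describes.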

The data_designer plugin is only installed in the Studio test job, so guard
the two build_mcp_providers tests with importorskip so the core matrix skips
them instead of failing on ModuleNotFoundError.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 45a7342e8c

Comment on lines +157 to +158
if "url" in changes and is_stdio(changes["url"]):
    changes["use_oauth"] = False

P2: Keep OAuth disabled for existing stdio updates

When the stored row is already a stdio command, a client can still PUT only {"use_oauth": true}; this guard only runs when the request also includes url, so changes preserves use_oauth=True and persists it. Subsequent refresh/discovery for that server pass the stored flag into probe_timeout(..., use_oauth), so the stdio probe takes the 305s OAuth timeout path instead of the intended 60s stdio path. Normalize against the effective URL (old URL unless a new one was supplied) before saving.
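The effect of a stale flag on the probe timeout can be illustrated with a small sketch. The 305s and 60s values are the ones quoted in this PR; the probe_timeout signature shown here, the precedence of the OAuth flag, the effective_use_oauth helper and the DEFAULT_HTTP_TIMEOUT_S placeholder are assumptions and may not match the actual Studio code.

```python
OAUTH_TIMEOUT_S = 305.0         # OAuth probe timeout quoted in the PR
STDIO_TIMEOUT_S = 60.0          # stdio probe timeout quoted in the PR
DEFAULT_HTTP_TIMEOUT_S = 30.0   # placeholder, not taken from the PR


def probe_timeout(is_stdio_server: bool, use_oauth: bool) -> float:
    # Assumed ordering: the OAuth flag wins, which is what makes a stale
    # use_oauth=True on a stdio row select the long timeout.
    if use_oauth:
        return OAUTH_TIMEOUT_S
    if is_stdio_server:
        return STDIO_TIMEOUT_S
    return DEFAULT_HTTP_TIMEOUT_S


def effective_use_oauth(stored_flag: bool, is_stdio_server: bool) -> bool:
    # Hypothetical normalization: OAuth is HTTP only, so it is never
    # effective for a stdio server regardless of the stored flag.
    return stored_flag and not is_stdio_server
```

Under these assumptions, probe_timeout(True, True) selects the 305s path, while probe_timeout(True, effective_use_oauth(True, True)) selects the intended 60s path.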

@danielhanchen danielhanchen merged commit 4b6a733 into main May 31, 2026
32 checks passed
@danielhanchen danielhanchen deleted the studio-stdio-mcp-hardening branch May 31, 2026 11:38