Commit b94b514

Update mcp docs

1 parent 33b2ccb commit b94b514

1 file changed: docs/MCP.md (+71 additions, -56 deletions)
@@ -1,16 +1,15 @@
 # MCP protocol support
 
-`mistralrs-server` can serve **MCP (Model Control Protocol)** traffic next to the regular OpenAI-compatible HTTP interface!
+`mistralrs-server` can speak the **Model Control Protocol (MCP)** in addition to the regular OpenAI-compatible REST API.
 
-MCP is an open, tool-based protocol that lets clients interact with models through structured *tool calls* instead of free-form HTTP routes.
-
-Under the hood the server uses [`rust-mcp-sdk`](https://crates.io/crates/rust-mcp-sdk) and exposes tools based on the supported modalities of the loaded model.
+At a high level, MCP is an opinionated, tool-based JSON-RPC 2.0 protocol that lets clients interact with models through structured *tool calls* instead of specialised HTTP routes.
+The implementation in Mistral.rs is powered by [`rust-mcp-sdk`](https://crates.io/crates/rust-mcp-sdk) and automatically registers tools based on the modalities supported by the loaded model (text, vision, …).
 
 Exposed tools:
 
 | Tool | Minimum `input` -> `output` modalities | Description |
 | -- | -- | -- |
-| `chat` | `Text` -> `Text` | Wraps the OpenAI `/v1/chat/completions` endpoint. |
+| `chat` | `Text` -> `Text` | Wraps the OpenAI `/v1/chat/completions` endpoint |
 
 
 ---
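To make the structured-tool-call model concrete, here is a minimal sketch of the raw JSON-RPC 2.0 request behind the `chat` tool from the table above. The payload shape follows the MCP spec and the argument names mirror the Python example added later in this diff; a real client performs the `initialize` handshake first, so treat this as an illustration rather than the server's exact contract:

```python
# Sketch: one raw JSON-RPC 2.0 "tools/call" request against the MCP port.
# Assumes the server from the "Running" section below; a full client would
# send an "initialize" request first (see the Python example in this commit).
import json
import urllib.request

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "chat",  # the only tool exposed for Text -> Text models
        "arguments": {"messages": [{"role": "user", "content": "Hello!"}]},
    },
}

req = urllib.request.Request(
    "http://localhost:4321/mcp",
    data=json.dumps(payload).encode("utf-8"),
    # Streamable HTTP clients advertise both JSON and SSE in Accept
    headers={
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    },
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))
```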
@@ -21,27 +20,27 @@ Exposed tools:
 - [Running](#running)
 - [Check if it's working](#check-if-its-working)
 - [Example clients](#example-clients)
-  - [Rust](#rust)
   - [Python](#python)
+  - [Rust](#rust)
   - [HTTP](#http)
-- [Limitations](#limitations)
+- [Limitations \& roadmap](#limitations--roadmap)
 
 ---
 
 ## Running
 
-Start the normal HTTP server and add the `--mcp-port` flag to spin up an MCP server on a separate port:
+Start the normal HTTP server and add the `--mcp-port` flag to expose an MCP endpoint **in parallel** on a separate port (`--port` keeps serving the OpenAI-compatible REST API; the MCP endpoint speaks Streamable HTTP):
 
 ```bash
 ./target/release/mistralrs-server \
-  --port 1234 # OpenAI compatible HTTP API
-  --mcp-port 4321 # MCP protocol endpoint (Streamable HTTP)
+  --port 1234 \
+  --mcp-port 4321 \
   plain -m mistralai/Mistral-7B-Instruct-v0.3
 ```
 
 ## Check if it's working
 
-Run this `curl` command to check the available tools:
+The following `curl` command lists the tools advertised by the server and serves as a quick smoke test:
 
 ```
 curl -X POST http://localhost:4321/mcp \
@@ -56,6 +55,60 @@ curl -X POST http://localhost:4321/mcp \
 
 ## Example clients
 
+
+### Python
+
+The [reference Python SDK](https://pypi.org/project/mcp/) can be installed via:
+
+```bash
+pip install --upgrade mcp
+```
+
+Here is a minimal end-to-end example that initialises a session, lists the available tools and finally sends a chat request:
+
+```python
+import asyncio
+
+from mcp import ClientSession
+from mcp.client.streamable_http import streamablehttp_client
+
+
+SERVER_URL = "http://localhost:4321/mcp"
+
+
+async def main() -> None:
+    # The helper creates an SSE (Server-Sent-Events) transport under the hood
+    async with streamablehttp_client(SERVER_URL) as (read, write, _):
+        async with ClientSession(read, write) as session:
+
+            # --- INITIALIZE ---
+            init_result = await session.initialize()
+            print("Server info:", init_result.serverInfo)
+
+            # --- LIST TOOLS ---
+            tools = await session.list_tools()
+            print("Available tools:", [t.name for t in tools.tools])
+
+            # --- CALL TOOL ---
+            resp = await session.call_tool(
+                "chat",
+                arguments={
+                    "messages": [
+                        {"role": "user", "content": "Hello MCP 👋"},
+                        {"role": "assistant", "content": "Hi there!"}
+                    ],
+                    "maxTokens": 50,
+                    "temperature": 0.7,
+                },
+            )
+            # resp.content is a list[CallToolResultContentItem]; extract text parts
+            text = "\n".join(c.text for c in resp.content if c.type == "text")
+            print("Model replied:", text)
+
+if __name__ == "__main__":
+    asyncio.run(main())
+```
+
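One detail the example above glosses over: per the MCP spec, a tool result also carries an `isError` flag. A slightly more defensive wrapper around the call (a sketch, not part of this commit; `chat_text` is a hypothetical helper name) could look like:

```python
from mcp import ClientSession


async def chat_text(session: ClientSession, prompt: str) -> str:
    """Call the `chat` tool and return the concatenated text parts."""
    resp = await session.call_tool(
        "chat", arguments={"messages": [{"role": "user", "content": prompt}]}
    )
    # isError marks a tool-level failure in a CallToolResult (MCP spec)
    if resp.isError:
        raise RuntimeError(f"chat tool returned an error: {resp.content}")
    return "\n".join(c.text for c in resp.content if c.type == "text")
```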
 ### Rust
 
 ```rust
@@ -105,47 +158,6 @@ async fn main() -> Result<()> {
 }
 ```
 
-### Python
-
-```py
-import asyncio
-from mcp import ClientSession
-from mcp.client.streamable_http import streamablehttp_client
-
-SERVER_URL = "http://localhost:4321/mcp"
-
-async def main() -> None:
-    async with streamablehttp_client(SERVER_URL) as (read, write, _):
-        async with ClientSession(read, write) as session:
-
-            # --- INITIALIZE ---
-            init_result = await session.initialize()
-            print("Server info:", init_result.serverInfo)
-
-            # --- LIST TOOLS ---
-            tools = await session.list_tools()
-            print("Available tools:", [t.name for t in tools.tools])
-
-            # --- CALL TOOL ---
-            resp = await session.call_tool(
-                "chat",
-                arguments={
-                    "messages": [
-                        {"role": "user", "content": "Hello MCP 👋"},
-                        {"role": "assistant", "content": "Hi there!"}
-                    ],
-                    "maxTokens": 50,
-                    "temperature": 0.7,
-                },
-            )
-            # resp.content is a list[CallToolResultContentItem]; extract text parts
-            text = "\n".join(c.text for c in resp.content if c.type == "text")
-            print("Model replied:", text)
-
-if __name__ == "__main__":
-    asyncio.run(main())
-```
-
 ### HTTP
 
 **Call a tool:**
@@ -194,9 +206,12 @@ curl -X POST http://localhost:4321/mcp \
   }'
 ```
 
-## Limitations
+## Limitations & roadmap
+
+The MCP support that ships with the current Mistral.rs release focuses on the **happy path**. A few niceties have not been implemented yet, and PRs are more than welcome:
 
-- Streaming requests are not implemented.
-- No authentication layer is provided – run the MCP port behind a reverse proxy if you need auth.
+1. Streaming token responses (similar to the `stream=true` flag in the OpenAI API).
+2. An authentication layer – if you expose the MCP port publicly, run it behind a reverse proxy that handles auth (e.g. nginx + OIDC).
+3. Additional tools for other modalities such as vision or audio once the underlying crates stabilise.
 
-Contributions to extend MCP coverage (streaming, more tools, auth hooks) are welcome!
+If you would like to work on any of the above, please open an issue first so the work can be coordinated.
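On the first item: until streaming lands on the MCP side, the OpenAI-compatible port started alongside it already supports `stream=true`, so clients can fall back to that. A minimal sketch, assuming the ports and model from the "Running" section and the standard OpenAI SSE chunk format:

```python
# Sketch: stream tokens from the OpenAI-compatible REST port while the MCP
# port does not yet support streaming. Assumes the server from "Running".
import json
import urllib.request

body = {
    "model": "mistralai/Mistral-7B-Instruct-v0.3",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}
req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    for raw in resp:  # the response is a series of SSE "data: ..." lines
        line = raw.decode("utf-8").strip()
        if not line.startswith("data: ") or line == "data: [DONE]":
            continue
        chunk = json.loads(line[len("data: "):])
        delta = chunk["choices"][0]["delta"].get("content") or ""
        print(delta, end="", flush=True)
    print()
```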
