Closed
57 commits
6a58f91
server: add MCP protocol type definitions
ochafik Dec 23, 2025
625437e
server: add subprocess management for MCP servers
ochafik Dec 23, 2025
e29d5c6
server: add WebSocket server implementation
ochafik Dec 23, 2025
ac9a585
server: add MCP bridge for routing WebSocket to subprocesses
ochafik Dec 23, 2025
ca5507e
server: integrate MCP support with --mcp-config option
ochafik Dec 23, 2025
91e92fc
webui: add MCP service and type definitions
ochafik Dec 23, 2025
dc1d2f9
webui: add MCP state management stores
ochafik Dec 23, 2025
78ec1f4
webui: add tool call and result display components
ochafik Dec 23, 2025
7b8e9c2
webui: add MCP server management UI
ochafik Dec 23, 2025
83b49d0
webui: integrate MCP tool calling with chat
ochafik Dec 23, 2025
17efe37
server: add MCP tests
ochafik Dec 23, 2025
e9179dd
server: add MCP documentation and example config
ochafik Dec 23, 2025
ad6ed55
webui: redesign MCP picker to match model selector style
ochafik Dec 23, 2025
70e0f90
webui: only show tool status when calling or complete
ochafik Dec 24, 2025
745fb7b
webui: move model/stats bar below tool blocks
ochafik Dec 24, 2025
19a2a91
fix: address CI failures
ochafik Dec 24, 2025
ad134cd
feat: add cwd attribute to MCP server config
ochafik Dec 24, 2025
462cc65
Update index.html.gz
ochafik Dec 24, 2025
ee9ca7a
webui: add MCP SDK dependency and handle all tool result content types
ochafik Dec 24, 2025
b92f477
chore: update webui build output
ochafik Dec 24, 2025
0257e6b
revert: remove unrelated hf_repo name change from branch
ochafik Dec 24, 2025
08b51dc
server: add --webui-mcp flag and cleanup MCP bridge
ochafik Dec 24, 2025
42558ac
fix: handle Sec-WebSocket-Protocol header in MCP WebSocket handshake
ochafik Dec 24, 2025
fc773f6
update index.html.gz
ochafik Dec 24, 2025
c5efe84
refactor: harden WebSocket server and simplify MCP types
ochafik Dec 24, 2025
6cbaaaf
chore: update webui build output
ochafik Dec 24, 2025
2e25d63
chore: drop /mcp/ws-port endpoint, simplify WS port discovery
ochafik Dec 24, 2025
4c703a2
chore: lower WS connection limit to 10 (env configurable)
ochafik Dec 24, 2025
0a6da98
docs: add subprocess.h evaluation to PR TODOs
ochafik Dec 24, 2025
07a0203
chore: update webui build output
ochafik Dec 24, 2025
cee558f
refactor: replace server-mproc with subprocess.h
ochafik Dec 24, 2025
663dfb1
fix: subprocess security hardening
ochafik Dec 24, 2025
8a5d2e5
update index.html.gz
ochafik Dec 24, 2025
771a80f
update index.html.gz
ochafik Dec 24, 2025
bcc0f97
fix: WebSocket connection leak + reuse existing SHA1/base64 libs
ochafik Dec 24, 2025
4de84d0
fix: MCP WebSocket mutex deadlock + Processing... display
ochafik Dec 24, 2025
cde3ee7
test: fix MCP unit tests and add webui_mcp flag support
ochafik Dec 24, 2025
1cf7588
test: add env var filtering test and use --mcp-config flag
ochafik Dec 24, 2025
ef9a21f
refactor: extract MCP echo server to fixtures file
ochafik Dec 24, 2025
6aacf85
debug: add console logging for MCP tool calls
ochafik Dec 24, 2025
568709a
test: use official MCP SDK in Python tests
ochafik Dec 24, 2025
4a18dc0
style: fix formatting in MCP service and chat store
ochafik Dec 24, 2025
8d82104
fix: increase WebSocket timeout to prevent MCP connection drops
ochafik Dec 24, 2025
1b159bd
refactor: reduce verbose MCP console logging
ochafik Dec 24, 2025
321ebf5
feat: add bearer token authentication to WebSocket server
ochafik Dec 24, 2025
e74ea7d
custom SSE + HTTP POST transport
ngxson Dec 25, 2025
2724d3f
Merge pull request #5 from ngxson/xsn/chafik_webui_mcp_idea
ochafik Dec 25, 2025
5e8ee5c
refactor(mcp): inline MCP bridge, use streaming HTTP proxy
ochafik Dec 25, 2025
d054762
Update index.html.gz
ochafik Dec 25, 2025
1598b35
fix: add missing mutex include for Windows, fix pyright type error
ochafik Dec 25, 2025
75b2a84
fix: remove unused sha1-ws library (WebSocket leftover)
ochafik Dec 25, 2025
386cde9
fix: restore CORS headers (Access-Control-Allow-Credentials, wildcard…
ochafik Dec 25, 2025
cc13d95
fix: remove WebSocket tests (WebSocket support was removed)
ochafik Dec 25, 2025
0b6e088
refactor: unroll header erase loop for clarity
ochafik Dec 25, 2025
ca3dae2
feat: add HTTPS support for MCP proxy
ochafik Dec 25, 2025
69b684c
feat: add fs_get_config_directory() and fs_get_config_file()
ochafik Dec 25, 2025
23d1b00
feat(mcp): restore WebSocket <-> stdio bridge for local MCP servers
ochafik Dec 25, 2025
15 changes: 15 additions & 0 deletions common/arg.cpp
@@ -2697,6 +2697,21 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
params.ssl_file_cert = value;
}
).set_examples({LLAMA_EXAMPLE_SERVER}).set_env("LLAMA_ARG_SSL_CERT_FILE"));
add_opt(common_arg(
{"--mcp-config"}, "FNAME",
"path to MCP server configuration JSON file",
[](common_params & params, const std::string & value) {
params.mcp_config = value;
}
).set_examples({LLAMA_EXAMPLE_SERVER}).set_env("LLAMA_ARG_MCP_CONFIG"));
add_opt(common_arg(
{"--webui-mcp"},
{"--no-webui-mcp"},
"enable MCP/WebSocket support on HTTP_PORT + 1 (default: disabled)",
[](common_params & params, bool value) {
params.webui_mcp = value;
}
).set_examples({LLAMA_EXAMPLE_SERVER}).set_env("LLAMA_ARG_WEBUI_MCP"));
add_opt(common_arg(
{"--chat-template-kwargs"}, "STRING",
string_format("sets additional params for the json template parser"),
56 changes: 56 additions & 0 deletions common/common.cpp
@@ -941,6 +941,62 @@ std::string fs_get_cache_file(const std::string & filename) {
return cache_directory + filename;
}

std::string fs_get_config_directory() {
std::string config_directory = "";
auto ensure_trailing_slash = [](std::string p) {
if (p.back() != DIRECTORY_SEPARATOR) {
p += DIRECTORY_SEPARATOR;
}
return p;
};
if (getenv("LLAMA_CONFIG")) {
config_directory = std::getenv("LLAMA_CONFIG");
} else {
#if defined(__linux__) || defined(__FreeBSD__) || defined(_AIX) || defined(__OpenBSD__)
if (std::getenv("XDG_CONFIG_HOME")) {
config_directory = std::getenv("XDG_CONFIG_HOME");
} else if (std::getenv("HOME")) {
config_directory = std::getenv("HOME") + std::string("/.config/");
} else {
#if defined(__linux__)
struct passwd *pw = getpwuid(getuid());
if ((!pw) || (!pw->pw_dir)) {
throw std::runtime_error("Failed to find $HOME directory");
}
config_directory = std::string(pw->pw_dir) + std::string("/.config/");
#else
throw std::runtime_error("Failed to find $HOME directory");
#endif
}
#elif defined(__APPLE__)
// Use ~/.llama.cpp/ for simplicity on macOS
config_directory = std::getenv("HOME") + std::string("/.llama.cpp");
#elif defined(_WIN32)
config_directory = std::getenv("APPDATA");
#elif defined(__EMSCRIPTEN__)
GGML_ABORT("not implemented on this platform");
#else
# error Unknown architecture
#endif
config_directory = ensure_trailing_slash(config_directory);
#if !defined(__APPLE__)
// On macOS, we use ~/.llama.cpp/ directly (already includes llama.cpp)
config_directory += "llama.cpp";
#endif
}
return ensure_trailing_slash(config_directory);
}

std::string fs_get_config_file(const std::string & filename) {
GGML_ASSERT(filename.find(DIRECTORY_SEPARATOR) == std::string::npos);
std::string config_directory = fs_get_config_directory();
const bool success = fs_create_directory_with_parents(config_directory);
if (!success) {
throw std::runtime_error("failed to create config directory: " + config_directory);
}
return config_directory + filename;
}

std::vector<common_file_info> fs_list(const std::string & path, bool include_directories) {
std::vector<common_file_info> files;
if (path.empty()) return files;
7 changes: 7 additions & 0 deletions common/common.h
@@ -485,6 +485,10 @@ struct common_params {

std::map<std::string, std::string> default_template_kwargs;

// MCP config
std::string mcp_config = ""; // NOLINT
bool webui_mcp = false; // NOLINT

// webui configs
bool webui = true;
std::string webui_config_json;
@@ -661,6 +665,9 @@ bool fs_is_directory(const std::string & path);
std::string fs_get_cache_directory();
std::string fs_get_cache_file(const std::string & filename);

std::string fs_get_config_directory();
std::string fs_get_config_file(const std::string & filename);

struct common_file_info {
std::string path;
std::string name;
7 changes: 7 additions & 0 deletions tools/server/CMakeLists.txt
@@ -38,6 +38,10 @@ set(TARGET_SRCS
server-http.h
server-models.cpp
server-models.h
server-ws.cpp
server-ws.h
server-mcp-stdio.cpp
server-mcp-stdio.h
server-task.cpp
server-task.h
server-queue.cpp
@@ -46,6 +50,8 @@
server-common.h
server-context.cpp
server-context.h
# SHA1 for WebSocket handshake
${CMAKE_SOURCE_DIR}/examples/gguf-hash/deps/sha1/sha1.c
)
set(PUBLIC_ASSETS
index.html.gz
@@ -69,6 +75,7 @@ install(TARGETS ${TARGET} RUNTIME)

target_include_directories(${TARGET} PRIVATE ../mtmd)
target_include_directories(${TARGET} PRIVATE ${CMAKE_SOURCE_DIR})
target_include_directories(${TARGET} PRIVATE ${CMAKE_SOURCE_DIR}/examples/gguf-hash/deps)
target_link_libraries(${TARGET} PRIVATE server-context PUBLIC common cpp-httplib ${CMAKE_THREAD_LIBS_INIT})

if (WIN32)
87 changes: 87 additions & 0 deletions tools/server/README.md
@@ -1679,6 +1679,93 @@ Apart from error types supported by OAI, we also have custom types that are spec
}
```

### MCP (Model Context Protocol) Support

The server supports [MCP](https://modelcontextprotocol.io/) for integrating external tools. MCP enables models to interact with external services like file systems, databases, APIs, and more.

The server acts as an HTTP proxy for remote MCP servers, handling CORS for browser-based clients.

#### MCP Configuration

Create an MCP configuration file (JSON format) with remote MCP server URLs:

```json
{
"mcpServers": {
"brave-search": {
"url": "http://127.0.0.1:38180/mcp",
"headers": {
"Authorization": "Bearer your-api-key"
}
},
"filesystem": {
"url": "http://127.0.0.1:38181/mcp"
}
}
}
```
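A config entry is treated as a remote server when it has a `url` key, and as a local stdio server when it has a `command` key (see `mcp_config.example.json` below). The following sketch mirrors that distinction in Python for illustration only; `classify_mcp_servers` is a hypothetical helper, not part of llama-server, whose actual parsing lives in its C++ MCP config loader.

```python
import json

def classify_mcp_servers(config_text):
    """Split mcpServers entries into remote (url-based) and stdio (command-based)."""
    config = json.loads(config_text)
    remote, stdio = {}, {}
    for name, entry in config.get("mcpServers", {}).items():
        if name.startswith("_comment"):
            continue  # the example config uses _comment* keys as inline docs
        if "url" in entry:
            remote[name] = entry
        elif "command" in entry:
            stdio[name] = entry
    return remote, stdio

example = """
{
  "mcpServers": {
    "brave-search": {"url": "http://127.0.0.1:38180/mcp",
                     "headers": {"Authorization": "Bearer your-api-key"}},
    "filesystem": {"url": "http://127.0.0.1:38181/mcp"}
  }
}
"""
remote, stdio = classify_mcp_servers(example)
print(sorted(remote))  # ['brave-search', 'filesystem']
```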

#### MCP Configuration Location

The server looks for MCP configuration in the following order:
1. `--mcp-config` command-line argument
2. `LLAMA_MCP_CONFIG` environment variable
3. `~/.llama.cpp/mcp.json` (Linux/macOS)
4. `%APPDATA%/llama.cpp/mcp.json` (Windows)
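The lookup order above can be sketched as a small resolution function. This is a Python illustration of the documented precedence only, assuming the paths listed above; `resolve_mcp_config_path` is hypothetical and does not exist in llama-server.

```python
import os

def resolve_mcp_config_path(cli_arg=None, env=None):
    """Return the MCP config path following the documented precedence."""
    env = os.environ if env is None else env
    if cli_arg:                        # 1. --mcp-config command-line argument
        return cli_arg
    if env.get("LLAMA_MCP_CONFIG"):    # 2. LLAMA_MCP_CONFIG environment variable
        return env["LLAMA_MCP_CONFIG"]
    if os.name == "nt":                # 4. %APPDATA%\llama.cpp\mcp.json (Windows)
        return os.path.join(env.get("APPDATA", ""), "llama.cpp", "mcp.json")
    # 3. ~/.llama.cpp/mcp.json (Linux/macOS)
    return os.path.join(os.path.expanduser("~"), ".llama.cpp", "mcp.json")

print(resolve_mcp_config_path("/path/to/mcp.json"))  # the CLI flag wins
```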

#### MCP Usage

```bash
# Enable MCP with --webui-mcp flag
./llama-server -m model.gguf --webui-mcp

# Specify config path
./llama-server -m model.gguf --webui-mcp --mcp-config /path/to/mcp.json

# Or use environment variable
LLAMA_MCP_CONFIG=/path/to/mcp.json ./llama-server -m model.gguf --webui-mcp
```

#### MCP API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/mcp/servers` | GET | List available MCP server names from config |
| `/mcp?server=<name>` | GET | Proxy GET requests to remote MCP server (SSE streams) |
| `/mcp?server=<name>` | POST | Proxy POST requests to remote MCP server (JSON-RPC) |

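A client first discovers configured servers via `GET /mcp/servers`, then speaks JSON-RPC 2.0 over `POST /mcp?server=<name>`. The sketch below only constructs such a request body (no network I/O); the base URL and server name are assumptions taken from the examples in this section, not values the server mandates.

```python
import json

# Assumed for illustration: llama-server on port 8080, proxying to the
# "filesystem" entry from the example config above.
base_url = "http://127.0.0.1:8080"
proxy_url = f"{base_url}/mcp?server=filesystem"

# MCP JSON-RPC 2.0 request to enumerate the remote server's tools.
request_body = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
    "params": {},
})
print(proxy_url)
print(request_body)
```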
#### MCP Protocol

The server proxies requests to remote MCP servers using the [Streamable HTTP transport](https://modelcontextprotocol.io/specification/2025-11-25/basic/transports). The web UI uses the official `@modelcontextprotocol/sdk` client.

For more information about MCP, see the [Model Context Protocol documentation](https://modelcontextprotocol.io/).

#### Example MCP Servers

Here's how to run some example MCP servers that work with the default config:

**Brave Search** (requires `BRAVE_API_KEY` environment variable - get one at https://brave.com/search/api/):

```bash
BRAVE_API_KEY=your-key-here npx -y @anthropic-ai/mcp-server-brave-search --transport http --port 38180
```

**Python interpreter** (with common data science packages):

```bash
uvx mcp-run-python --deps numpy,pandas,pydantic,requests,httpx,sympy,aiohttp streamable-http --port 38181
```

**Run both together** using `concurrently`:

```bash
BRAVE_API_KEY=your-key-here npx -y concurrently \
"npx -y @anthropic-ai/mcp-server-brave-search --transport http --port 38180" \
"uvx mcp-run-python --deps numpy,pandas,pydantic,requests,httpx,sympy,aiohttp streamable-http --port 38181"
```

Then update `mcp_config.example.json` with your settings and start llama-server with `--webui-mcp`.

### Legacy completion web UI

A chat-based UI replaced the old completion-based UI in [this PR](https://github.com/ggml-org/llama.cpp/pull/10175). To use the old completion UI, start the server with `--path ./tools/server/public_legacy`.
41 changes: 41 additions & 0 deletions tools/server/mcp_config.example.json
@@ -0,0 +1,41 @@
{
"_comment": "Example MCP configuration for llama.cpp",
"_comment_location": "Configuration file locations (checked in order):",
"_comment_location_env": " 1. Path specified in LLAMA_MCP_CONFIG environment variable",
"_comment_location_unix": " 2. ~/.llama.cpp/mcp.json (macOS/Linux)",
"_comment_location_windows": " 3. %APPDATA%\\llama.cpp\\mcp.json (Windows)",
"_comment_types": "This file supports two types of MCP servers:",
"_comment_stdio": " - Local stdio servers: spawned and managed by llama-server (using command, args, env)",
"_comment_remote": " - Remote HTTP servers: proxied through llama-server with CORS support (using url, headers)",

"mcpServers": {
"_comment_section_stdio": "=== Local stdio servers (spawned by llama-server) ===",

"brave-search": {
"_comment": "Brave Search MCP server - provides web search capabilities",
"_comment_key": "Get your API key at https://brave.com/search/api/",
"command": "npx",
"args": ["-y", "@anthropic-ai/claude-code-mcp-brave-search"],
"env": {
"BRAVE_API_KEY": "..."
}
},

"python": {
"_comment": "Python execution MCP server - run Python code with common data science libraries",
"command": "uvx",
"args": ["mcp-run-python", "--deps", "numpy,pandas,pydantic,requests,httpx,sympy,aiohttp", "stdio"]
},

"_comment_section_remote": "=== Remote HTTP servers (proxied by llama-server) ===",

"remote-api": {
"_comment": "Example remote MCP server with authentication",
"_comment_usage": "The llama-server proxies requests to this URL and adds CORS headers",
"url": "http://127.0.0.1:38180/mcp",
"headers": {
"Authorization": "Bearer YOUR_TOKEN"
}
}
}
}
Binary file modified tools/server/public/index.html.gz
42 changes: 38 additions & 4 deletions tools/server/server-http.cpp
@@ -1,9 +1,11 @@
#include "common.h"
#include "server-http.h"
#include "server-mcp.h"
#include "server-common.h"

#include <cpp-httplib/httplib.h>

#include <algorithm>
#include <functional>
#include <string>
#include <thread>
@@ -200,13 +202,24 @@ bool server_http_context::init(const common_params & params) {
};

// register server middlewares
srv->set_pre_routing_handler([middleware_validate_api_key, middleware_server_state](const httplib::Request & req, httplib::Response & res) {
res.set_header("Access-Control-Allow-Origin", req.get_header_value("Origin"));
srv->set_pre_routing_handler([middleware_validate_api_key, middleware_server_state, webui_mcp = params.webui_mcp](const httplib::Request & req, httplib::Response & res) {
// Get Origin header (browsers always send this)
std::string origin = req.get_header_value("Origin");
if (!origin.empty()) {
res.set_header("Access-Control-Allow-Origin", origin);
}

// If this is OPTIONS request, skip validation because browsers don't include Authorization header
if (req.method == "OPTIONS") {
res.set_header("Access-Control-Allow-Credentials", "true");
res.set_header("Access-Control-Allow-Methods", "GET, POST");
res.set_header("Access-Control-Allow-Headers", "*");
// Include MCP protocol headers for CORS only if MCP is enabled
if (webui_mcp) {
res.set_header("Access-Control-Allow-Headers", "*, mcp-session-id, mcp-protocol-version");
res.set_header("Access-Control-Expose-Headers", "mcp-session-id");
} else {
res.set_header("Access-Control-Allow-Headers", "*");
}
res.set_content("", "text/html"); // blank response, no data
return httplib::Server::HandlerResponse::Handled; // skip further processing
}
@@ -302,10 +315,14 @@ bool server_http_context::start() {
return true;
}

void server_http_context::stop() const {
void server_http_context::stop() {
if (pimpl->srv) {
pimpl->srv->stop();
}
// Wait for server thread to finish
if (thread.joinable()) {
thread.join();
}
}

static void set_headers(httplib::Response & res, const std::map<std::string, std::string> & headers) {
@@ -398,3 +415,20 @@ void server_http_context::post(const std::string & path, const server_http_conte
});
}

bool server_http_context::load_mcp_config(const std::string & config_path) {
try {
mcp_config = std::make_shared<llama_mcp_config>(config_path);
return true;
} catch (const std::exception & e) {
LOG_ERR("%s: failed to load MCP config: %s\n", __func__, e.what());
return false;
}
}

std::optional<mcp_server_config> server_http_context::get_mcp_server(const std::string & name) {
return mcp_config ? mcp_config->get_server(name) : std::nullopt;
}

std::vector<std::string> server_http_context::get_mcp_server_names() {
return mcp_config ? mcp_config->get_available_servers() : std::vector<std::string>{};
}
18 changes: 17 additions & 1 deletion tools/server/server-http.h
@@ -1,10 +1,15 @@
#pragma once

#include "server-mcp.h"

#include <atomic>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <optional>

struct common_params;

@@ -60,12 +65,23 @@ struct server_http_context {
std::string hostname;
int port;

// MCP configuration (for HTTP proxy)
std::shared_ptr<llama_mcp_config> mcp_config;

// Load MCP config from JSON file
bool load_mcp_config(const std::string & config_path);

// Get MCP server config by name
std::optional<mcp_server_config> get_mcp_server(const std::string & name);
// Get list of available MCP server names
std::vector<std::string> get_mcp_server_names();

server_http_context();
~server_http_context();

bool init(const common_params & params);
bool start();
void stop() const;
void stop();

// note: the handler should never throw exceptions
using handler_t = std::function<server_http_res_ptr(const server_http_req & req)>;