Skip to content

security: make it clear about subtopics in server#18754

Merged
ngxson merged 2 commits intoggml-org:masterfrom
ngxson:xsn/security_policy_server_improve
Jan 11, 2026
Merged

security: make it clear about subtopics in server#18754
ngxson merged 2 commits intoggml-org:masterfrom
ngxson:xsn/security_policy_server_improve

Conversation

@ngxson
Copy link
Contributor

@ngxson ngxson commented Jan 11, 2026

Make it clear which parts of the server are excluded.

MCP is mentioned here because we will have stdio implementation in the future (provided by subprocess.h), as well as a CORS proxy. These features are unsafe to be used publicly, it's only intended to be used via localhost or private deployment.

@ngxson ngxson requested a review from ggerganov as a code owner January 11, 2026 11:33
@ServeurpersoCom
Copy link
Contributor

The advantage of client-side MCP is that small servers script can be provided in all languages ​​to cover all scenarios, under the user responsibility (and in most cases harmless on a local client machine, as the TCP port is not exposed to the internet). A backend proxy cannot be securely implemented if the server does not support authentication and encryption; therefore, it is better to delegate this to web servers like Apache2 and Nginx, and thus under the system administrator's responsibility
If we want to create an integrated backend service, it will need to be able to be disabled or on a separate endpoint so that it can be filtered/protected.

@ngxson
Copy link
Contributor Author

ngxson commented Jan 11, 2026

@ServeurpersoCom I don't mean to have MCP run completely on server, but just provide a simple CORS proxy to allow using MCP servers that doesn't have CORS (PoC here). But this is not very important for now, we can add it later, after the client-only implementation works correctly.

The thing I'm more concern is that we absolutely need to support stdio <--> web ui proxy at some points. This is intended to be used only on private deployment (or using web ui locally like an desktop app), configurable via mcp.json file.

If we support stdio, then MCP server w/o CORS can be defined the same way, and web ui will see them as one category, let's say backend-proxied MCP server.

@ServeurpersoCom
Copy link
Contributor

ServeurpersoCom commented Jan 11, 2026

@ServeurpersoCom I don't mean to have MCP run completely on server, but just provide a simple CORS proxy to allow using MCP servers that doesn't have CORS (PoC here). But this is not very important for now, we can add it later, after the client-only implementation works correctly.

The thing I'm more concern is that we absolutely need to support stdio <--> web ui proxy at some points. This is intended to be used only on private deployment (or using web ui locally like an desktop app), configurable via mcp.json file.

If we support stdio, then MCP server w/o CORS can be defined the same way, and web ui will see them as one category, let's say backend-proxied MCP server.

I didn't specify, but that's exactly what I was implying. Proxying / CORS! Indeed, Apache2 and Nginx are configurable reverse proxies and obviously SOTA in terms of security since many years

@ngxson
Copy link
Contributor Author

ngxson commented Jan 11, 2026

Apache2 and Nginx are configurable reverse proxies and obviously SOTA in terms of security since many years

There are 2 reasons why using apache2/nginx will not be a practical choice:

  1. The web ui MCP support is intended to be used in a kinda "desktop app" way, and asking user to configure apache2/nginx themself and no one will use it. An mcp.json should be the most optimal UX as it can be shared among other applications that also support MCP
  2. Because we will add stdio support as I said earlier (ref this PoC), the web ui will see proxied MCP servers as stdio MCP servers, there is no distinguish between stdio/proxied SSE from the perspective of REST API. So the problem here is not whether to support proxy SSE/CORS. In other words apache2/nginx is irrelevant here. The root problem is whether to provide (and how to provide) MCP endpoint in server

If we concern about security, a simple fix would be force user to set an API key if they deploy it to anything outside of 127.0.0.1

@ngxson
Copy link
Contributor Author

ngxson commented Jan 11, 2026

In other words, the MCP endpoint will be designed like this:

graph TD
    Fontend <--Custom SSE,HTTP--> Server 

    subgraph Server
        Stream_proxy
    end

    Stream_proxy <-- stdio --> Child_process
    Stream_proxy <-- SSE,HTTP,WS --> External_MCP_server
Loading

So CORS proxy is mostly the same mechanism as stdio, just different data source

@ServeurpersoCom
Copy link
Contributor

ServeurpersoCom commented Jan 11, 2026

Your architecture works fine for localhost desktop usage. But if llama-server binds beyond 127.0.0.1, Apache2 or Nginx becomes mandatory for security. They handle auth, rate limiting, SSL, and audit logs that a custom C++ API key can't match. The mcp.json UX is good but doesn't conflict with putting Apache2 in front. Default llama-server to 127.0.0.1 only, and require reverse proxy for network deployment. Apache2 can whitelist MCP destinations and enforce limits via simple config.

I don't understand why stdio (stdin/stdout/stderr -> local cmds) is mixed with CORS (network), but I trust you on this, I'll test the implementations.

@ngxson
Copy link
Contributor Author

ngxson commented Jan 11, 2026

I don't understand why stdio (stdin/stdout/stderr -> local cmds) is mixed with CORS (network), but I trust you on this, I'll test the implementations.

To put it in a simple way, inside a browser, from a domain A, if you want to interact with domain B, then domain B need to set the correct CORS header. As explained in my PoC, that's why fetch("https://huggingface.co") doesn't work from localhost website.

So the solution is a proxy to allow forwarding data (duplex) between localhost web <--> localhost server <--> destination website, while stripping or modifying the CORS proxy.

If you think it this way: stdio is just a website that doesn't allow CORS, then it's the same solution: a proxy/bridge that forward the data between subprocess's stdio to the localhost web/server

Or in other words, it's a version of the famous cgi-bin, but except that it's not in cgi-bin format (so we cannot use apache2/nginx)

@ServeurpersoCom
Copy link
Contributor

There is no technical conflict, it will work the same with a reverse proxy

@ngxson
Copy link
Contributor Author

ngxson commented Jan 11, 2026

Yeah agree, it's mostly like a built-in reverse proxy in llama-server.

In fact, we build this way mostly for using llama-server + web UI as a desktop app (same story for the router functionality), so I think the security policy that I'm writing already reflected this point, while also saying that this use case is not covered by the security program.

I think the security program on server should only target use case of public network deployment, so things like GHSA-8947-pfff-2f3c is still covered.

(If that sounds good to you, please approve this PR)

@ngxson
Copy link
Contributor Author

ngxson commented Jan 11, 2026

The mcp.json UX is good but doesn't conflict with putting Apache2 in front.

Just to clarify that you can have both mcp.json via llama-server built-in reverse proxy + Apache2 on top of llama-server, there is no problem with this.

What I want to prevent is just someone spot a problem with MCP and report them as vulnerability. As for now it's not that critical because we're considering use case of MCP to be localhost

@ServeurpersoCom
Copy link
Contributor

Of course, I approve. We'll wait for Georgi to take a look at the merge (?), because I merged the first PR quickly; it was urgent, I must say :)

@ngxson
Copy link
Contributor Author

ngxson commented Jan 11, 2026

Thanks, I'm merging it now, Georgi already commented on the other one about DoS so I think it's ok (he already acknowledged about this change)

@ngxson ngxson merged commit 4b060bf into ggml-org:master Jan 11, 2026
2 checks passed
@ggerganov
Copy link
Member

Yup, looks good.

@ServeurpersoCom About the merging - best practice is to always wait the author of the PR to do the merge themselves, even for simple changes.

@ServeurpersoCom
Copy link
Contributor

Yup, looks good.

@ServeurpersoCom About the merging - best practice is to always wait the author of the PR to do the merge themselves, even for simple changes.

I'll make a note of that! I'm used at work, where when you get a PR, it means it's completed (if not a draft), and an author must not merge himself

gary149 pushed a commit to gary149/llama-agent that referenced this pull request Jan 13, 2026
* security: make it clear about subtopics in server

* exclude DoS
dillon-blake pushed a commit to Boxed-Logic/llama.cpp that referenced this pull request Jan 15, 2026
* security: make it clear about subtopics in server

* exclude DoS
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants