🎨 feat: MCP UI basic integration #9299
Conversation
mawburn left a comment:
Looks good for the most part! A few comments:
This could be a security issue by allowing untrusted sources to run iframes without something like this:

```jsx
<iframe
  sandbox="allow-scripts allow-same-origin allow-forms"
  src={resource.text}
/>
```

I don't see the iframe-specific code in this PR, but it looks like you can inject this without directly modifying those pieces.
Should we have loading states for the iframes or am I missing that?
We're missing a lot of tests, especially for the parser logic and component rendering
```ts
} catch (error) {
  console.error('Error parsing ui_resources:', error);
}
```
Should we have something to fallback to? Will this break the UI?
I'm going to test by sending badly formatted JSON, thanks!
@mawburn actually this is what we get with a parsing error. I'm going to also wrap the UI Resources title conditionally upon parsing success, good catch man!
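For the fallback question, one defensive approach is to return an empty array on malformed JSON so the chat UI renders nothing (and can hide the "UI Resources" title) instead of breaking. This is a sketch of the idea only; the `safeParseUIResources` helper and the `UIResource` shape shown here are illustrative names, not code from this PR:

```typescript
// Hypothetical helper: parse the serialized ui_resources payload defensively.
// On malformed JSON, or a payload that isn't an array, fall back to an empty
// array so the surrounding UI simply skips the UI Resources section.
type UIResource = { uri: string; mimeType: string; text: string };

function safeParseUIResources(raw: string): UIResource[] {
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as UIResource[]) : [];
  } catch (error) {
    console.error('Error parsing ui_resources:', error);
    return [];
  }
}
```

The conditional title rendering mentioned above can then simply check `resources.length > 0`.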
```ts
onUIAction={async (result) => {
  console.log('Action:', result);
}}
```
Hi @mawburn, thanks for the comments! I'm from the mcp-ui team, and also worked on integrating it into Shopify. Check it out here: https://mcpui.dev/guide/client/resource-renderer#security-considerations
@mawburn Regarding the tests, this is what I said in the description. I first want to validate the idea by @danny-avila before taking the time to write extensive tests, so as not to waste efforts.
packages/api/src/mcp/parsers.ts (outdated)
```diff
  'bedrock',
]);
-const CONTENT_ARRAY_PROVIDERS = new Set(['google', 'anthropic', 'azureopenai', 'openai']);
+const CONTENT_ARRAY_PROVIDERS = new Set(['google', 'anthropic', 'openai']);
```
is removing azureopenai on purpose?
packages/api/src/mcp/parsers.ts (outdated)
```ts
}

if (uiResources.length) {
  currentTextBlock += `<ui_resources>${JSON.stringify(uiResources)}</ui_resources>`;
}
```
formattedContent is already an array of objects that have a type; couldn't we do something like

```ts
formattedContent.push({ type: 'text', text: currentTextBlock });
formattedContent.push({ type: 'ui_resources', data: uiResources });
```

so we don't have to parse the text with a regexp later?
Thanks @sbruel, this was actually my initial approach. The issue here is that this formattedContent is sent to the LLM, and so we can only use types that the main LLMs recognize.
Here's the error message from OpenAI when using that approach:

```
An error occurred while processing the request:
400 Invalid value: 'resource'. Supported values are: 'text', 'image_url', 'input_audio', 'refusal', 'audio', and 'file'.
```
```ts
// Extract ui_resources from the output to display them in the UI
let uiResources: UIResource[] = [];
if (output?.includes('ui_resources')) {
```
We probably need more solid error handling in case the output includes `ui_resources` as plain text, for some reason, without being a proper UI resource.
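One way to harden this would be to validate the shape of each parsed entry before rendering, so stray `ui_resources` text that doesn't decode into well-formed resources is simply ignored. A sketch; the `isRenderableUIResource` guard is a hypothetical name, and the accepted mime types are taken from the two cases described in this PR:

```typescript
// Hypothetical type guard: only accept entries that look like proper MCP UI
// resources, i.e. a ui:// URI plus one of the two supported mime types.
function isRenderableUIResource(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false;
  const r = value as { uri?: unknown; mimeType?: unknown; text?: unknown };
  return (
    typeof r.uri === 'string' &&
    r.uri.startsWith('ui://') &&
    (r.mimeType === 'text/uri-list' || r.mimeType === 'text/html') &&
    typeof r.text === 'string'
  );
}
```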
Summary
Here's a video describing my intention.
There's a project called MCP UI. Its goal is to allow MCP server tool calls to send back not just text content to the client, but also UI elements in various forms (HTML, or URLs to embed as iframes). That library also defines a way for the client and the iframe elements to communicate through message passing.
Some popular clients like Postman (see LinkedIn announcement) and Goose by Block (see blog post announcement) have already built support.
A few weeks ago, Shopify announced agent-kit (see tweet by Tobi the CEO), which contains MCP servers serving UI resources so that any client can integrate Shopify commerce components into chat AI experiences. All the work on Shopify's side to test these components was done on a branch of our local fork of LibreChat.
Now we want to make such integrations available to a wider public, so that any LibreChat user can start tinkering with MCP Servers sending back UI resources (coming from Shopify or elsewhere).
Please note that I'm first waiting to have the general idea validated before writing detailed tests.
Example of an MCP Server tool serving UI Resources as a URL
If you use Anthropic's MCP inspector, you can use the following public MCP Server:
Connect to it, then go to the tools, and make a query to `search_shop_catalog`, entering the following value in both the `query` and `context` parameters: "Looking for men's sneakers in red color under $200". You'll get this as the full result.
We see that the content array contains one object of type `text` and then multiple objects of type `resource` (see the MCP docs on Resources). Here's a sample one:

```json
{
  "type": "resource",
  "resource": {
    "uri": "ui://product/gid://shopify/Product/7009816019024",
    "mimeType": "text/uri-list",
    "text": "https://mcpstorefront.com/img/storefront/product.component.html?store_domain=allbirds.com&product_handle=womens-tree-toppers-natural-black-blizzard&product_id=gid://shopify/Product/7009816019024&mode=tool"
  }
}
```

The way these UI Resources are identified is by their `uri` starting with the `ui://` prefix.

Example of an MCP Server tool serving UI Resources as HTML
Now in Anthropic's MCP Inspector, use the following public MCP Server:
Make a query to the `get-weather` tool. You'll get this as the full result.
The UI Resource looks like this:
```json
{
  "type": "resource",
  "resource": {
    "uri": "ui://mcp-aharvard/weather-card",
    "mimeType": "text/html",
    "text": "<HTML CODE>"
  }
  // ...
}
```

So we see that we can have either `mimeType` as `text/uri-list`, with the `text` property being a URL to embed, or `mimeType` as `text/html`, with the `text` property being HTML code for the client to render.

The current behaviour
So what happens before our changes if we connect LibreChat to that MCP server and make a request to that tool?
Let's add 2 MCP servers supporting UI Resources to the `librechat.yaml` configuration file:
This is what we get in the section where we see the tool call details (see this gist for the exact value):
If we replace the `\n` by actual line breaks, this is what we get:

This is coming from the following code in `packages/api/src/mcp/parsers.ts`:

LibreChat/packages/api/src/mcp/parsers.ts
Lines 143 to 161 in 3ab1bd6
What the parser currently does is convert the UI resource coming from the MCP tool response into text content to be sent to the LLM, which simply doesn't know what to do with it.
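The identification rule described earlier (a `uri` starting with `ui://`) can be sketched as a small predicate over MCP content items. Names and types here are illustrative, not the PR's actual code:

```typescript
// MCP tool results carry a content array; resource items nest the actual
// resource object. UI resources are recognized purely by the ui:// URI prefix.
type MCPContentItem =
  | { type: 'text'; text: string }
  | { type: 'resource'; resource: { uri: string; mimeType: string; text: string } };

function isUIResourceItem(item: MCPContentItem): boolean {
  return item.type === 'resource' && item.resource.uri.startsWith('ui://');
}
```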
What we are changing
Now, when the backend receives such a response containing UI resources, we don't parse them as text elements with simple line breaks (which don't get rendered properly in the front-end anyway).
Instead, we accumulate them in an array, then encode its value as base64 as an additional object in the `formattedContent` array with `metadata: 'ui_resources'` in order to identify it on the frontend. This way, once the front-end receives that output, it can parse it back into UI resources more easily.
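The encoding step described above could look roughly like this. This is a sketch of the idea under the assumptions stated in the description (base64 payload, `metadata: 'ui_resources'`, and a text-compatible `type` so LLM providers don't reject an unknown content type); the function names are hypothetical:

```typescript
// Sketch: serialize the accumulated UI resources, base64-encode them, and
// append them as one extra text-typed entry tagged with metadata so the
// frontend can find and decode it later.
type UIResource = { uri: string; mimeType: string; text: string };

function encodeUIResources(uiResources: UIResource[]): string {
  // Buffer covers Node; in a browser context btoa-based encoding would be used.
  return Buffer.from(JSON.stringify(uiResources), 'utf-8').toString('base64');
}

function appendUIResourcesBlock(
  formattedContent: Array<{ type: string; text?: string; metadata?: string }>,
  uiResources: UIResource[],
): void {
  if (uiResources.length) {
    formattedContent.push({
      type: 'text',
      text: encodeUIResources(uiResources),
      metadata: 'ui_resources',
    });
  }
}
```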
Another option we explored was to not use the text content at all, but rather use the `artifact` variable, which LangChain provides as a way to pass data down through the pipeline without sending it to the LLM, and without having to serialize and deserialize it. The logic is handled below:

LibreChat/packages/api/src/mcp/parsers.ts
Lines 178 to 183 in 3ab1bd6

However, we realized that, unlike what is described in the LangChain docs, the LibreChat agents library was parsing the artifacts as text and adding them to the text content sent to the LLM. I proposed an update for that, which was rejected since that approach was being used as a workaround for other purposes. Also, getting the artifacts in the frontend where we also get the content wasn't straightforward, so it seemed easier to use the existing data structure rather than sending the artifacts as such to the frontend rendering code.
Once the front-end receives the text output, it extracts then parses the encoded resources and removes them from the text output:
LibreChat/client/src/components/Chat/Messages/Content/ToolCallInfo.tsx
Lines 57 to 70 in 87b95d8
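Conceptually, this extraction is the inverse of the encoding step. The sketch below assumes the base64 payload is wrapped in `<ui_resources>` tags, as in the parser snippet shown earlier in this thread; the exact wire format and the helper name are assumptions, not the PR's code:

```typescript
// Sketch: pull an embedded <ui_resources>...</ui_resources> block (assumed to
// contain base64-encoded JSON) out of the tool output, decode and parse it,
// and return both the cleaned text and the recovered resources.
type UIResource = { uri: string; mimeType: string; text: string };

const UI_RESOURCES_RE = /<ui_resources>([\s\S]*?)<\/ui_resources>/;

function extractUIResources(output: string): { text: string; resources: UIResource[] } {
  const match = output.match(UI_RESOURCES_RE);
  if (!match) return { text: output, resources: [] };
  let resources: UIResource[] = [];
  try {
    resources = JSON.parse(Buffer.from(match[1], 'base64').toString('utf-8'));
  } catch (error) {
    // Malformed payload: keep the resources empty rather than breaking the UI.
    console.error('Error parsing ui_resources:', error);
  }
  return { text: output.replace(UI_RESOURCES_RE, '').trim(), resources };
}
```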
It then displays the UI elements in MCP UI's `UIResourceRenderer` component, which abstracts away the iframe, if there is a single UI resource, or in a custom `UIResourceGrid` component if there are multiple elements, which displays the UI resources as a grid:

LibreChat/client/src/components/Chat/Messages/Content/ToolCallInfo.tsx
Lines 89 to 106 in 9cb70b7
We get something like this for multiple UI resources shared as embeddable URLs:
In this current iteration, when the user clicks on a UI element that should trigger an action, we simply log the intent in the console:
And we get something like this for a single UI Resource embedded as HTML rendered by the client:
What we will do next
In a follow-up PR, we would like to:
- Add to cart