Superset Electron MCP built on Puppeteer#1481
Conversation
Generic package that enables AI agents to interact with the running Electron app: take screenshots, inspect DOM, click elements, type text, send keyboard shortcuts, read console logs, and evaluate JS. Two roles in one package: - Library export: any Electron app imports createAutomationServer() in main process to start an HTTP automation server (dev-only) - MCP binary: standalone stdio server translating MCP tool calls into HTTP requests to the automation server Tools: take_screenshot, inspect_dom, click, type_text, send_keys, get_console_logs, evaluate_js, navigate, get_window_info
…-core Replace the Express HTTP server + manual JS execution architecture with direct CDP (Chrome DevTools Protocol) connection via puppeteer-core. This fixes issues with Radix UI interactions and simplifies the overall design. Architecture: Claude Code ←stdio→ MCP Server ←CDP WebSocket→ Electron Key changes: - Add --remote-debugging-port and --remote-allow-origins flags to Electron - New ConnectionManager for lazy CDP connection with auto-reconnect - New FocusLock to suppress blur events between MCP tool calls - New ConsoleCapture using CDP console events instead of preload injection - All 9 tools rewritten to use puppeteer Page API directly - Remove Express server, HTTP client, and all server/ handlers - Swap express dep for puppeteer-core (Bun + Playwright is incompatible) - Set defaultViewport: null to preserve Electron window size
📝 WalkthroughWalkthroughA new desktop automation MCP server package is introduced, enabling remote control of the Electron app via CDP with tools for DOM inspection, element interaction, console logging, and code execution, integrated into the MCP registry with environment-based port configuration. Changes
Sequence DiagramsequenceDiagram
participant Startup as Electron Startup
participant MCP as MCP Server<br/>(stdio)
participant CM as ConnectionManager
participant Puppeteer as Puppeteer
participant Renderer as Electron<br/>Renderer
participant Tool as Tool<br/>Handler
Startup->>MCP: createMcpServer()
MCP->>CM: new ConnectionManager()
MCP->>MCP: registerTools(ToolContext)
MCP->>MCP: await server.connect(transport)
Note over MCP: Server ready & listening on stdio
rect rgba(0, 150, 255, 0.5)
Note over MCP,Renderer: Client sends tool request (e.g., "click")
MCP->>Tool: Tool handler invoked
Tool->>CM: getPage()
CM->>Puppeteer: browser.connectOverCDP()
Puppeteer->>Renderer: Connect to renderer process
Puppeteer-->>CM: Page object
CM-->>Tool: Page object
Tool->>Renderer: Execute action (click, type, etc.)
Renderer-->>Tool: Result/response
Tool-->>MCP: Formatted MCP response
MCP-->>MCP: Send response via stdio
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🧹 Preview Cleanup CompleteThe following preview resources have been cleaned up:
Thank you for your contribution! 🎉 |
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Fix all issues with AI agents
In `@apps/desktop/src/lib/electron-app/factories/app/setup.ts`:
- Around line 86-91: Replace the raw process.env usage with the validated env
import and update the comment: use env.DESKTOP_AUTOMATION_PORT (already
validated/coerced in env.shared) instead of process.env.DESKTOP_AUTOMATION_PORT
when building cdpPort in the setup block that calls
app.commandLine.appendSwitch, and change the comment text from "playwright-core"
to "puppeteer-core" to reflect the correct automation library.
In `@packages/desktop-mcp/src/mcp/connection/connection-manager.ts`:
- Around line 21-27: The getPage() method has a race where concurrent calls can
both detect no page and call connect() concurrently; introduce a connection
guard (e.g., this.connectingPromise) inside the ConnectionManager so that
getPage() sets this.connectingPromise = this.connect() when no existing promise
exists, awaits that single promise for all callers, and then clears the guard on
success or failure; ensure you still call this.focusLock.inject(this.page) after
the awaited promise resolves and clear the connectingPromise in a finally block
to avoid stalling future attempts.
In `@packages/desktop-mcp/src/mcp/console-capture/console-capture.ts`:
- Around line 33-48: getLogs currently assigns filtered = this.logs which
returns a direct reference to the internal array; change getLogs (the function
named getLogs returning ConsoleLogEntry[]) to always return a copy instead of
the original (e.g., initialize filtered with a shallow copy like
this.logs.slice() or [...this.logs]) so callers never receive a reference to the
internal this.logs; keep the existing filtering by level and slicing for limit
but ensure the initial copy is used.
In `@packages/desktop-mcp/src/mcp/dom-inspector/dom-inspector.ts`:
- Around line 45-47: The code builds cssSelector with the raw testId which can
break the attribute selector when testId contains quotes/backslashes; update the
branch that sets cssSelector (the else if handling testId) to use
CSS.escape(testId) the same way node.id is handled (i.e., build the selector
using '[data-testid="' + CSS.escape(testId) + '"]'). This ensures the generated
selector is properly escaped and robust.
In `@packages/desktop-mcp/src/mcp/focus-lock/focus-lock.ts`:
- Around line 57-65: The "load" event handler in attach (page.on("load", async
() => { ... })) can cause unhandled promise rejections because this.inject(page)
is async and may throw; update the handler to catch and handle errors from
inject(page) (either wrap the body in try/catch or call
this.inject(page).catch(...)) and log or swallow the error appropriately,
keeping the existing semantics of setting this.locked = false and re-injecting
when this.timeout is truthy; reference the attach method, the page.on("load",
...) handler, and the inject(page) call.
In `@packages/desktop-mcp/src/mcp/tools/click/click.ts`:
- Around line 14-16: The selector building in the branch using testId
interpolates the raw testId into a CSS selector
(document.querySelectorAll('[data-testid="' + testId + '"]')[index]) which
breaks for values containing quotes and other special characters; update that
branch in click.ts to escape testId using CSS.escape (or otherwise safely build
the attribute selector) before passing it to document.querySelectorAll so the
selector becomes robust for values like foo"bar, and keep using the same index
lookup logic.
In `@packages/desktop-mcp/src/mcp/tools/evaluate-js/evaluate-js.ts`:
- Around line 14-39: The handler that calls page.evaluate may set
content[0].text to the primitive undefined when the evaluated code returns
undefined; update the logic in the async (args) => { ... } block (the code that
computes result from page.evaluate) so content[0].text is always a string—e.g.
treat undefined explicitly (map undefined to the string "undefined"), otherwise
keep strings as-is and JSON.stringify non-string values; change the expression
that builds text (the ternary using typeof result and JSON.stringify) to first
check for result === undefined and convert it to "undefined" before falling back
to JSON.stringify for objects.
In `@packages/desktop-mcp/src/mcp/tools/navigate/navigate.ts`:
- Around line 21-22: The code passes args.url directly into page.goto causing
potential SSRF/unsafe schemes; update the navigate logic that uses args.url and
page.goto to validate and sanitize the URL first: parse args.url with the URL
constructor (or equivalent), ensure the protocol is either "http:" or "https:",
reject or throw/log and skip navigation for any other scheme (e.g., file:,
javascript:, data:), and only call page.goto when validation succeeds so
navigation is restricted to safe http(s) targets.
In `@packages/desktop-mcp/src/mcp/tools/type-text/type-text.ts`:
- Around line 29-33: The clear-first block uses the macOS-only "Meta" modifier
so selection fails on Windows/Linux; update the args.clearFirst logic to pick
the modifier based on platform (e.g., use 'Meta' when process.platform ===
"darwin" and 'Control' otherwise) and then call page.keyboard.down(modifierKey),
page.keyboard.press("a"), page.keyboard.up(modifierKey) using the chosen
modifier; ensure you reference the existing args.clearFirst check and the
existing page.keyboard.down/press/up calls when making the change.
🧹 Nitpick comments (10)
packages/desktop-mcp/src/mcp/tools/navigate/navigate.ts (1)
10-16: Input schema allows bothurlandpathsimultaneously — consider requiring exactly one.Both fields are optional, and providing both silently ignores
path. A Zod refinement (orz.union/z.discriminatedUnion) would make the intent explicit and prevent ambiguous inputs.packages/desktop-mcp/src/mcp/tools/send-keys/send-keys.ts (2)
56-89:args.keysis typed asunknown— prefer narrowing over repeatedas string[]casts.The
argsparameter from the MCP SDK callback is loosely typed. Instead of castingargs.keys as string[]in two places (Lines 58 and 85), parse/narrow once at the top of the handler and reuse the typed result throughout.Proposed fix
async (args) => { - const page = await getPage(); - const keys = (args.keys as string[]).map(normalizeKey); + const page = await getPage(); + const rawKeys = args.keys as string[]; + const keys = rawKeys.map(normalizeKey); const modifiers = keys.filter((k) => MODIFIER_KEYS.has(k)); const nonModifiers = keys.filter((k) => !MODIFIER_KEYS.has(k)); ... return { content: [ { type: "text" as const, - text: `Sent keys: ${(args.keys as string[]).join("+")}`, + text: `Sent keys: ${rawKeys.join("+")}`, }, ], };
77-78:Array.prototype.reverse()mutates in-place — safe here but consider.toReversed().
modifiers.reverse()mutates themodifiersarray. It's safe becausemodifiersisn't used afterwards, but.toReversed()(ES2023) would make the intent clearer and prevent subtle bugs if code is later added below.packages/desktop-mcp/src/mcp/console-capture/console-capture.ts (1)
18-31: Repeatedattach()calls will stack duplicate listeners on the sameConsoleCaptureinstance.Since
ConsoleCaptureis a singleton onConnectionManager(seeconnection-manager.tsLine 18), each reconnect callsattach()again. While the oldPageobject is discarded, there's no guard against accidental double-attach on the same page. Consider adding a guard or removing the previous listener.Also,
this.logs.shift()on Line 29 is O(n) per insertion when at capacity. For a 500-item cap this is likely fine in practice, but if the cap ever increases, a circular buffer would be preferable.Suggested guard against re-attachment
export class ConsoleCapture { private logs: ConsoleLogEntry[] = []; private maxSize = 500; + private attached = false; attach(page: Page) { + if (this.attached) return; + this.attached = true; page.on("console", (msg: ConsoleMessage) => {Actually, since reconnects produce new
Pageinstances and you want to capture from the new page, a better approach might be to track the current page and re-wire:Alternative: re-wire to new page
export class ConsoleCapture { private logs: ConsoleLogEntry[] = []; private maxSize = 500; + private handler: ((msg: ConsoleMessage) => void) | null = null; + private currentPage: Page | null = null; attach(page: Page) { + // Detach from previous page if any + if (this.currentPage && this.handler) { + this.currentPage.off("console", this.handler); + } + this.currentPage = page; - page.on("console", (msg: ConsoleMessage) => { + this.handler = (msg: ConsoleMessage) => { const level = LEVEL_MAP[msg.type()] ?? 1; ... - }); + }; + page.on("console", this.handler); }packages/desktop-mcp/src/mcp/focus-lock/focus-lock.ts (1)
85-89: Silent catch — add a debug log per project conventions.The
catchblock silently swallows errors. Per project guidelines, errors should at minimum be logged with context. This aids debugging when the page is unexpectedly destroyed.Proposed fix
try { await page.evaluate(UNLOCK_SCRIPT); - } catch { - // page may have navigated or been destroyed + } catch (err) { + console.debug("[focus-lock/unlock] page may have navigated or been destroyed", err); }Based on learnings: "Use prefixed console logging with consistent context pattern: [domain/operation] message for entry/exit of significant operations, external API calls, and error conditions"
packages/desktop-mcp/src/mcp/connection/connection-manager.ts (1)
29-58: Good structure for the connection lifecycle.The lazy-connect + disconnect-handler pattern is clean. The page filtering (skipping
chrome-extension://) and re-injection ofConsoleCapture/FocusLockon connect are well thought out.Two minor observations:
- Magic numbers:
60_000(Line 32) could be a named constant (e.g.,PROTOCOL_TIMEOUT_MS) alongside the existingCDP_PORTfor consistency.- Error context on connect failure: If
puppeteer.connectfails (Electron not running), the raw puppeteer error may be opaque. Consider wrapping with a[desktop-mcp/connect]prefixed message.packages/desktop-mcp/src/zod.ts (1)
41-45: Type asymmetry betweenConsoleLogsRequestSchema.level(string enum) andConsoleLogEntrySchema.level(number).The request schema uses
z.enum(["log", "warn", "error", "info", "debug"])for level, whileConsoleLogEntrySchema(Line 111) stores level asz.number(). The tool handler must map between these representations. This is workable, but could be a source of bugs if a new level string is added to the request enum without updating the numeric mapping inconsole-capture.ts.Consider co-locating the level mapping (e.g., a shared
LEVEL_MAPor a shared enum) so changes stay in sync.packages/desktop-mcp/src/bin.ts (1)
5-7: Add error handling and graceful shutdown.A bare top-level
await server.connect(transport)will produce an unhandled rejection with no context if the connection fails. There's also no cleanup on process signals, so CDP connections may leak when the process is killed.Suggested improvement
-const server = createMcpServer(); -const transport = new StdioServerTransport(); -await server.connect(transport); +const server = createMcpServer(); +const transport = new StdioServerTransport(); + +process.on("SIGINT", async () => { + await server.close(); + process.exit(0); +}); +process.on("SIGTERM", async () => { + await server.close(); + process.exit(0); +}); + +await server.connect(transport).catch((err) => { + console.error("[desktop-mcp/bin] failed to start:", err); + process.exit(1); +});As per coding guidelines: "Never swallow errors silently; at minimum log errors with context before rethrowing or handling them explicitly" and "Use prefixed console logging with consistent context pattern: [domain/operation] message."
packages/desktop-mcp/src/mcp/tools/type-text/type-text.ts (1)
22-46: No error handling around page interactions.Other tools like
evaluate_jswrap page calls in try/catch and returnisError: trueon failure. This tool will throw unhandled ifpage.click(selector)fails (e.g., selector not found). Consider adding a try/catch for consistency with the other tools.packages/desktop-mcp/src/mcp/tools/click/click.ts (1)
73-127: Missing error handling for the element-find-and-click path.Unlike
evaluate_js, this tool doesn't wrap page interactions in try/catch. Ifpage.evaluateorpage.mouse.clickthrows (e.g., navigation mid-click, detached frame), the error surfaces as an unhandled exception rather than a structured MCP error response. Consider wrapping in try/catch and returningisError: truefor consistency.
| // Enable CDP for desktop automation MCP (playwright-core connects via this port) | ||
| if (env.NODE_ENV === "development") { | ||
| const cdpPort = String(process.env.DESKTOP_AUTOMATION_PORT || 9223); | ||
| app.commandLine.appendSwitch("remote-debugging-port", cdpPort); | ||
| app.commandLine.appendSwitch("remote-allow-origins", "*"); | ||
| } |
There was a problem hiding this comment.
Use the parsed env object instead of raw process.env.
Line 88 reads process.env.DESKTOP_AUTOMATION_PORT directly, bypassing the Zod-validated env already imported on line 7 and used elsewhere in this file. This duplicates the default 9223 (already in env.shared.ts) and skips coercion/validation. Also, the comment says "playwright-core" but this PR uses puppeteer-core.
Proposed fix
-// Enable CDP for desktop automation MCP (playwright-core connects via this port)
+// Enable CDP for desktop automation MCP (puppeteer-core connects via this port)
if (env.NODE_ENV === "development") {
- const cdpPort = String(process.env.DESKTOP_AUTOMATION_PORT || 9223);
+ const cdpPort = String(env.DESKTOP_AUTOMATION_PORT);
app.commandLine.appendSwitch("remote-debugging-port", cdpPort);
app.commandLine.appendSwitch("remote-allow-origins", "*");
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Enable CDP for desktop automation MCP (playwright-core connects via this port) | |
| if (env.NODE_ENV === "development") { | |
| const cdpPort = String(process.env.DESKTOP_AUTOMATION_PORT || 9223); | |
| app.commandLine.appendSwitch("remote-debugging-port", cdpPort); | |
| app.commandLine.appendSwitch("remote-allow-origins", "*"); | |
| } | |
| // Enable CDP for desktop automation MCP (puppeteer-core connects via this port) | |
| if (env.NODE_ENV === "development") { | |
| const cdpPort = String(env.DESKTOP_AUTOMATION_PORT); | |
| app.commandLine.appendSwitch("remote-debugging-port", cdpPort); | |
| app.commandLine.appendSwitch("remote-allow-origins", "*"); | |
| } |
🤖 Prompt for AI Agents
In `@apps/desktop/src/lib/electron-app/factories/app/setup.ts` around lines 86 -
91, Replace the raw process.env usage with the validated env import and update
the comment: use env.DESKTOP_AUTOMATION_PORT (already validated/coerced in
env.shared) instead of process.env.DESKTOP_AUTOMATION_PORT when building cdpPort
in the setup block that calls app.commandLine.appendSwitch, and change the
comment text from "playwright-core" to "puppeteer-core" to reflect the correct
automation library.
| async getPage(): Promise<Page> { | ||
| if (this.page && this.browser?.connected) { | ||
| await this.focusLock.inject(this.page); | ||
| return this.page; | ||
| } | ||
| return this.connect(); | ||
| } |
There was a problem hiding this comment.
Race condition: concurrent getPage() calls can trigger multiple connect() invocations.
If two MCP tool calls arrive simultaneously while this.page is null, both will pass the check on Line 22 and call connect() concurrently. This results in duplicate CDP connections, with the second overwriting this.browser/this.page and orphaning the first connection's disconnect handler.
A simple connection promise guard resolves this:
Proposed fix
export class ConnectionManager {
private browser: Browser | null = null;
private page: Page | null = null;
+ private connecting: Promise<Page> | null = null;
readonly consoleCapture = new ConsoleCapture();
readonly focusLock = new FocusLock();
async getPage(): Promise<Page> {
if (this.page && this.browser?.connected) {
await this.focusLock.inject(this.page);
return this.page;
}
- return this.connect();
+ if (!this.connecting) {
+ this.connecting = this.connect().finally(() => {
+ this.connecting = null;
+ });
+ }
+ return this.connecting;
}🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/connection/connection-manager.ts` around lines
21 - 27, The getPage() method has a race where concurrent calls can both detect
no page and call connect() concurrently; introduce a connection guard (e.g.,
this.connectingPromise) inside the ConnectionManager so that getPage() sets
this.connectingPromise = this.connect() when no existing promise exists, awaits
that single promise for all callers, and then clears the guard on success or
failure; ensure you still call this.focusLock.inject(this.page) after the
awaited promise resolves and clear the connectingPromise in a finally block to
avoid stalling future attempts.
| getLogs({ | ||
| level, | ||
| limit, | ||
| }: { | ||
| level?: number; | ||
| limit?: number; | ||
| }): ConsoleLogEntry[] { | ||
| let filtered = this.logs; | ||
| if (level !== undefined) { | ||
| filtered = filtered.filter((log) => log.level === level); | ||
| } | ||
| if (limit !== undefined) { | ||
| filtered = filtered.slice(-limit); | ||
| } | ||
| return filtered; | ||
| } |
There was a problem hiding this comment.
getLogs returns a direct reference to the internal array when no filters are applied.
On Line 40, let filtered = this.logs — when neither level nor limit is provided, the caller receives a direct reference to the internal logs array. External mutations would corrupt internal state.
Proposed fix
getLogs({
level,
limit,
}: {
level?: number;
limit?: number;
}): ConsoleLogEntry[] {
- let filtered = this.logs;
+ let filtered: ConsoleLogEntry[] = this.logs;
if (level !== undefined) {
filtered = filtered.filter((log) => log.level === level);
}
if (limit !== undefined) {
filtered = filtered.slice(-limit);
}
+ // Return a copy if no filter was applied to avoid leaking internal state
+ return filtered === this.logs ? [...filtered] : filtered;
- return filtered;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| getLogs({ | |
| level, | |
| limit, | |
| }: { | |
| level?: number; | |
| limit?: number; | |
| }): ConsoleLogEntry[] { | |
| let filtered = this.logs; | |
| if (level !== undefined) { | |
| filtered = filtered.filter((log) => log.level === level); | |
| } | |
| if (limit !== undefined) { | |
| filtered = filtered.slice(-limit); | |
| } | |
| return filtered; | |
| } | |
| getLogs({ | |
| level, | |
| limit, | |
| }: { | |
| level?: number; | |
| limit?: number; | |
| }): ConsoleLogEntry[] { | |
| let filtered: ConsoleLogEntry[] = this.logs; | |
| if (level !== undefined) { | |
| filtered = filtered.filter((log) => log.level === level); | |
| } | |
| if (limit !== undefined) { | |
| filtered = filtered.slice(-limit); | |
| } | |
| // Return a copy if no filter was applied to avoid leaking internal state | |
| return filtered === this.logs ? [...filtered] : filtered; | |
| } |
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/console-capture/console-capture.ts` around lines
33 - 48, getLogs currently assigns filtered = this.logs which returns a direct
reference to the internal array; change getLogs (the function named getLogs
returning ConsoleLogEntry[]) to always return a copy instead of the original
(e.g., initialize filtered with a shallow copy like this.logs.slice() or
[...this.logs]) so callers never receive a reference to the internal this.logs;
keep the existing filtering by level and slicing for limit but ensure the
initial copy is used.
| } else if (testId) { | ||
| cssSelector = '[data-testid="' + testId + '"]'; | ||
| } else { |
There was a problem hiding this comment.
testId value is not escaped in the CSS selector string.
If data-testid contains a double-quote or backslash, the generated selector [data-testid="..."] will be malformed. Use CSS.escape() for consistency with how node.id is handled on Line 44.
Proposed fix
- cssSelector = '[data-testid="' + testId + '"]';
+ cssSelector = '[data-testid="' + CSS.escape(testId) + '"]';🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/dom-inspector/dom-inspector.ts` around lines 45
- 47, The code builds cssSelector with the raw testId which can break the
attribute selector when testId contains quotes/backslashes; update the branch
that sets cssSelector (the else if handling testId) to use CSS.escape(testId)
the same way node.id is handled (i.e., build the selector using '[data-testid="'
+ CSS.escape(testId) + '"]'). This ensures the generated selector is properly
escaped and robust.
| attach(page: Page) { | ||
| page.on("load", async () => { | ||
| this.locked = false; | ||
| if (this.timeout) { | ||
| // Re-inject if we were still in an active automation session | ||
| await this.inject(page); | ||
| } | ||
| }); | ||
| } |
There was a problem hiding this comment.
Unhandled promise rejection in attach load handler.
The async callback on Line 58 calls this.inject(page) which can throw (e.g., if the page is mid-navigation). Since event handlers don't propagate async errors, this would produce an unhandled promise rejection.
Proposed fix
attach(page: Page) {
- page.on("load", async () => {
+ page.on("load", () => {
this.locked = false;
if (this.timeout) {
- // Re-inject if we were still in an active automation session
- await this.inject(page);
+ this.inject(page).catch(() => {
+ // page may have navigated before injection completed
+ });
}
});
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| attach(page: Page) { | |
| page.on("load", async () => { | |
| this.locked = false; | |
| if (this.timeout) { | |
| // Re-inject if we were still in an active automation session | |
| await this.inject(page); | |
| } | |
| }); | |
| } | |
| attach(page: Page) { | |
| page.on("load", () => { | |
| this.locked = false; | |
| if (this.timeout) { | |
| this.inject(page).catch(() => { | |
| // page may have navigated before injection completed | |
| }); | |
| } | |
| }); | |
| } |
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/focus-lock/focus-lock.ts` around lines 57 - 65,
The "load" event handler in attach (page.on("load", async () => { ... })) can
cause unhandled promise rejections because this.inject(page) is async and may
throw; update the handler to catch and handle errors from inject(page) (either
wrap the body in try/catch or call this.inject(page).catch(...)) and log or
swallow the error appropriately, keeping the existing semantics of setting
this.locked = false and re-injecting when this.timeout is truthy; reference the
attach method, the page.on("load", ...) handler, and the inject(page) call.
| } else if (testId) { | ||
| el = document.querySelectorAll('[data-testid="' + testId + '"]')[index]; | ||
| } else if (text) { |
There was a problem hiding this comment.
testId value is interpolated unsafely into a CSS selector string.
If testId contains a double quote (e.g., foo"bar), the selector breaks and could produce unexpected results. Use CSS.escape or build the selector safely.
Proposed fix
} else if (testId) {
- el = document.querySelectorAll('[data-testid="' + testId + '"]')[index];
+ el = document.querySelectorAll('[data-testid="' + CSS.escape(testId) + '"]')[index];
} else if (text) {🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/click/click.ts` around lines 14 - 16, The
selector building in the branch using testId interpolates the raw testId into a
CSS selector (document.querySelectorAll('[data-testid="' + testId +
'"]')[index]) which breaks for values containing quotes and other special
characters; update that branch in click.ts to escape testId using CSS.escape (or
otherwise safely build the attribute selector) before passing it to
document.querySelectorAll so the selector becomes robust for values like
foo"bar, and keep using the same index lookup logic.
| async (args) => { | ||
| const page = await getPage(); | ||
| try { | ||
| const result = await page.evaluate(args.code as string); | ||
| return { | ||
| content: [ | ||
| { | ||
| type: "text" as const, | ||
| text: | ||
| typeof result === "string" | ||
| ? result | ||
| : JSON.stringify(result, null, 2), | ||
| }, | ||
| ], | ||
| }; | ||
| } catch (error) { | ||
| return { | ||
| content: [ | ||
| { | ||
| type: "text" as const, | ||
| text: `Error: ${String(error)}`, | ||
| }, | ||
| ], | ||
| isError: true, | ||
| }; | ||
| } |
There was a problem hiding this comment.
undefined result from page.evaluate will produce text: undefined.
When the evaluated code returns undefined (common for void expressions like assignments or console.log()), JSON.stringify(undefined, null, 2) returns undefined (the JS primitive), not the string "undefined". This means text will be undefined, which may violate the MCP protocol's expectation of a string value.
Proposed fix
const result = await page.evaluate(args.code as string);
return {
content: [
{
type: "text" as const,
text:
typeof result === "string"
? result
- : JSON.stringify(result, null, 2),
+ : JSON.stringify(result, null, 2) ?? "undefined",
},
],
};🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/evaluate-js/evaluate-js.ts` around lines
14 - 39, The handler that calls page.evaluate may set content[0].text to the
primitive undefined when the evaluated code returns undefined; update the logic
in the async (args) => { ... } block (the code that computes result from
page.evaluate) so content[0].text is always a string—e.g. treat undefined
explicitly (map undefined to the string "undefined"), otherwise keep strings
as-is and JSON.stringify non-string values; change the expression that builds
text (the ternary using typeof result and JSON.stringify) to first check for
result === undefined and convert it to "undefined" before falling back to
JSON.stringify for objects.
| if (args.url) { | ||
| await page.goto(args.url as string); |
There was a problem hiding this comment.
Potential SSRF / unsafe navigation — validate the URL scheme.
args.url is passed directly to page.goto with no validation. A caller could supply file:///etc/passwd or javascript:... URLs. Consider restricting to http(s) schemes.
🛡️ Proposed fix
async (args) => {
const page = await getPage();
if (args.url) {
- await page.goto(args.url as string);
+ const url = new URL(args.url as string);
+ if (!["http:", "https:"].includes(url.protocol)) {
+ return {
+ content: [{ type: "text" as const, text: `Unsupported protocol: ${url.protocol}` }],
+ isError: true,
+ };
+ }
+ await page.goto(args.url as string);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (args.url) { | |
| await page.goto(args.url as string); | |
| if (args.url) { | |
| const url = new URL(args.url as string); | |
| if (!["http:", "https:"].includes(url.protocol)) { | |
| return { | |
| content: [{ type: "text" as const, text: `Unsupported protocol: ${url.protocol}` }], | |
| isError: true, | |
| }; | |
| } | |
| await page.goto(args.url as string); |
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/navigate/navigate.ts` around lines 21 -
22, The code passes args.url directly into page.goto causing potential
SSRF/unsafe schemes; update the navigate logic that uses args.url and page.goto
to validate and sanitize the URL first: parse args.url with the URL constructor
(or equivalent), ensure the protocol is either "http:" or "https:", reject or
throw/log and skip navigation for any other scheme (e.g., file:, javascript:,
data:), and only call page.goto when validation succeeds so navigation is
restricted to safe http(s) targets.
| if (args.clearFirst) { | ||
| // Select all then type to replace | ||
| await page.keyboard.down("Meta"); | ||
| await page.keyboard.press("a"); | ||
| await page.keyboard.up("Meta"); |
There was a problem hiding this comment.
Meta+A only selects all on macOS; Windows/Linux need Control+A.
This will silently fail to clear content on non-macOS platforms. Detect the platform or use a cross-platform approach.
Proposed fix
if (args.clearFirst) {
- // Select all then type to replace
- await page.keyboard.down("Meta");
- await page.keyboard.press("a");
- await page.keyboard.up("Meta");
+ // Select all then type to replace (cross-platform)
+ const modifier = process.platform === "darwin" ? "Meta" : "Control";
+ await page.keyboard.down(modifier);
+ await page.keyboard.press("a");
+ await page.keyboard.up(modifier);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (args.clearFirst) { | |
| // Select all then type to replace | |
| await page.keyboard.down("Meta"); | |
| await page.keyboard.press("a"); | |
| await page.keyboard.up("Meta"); | |
| if (args.clearFirst) { | |
| // Select all then type to replace (cross-platform) | |
| const modifier = process.platform === "darwin" ? "Meta" : "Control"; | |
| await page.keyboard.down(modifier); | |
| await page.keyboard.press("a"); | |
| await page.keyboard.up(modifier); | |
| } |
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/type-text/type-text.ts` around lines 29 -
33, The clear-first block uses the macOS-only "Meta" modifier so selection fails
on Windows/Linux; update the args.clearFirst logic to pick the modifier based on
platform (e.g., use 'Meta' when process.platform === "darwin" and 'Control'
otherwise) and then call page.keyboard.down(modifierKey),
page.keyboard.press("a"), page.keyboard.up(modifierKey) using the chosen
modifier; ensure you reference the existing args.clearFirst check and the
existing page.keyboard.down/press/up calls when making the change.
Summary
Claude Code ←stdio→ MCP Server ←CDP WebSocket→ Electron RendererKey Changes
--remote-debugging-portand--remote-allow-origins=*flags in dev modepage.on("console")events instead of preload script injectionpage.evaluate,page.mouse.click,page.keyboard,page.screenshot, etc.)src/server/handlersexpressforpuppeteer-core(Bun + Playwright has a WebSocket incompatibility)defaultViewport: nullpreserves Electron window sizeTest plan
Summary by CodeRabbit
New Features