Skip to content

Superset Electron MCP built on Puppeteer#1481

Merged
saddlepaddle merged 2 commits into
mainfrom
remote-control-electron
Feb 14, 2026
Merged

Superset Electron MCP built on Puppeteer#1481
saddlepaddle merged 2 commits into
mainfrom
remote-control-electron

Conversation

@saddlepaddle
Copy link
Copy Markdown
Collaborator

@saddlepaddle saddlepaddle commented Feb 14, 2026

Summary

  • Replace Express HTTP server + manual JS execution with direct CDP connection via puppeteer-core
  • New architecture: Claude Code ←stdio→ MCP Server ←CDP WebSocket→ Electron Renderer
  • Fixes Radix UI interaction issues (Select, Dropdown) by using proper CDP input events instead of synthetic dispatches
  • Net reduction of ~90 lines while adding focus-lock, console-capture, and auto-reconnect capabilities

Key Changes

  • Electron: Add --remote-debugging-port and --remote-allow-origins=* flags in dev mode
  • ConnectionManager: Lazy CDP connection via puppeteer-core with auto-reconnect on disconnect
  • FocusLock: Suppresses blur/focusout events when app loses focus to keep Radix UI dropdowns open between MCP tool calls
  • ConsoleCapture: Uses CDP page.on("console") events instead of preload script injection
  • All 9 tools: Rewritten from HTTP fetch calls to direct puppeteer Page API (page.evaluate, page.mouse.click, page.keyboard, page.screenshot, etc.)
  • Removed: Express server, HTTP client, all src/server/ handlers
  • Deps: Swapped express for puppeteer-core (Bun + Playwright has a WebSocket incompatibility)
  • Viewport: defaultViewport: null preserves Electron window size

Test plan

  • Screenshot captures full Electron window at correct resolution
  • Click by text, selector, aria-label, and coordinates all work
  • Navigate between views (Tasks, Settings, Workspaces)
  • Type into input fields and save via blur
  • Send keyboard shortcuts (Cmd+A, Backspace, etc.)
  • Evaluate arbitrary JS in renderer
  • DOM inspection returns element tree
  • Window info returns correct URL, title, viewport
  • Console log capture works
  • Viewport doesn't shrink on connect (defaultViewport: null)
  • Typecheck passes

Summary by CodeRabbit

New Features

  • Added a desktop automation system enabling remote scripting and control of the application through standardized protocols
  • Introduced automation tools for clicking UI elements, typing text, navigating pages, capturing screenshots, executing JavaScript code, inspecting DOM structure, and accessing console logs
  • Enabled Chrome DevTools Protocol remote debugging to support advanced automation scenarios and development workflows

Generic package that enables AI agents to interact with the running
Electron app: take screenshots, inspect DOM, click elements, type text,
send keyboard shortcuts, read console logs, and evaluate JS.

Two roles in one package:
- Library export: any Electron app imports createAutomationServer() in
  main process to start an HTTP automation server (dev-only)
- MCP binary: standalone stdio server translating MCP tool calls into
  HTTP requests to the automation server

Tools: take_screenshot, inspect_dom, click, type_text, send_keys,
get_console_logs, evaluate_js, navigate, get_window_info
…-core

Replace the Express HTTP server + manual JS execution architecture with
direct CDP (Chrome DevTools Protocol) connection via puppeteer-core. This
fixes issues with Radix UI interactions and simplifies the overall design.

Architecture: Claude Code ←stdio→ MCP Server ←CDP WebSocket→ Electron

Key changes:
- Add --remote-debugging-port and --remote-allow-origins flags to Electron
- New ConnectionManager for lazy CDP connection with auto-reconnect
- New FocusLock to suppress blur events between MCP tool calls
- New ConsoleCapture using CDP console events instead of preload injection
- All 9 tools rewritten to use puppeteer Page API directly
- Remove Express server, HTTP client, and all server/ handlers
- Swap express dep for puppeteer-core (Bun + Playwright is incompatible)
- Set defaultViewport: null to preserve Electron window size
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 14, 2026

📝 Walkthrough

Walkthrough

A new desktop automation MCP server package is introduced, enabling remote control of the Electron app via CDP with tools for DOM inspection, element interaction, console logging, and code execution, integrated into the MCP registry with environment-based port configuration.

Changes

Cohort / File(s) Summary
Configuration & Registry
.mcp.json, apps/desktop/package.json, apps/desktop/src/shared/env.shared.ts
Added MCP server entry for desktop-automation, workspace dependency on @superset/desktop-mcp, and DESKTOP_AUTOMATION_PORT env var (default 9223).
Electron App Integration
apps/desktop/src/lib/electron-app/factories/app/setup.ts
Enabled CDP remote debugging in development mode via DESKTOP_AUTOMATION_PORT for automated desktop testing.
MCP Server Core
packages/desktop-mcp/package.json, packages/desktop-mcp/src/bin.ts, packages/desktop-mcp/src/index.ts, packages/desktop-mcp/src/mcp/mcp-server.ts, packages/desktop-mcp/src/mcp/index.ts
Entry point configuration, server factory, and MCP server initialization with stdio transport.
Connection Management
packages/desktop-mcp/src/mcp/connection/connection-manager.ts, packages/desktop-mcp/src/mcp/connection/index.ts
ConnectionManager class for Puppeteer-based CDP connection with lazy connection, auto-reconnect, and console/focus-lock injection on reconnect.
Utilities
packages/desktop-mcp/src/mcp/console-capture/console-capture.ts, packages/desktop-mcp/src/mcp/console-capture/index.ts, packages/desktop-mcp/src/mcp/focus-lock/focus-lock.ts, packages/desktop-mcp/src/mcp/focus-lock/index.ts, packages/desktop-mcp/src/mcp/dom-inspector/dom-inspector.ts, packages/desktop-mcp/src/mcp/dom-inspector/index.ts
ConsoleCapture for logging browser messages, FocusLock for automation-aware focus handling, and DOM_INSPECTOR_SCRIPT for element discovery and metadata extraction.
Tool Implementations
packages/desktop-mcp/src/mcp/tools/click/click.ts, packages/desktop-mcp/src/mcp/tools/evaluate-js/evaluate-js.ts, packages/desktop-mcp/src/mcp/tools/get-console-logs/get-console-logs.ts, packages/desktop-mcp/src/mcp/tools/get-window-info/get-window-info.ts, packages/desktop-mcp/src/mcp/tools/inspect-dom/inspect-dom.ts, packages/desktop-mcp/src/mcp/tools/navigate/navigate.ts, packages/desktop-mcp/src/mcp/tools/send-keys/send-keys.ts, packages/desktop-mcp/src/mcp/tools/take-screenshot/take-screenshot.ts, packages/desktop-mcp/src/mcp/tools/type-text/type-text.ts, packages/desktop-mcp/src/mcp/tools/.../*.../index.ts
Nine tool registrations for DOM interaction (click, type-text, send-keys), navigation, inspection (inspect-dom, get-window-info), code execution (evaluate-js), automation feedback (get-console-logs, take-screenshot).
Tool Registry & Orchestration
packages/desktop-mcp/src/mcp/tools/index.ts
ToolContext interface and registerTools function aggregating all tool registrations into a unified setup routine.
Type Definitions & Schema
packages/desktop-mcp/src/zod.ts, packages/desktop-mcp/tsconfig.json
Zod schemas for request/response payloads across all tools with inferred TypeScript types and TypeScript configuration for the package.

Sequence Diagram

sequenceDiagram
    participant Startup as Electron Startup
    participant MCP as MCP Server<br/>(stdio)
    participant CM as ConnectionManager
    participant Puppeteer as Puppeteer
    participant Renderer as Electron<br/>Renderer
    participant Tool as Tool<br/>Handler

    Startup->>MCP: createMcpServer()
    MCP->>CM: new ConnectionManager()
    MCP->>MCP: registerTools(ToolContext)
    MCP->>MCP: await server.connect(transport)
    Note over MCP: Server ready & listening on stdio

    rect rgba(0, 150, 255, 0.5)
    Note over MCP,Renderer: Client sends tool request (e.g., "click")
    MCP->>Tool: Tool handler invoked
    Tool->>CM: getPage()
    CM->>Puppeteer: browser.connectOverCDP()
    Puppeteer->>Renderer: Connect to renderer process
    Puppeteer-->>CM: Page object
    CM-->>Tool: Page object
    Tool->>Renderer: Execute action (click, type, etc.)
    Renderer-->>Tool: Result/response
    Tool-->>MCP: Formatted MCP response
    MCP-->>MCP: Send response via stdio
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • PR #77: Modifies the same Electron app setup file (apps/desktop/src/lib/electron-app/factories/app/setup.ts) for CDP configuration integration.
  • PR #101: Touches both setup.ts and package.json for Electron desktop changes, sharing configuration infrastructure.
  • PR #979: Introduces or refactors MCP tool registration patterns and ToolContext types directly related to this tool registry design.

Poem

🐰 A rabbit bounds through the Electron maze,
With Puppeteer strings in a digital blaze;
Click, type, inspect the DOM with delight,
Nine shiny tools make automation bright!
The MCP server hops, ready to play—
Desktop control in a hop, skip, and sway! 🎪

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (87 files):

⚔️ .github/workflows/deploy-preview.yml (content)
⚔️ .github/workflows/deploy-production.yml (content)
⚔️ .mcp.json (content)
⚔️ .superset/setup.sh (content)
⚔️ .superset/teardown.sh (content)
⚔️ apps/api/src/app/api/github/install/route.ts (content)
⚔️ apps/api/src/app/api/integrations/linear/connect/route.ts (content)
⚔️ apps/api/src/app/api/integrations/slack/connect/route.ts (content)
⚔️ apps/api/src/app/api/integrations/slack/events/process-app-home-opened/process-app-home-opened.ts (content)
⚔️ apps/api/src/app/api/integrations/slack/events/process-assistant-message/process-assistant-message.ts (content)
⚔️ apps/api/src/app/api/integrations/slack/events/process-mention/process-mention.ts (content)
⚔️ apps/api/src/app/api/integrations/slack/events/utils/run-agent/run-agent.ts (content)
⚔️ apps/api/src/app/api/integrations/slack/link/route.ts (content)
⚔️ apps/api/src/env.ts (content)
⚔️ apps/desktop/package.json (content)
⚔️ apps/desktop/scripts/patch-dev-protocol.ts (content)
⚔️ apps/desktop/src/lib/electron-app/factories/app/setup.ts (content)
⚔️ apps/desktop/src/lib/trpc/routers/ports/ports.ts (content)
⚔️ apps/desktop/src/lib/trpc/routers/projects/projects.ts (content)
⚔️ apps/desktop/src/lib/trpc/routers/terminal/terminal.ts (content)
⚔️ apps/desktop/src/main/lib/terminal/daemon/daemon-manager.ts (content)
⚔️ apps/desktop/src/main/lib/terminal/env.test.ts (content)
⚔️ apps/desktop/src/main/lib/terminal/env.ts (content)
⚔️ apps/desktop/src/main/lib/terminal/session.ts (content)
⚔️ apps/desktop/src/main/lib/terminal/types.ts (content)
⚔️ apps/desktop/src/renderer/react-query/projects/useReorderProjects.ts (content)
⚔️ apps/desktop/src/renderer/react-query/workspaces/useOpenExternalWorktree.ts (content)
⚔️ apps/desktop/src/renderer/react-query/workspaces/useOpenWorktree.ts (content)
⚔️ apps/desktop/src/renderer/react-query/workspaces/useReorderWorkspaces.ts (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/_dashboard/project/$projectId/page.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/_dashboard/tasks/$taskId/components/PropertiesSidebar/PropertiesSidebar.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/_dashboard/tasks/components/TasksView/hooks/useTasksTable/components/AssigneeCell/AssigneeCell.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/_dashboard/tasks/components/TasksView/hooks/useTasksTable/components/PriorityCell/PriorityCell.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/_dashboard/tasks/components/TasksView/hooks/useTasksTable/components/StatusCell/StatusCell.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/providers/CollectionsProvider/collections.ts (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/settings/components/SettingsSidebar/ProjectsSettings.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/settings/layout.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/settings/project/$projectId/components/ProjectSettings/ProjectSettings.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/settings/project/$projectId/components/ProjectSettings/index.ts (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/settings/project/$projectId/page.tsx (content)
⚔️ apps/desktop/src/renderer/routes/_authenticated/settings/utils/settings-search/settings-search.ts (content)
⚔️ apps/desktop/src/renderer/screens/main/components/WorkspaceInitEffects.tsx (content)
⚔️ apps/desktop/src/renderer/screens/main/components/WorkspaceSidebar/PortsList/hooks/usePortsData.ts (content)
⚔️ apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/TabsContent/TabView/index.tsx (content)
⚔️ apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/TabsContent/TabView/mosaic-theme.css (content)
⚔️ apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/TabsContent/Terminal/hooks/useTerminalConnection.ts (content)
⚔️ apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/TabsContent/Terminal/types.ts (content)
⚔️ apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx (content)
⚔️ apps/desktop/src/renderer/stores/settings-state.ts (content)
⚔️ apps/desktop/src/renderer/stores/theme/utils/index.ts (content)
⚔️ apps/desktop/src/shared/env.shared.ts (content)
⚔️ apps/desktop/src/shared/tabs-types.ts (content)
⚔️ apps/desktop/src/shared/utils/branch.test.ts (content)
⚔️ apps/desktop/src/shared/utils/branch.ts (content)
⚔️ apps/desktop/src/shared/worktree-id.ts (content)
⚔️ apps/docs/src/env.ts (content)
⚔️ apps/electric-proxy/src/index.ts (content)
⚔️ apps/electric-proxy/wrangler.jsonc (content)
⚔️ apps/marketing/src/lib/blog-utils.ts (content)
⚔️ apps/marketing/src/lib/blog.ts (content)
⚔️ apps/marketing/src/lib/changelog-utils.ts (content)
⚔️ apps/marketing/src/lib/changelog.ts (content)
⚔️ apps/marketing/src/lib/compare-utils.ts (content)
⚔️ apps/marketing/src/lib/compare.ts (content)
⚔️ apps/mobile/lib/collections/collections.ts (content)
⚔️ apps/web/src/app/(dashboard)/settings/billing/page.tsx (content)
⚔️ bun.lock (content)
⚔️ packages/db/drizzle/meta/0018_snapshot.json (content)
⚔️ packages/db/drizzle/meta/_journal.json (content)
⚔️ packages/db/src/schema/enums.ts (content)
⚔️ packages/db/src/schema/index.ts (content)
⚔️ packages/db/src/schema/relations.ts (content)
⚔️ packages/db/src/schema/schema.ts (content)
⚔️ packages/db/src/utils/index.ts (content)
⚔️ packages/local-db/drizzle/meta/_journal.json (content)
⚔️ packages/local-db/src/schema/schema.ts (content)
⚔️ packages/mcp/package.json (content)
⚔️ packages/mcp/src/tools/devices/start-claude-session/start-claude-session.ts (content)
⚔️ packages/shared/package.json (content)
⚔️ packages/shared/src/constants.ts (content)
⚔️ packages/trpc/src/env.ts (content)
⚔️ packages/trpc/src/root.ts (content)
⚔️ packages/trpc/src/router/integration/utils.ts (content)
⚔️ packages/trpc/src/router/organization/organization.ts (content)
⚔️ packages/trpc/src/router/task/schema.ts (content)
⚔️ packages/trpc/src/router/task/task.ts (content)
⚔️ packages/trpc/src/router/user/user.ts (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: migrating from Express HTTP to CDP via puppeteer-core, which aligns with the PR's core objective and the substantial refactoring work shown in the changeset.
Description check ✅ Passed The PR description comprehensively covers all required template sections: a detailed Summary, Key Changes, and a complete Test plan. It provides clear context on the architecture change, specific implementation details, and testing coverage.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch remote-control-electron
⚔️ Resolve merge conflicts (beta)
  • Auto-commit resolved conflicts to branch remote-control-electron
  • Create stacked PR with resolved conflicts
  • Post resolved changes as copyable diffs in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 14, 2026

🧹 Preview Cleanup Complete

The following preview resources have been cleaned up:

  • ✅ Neon database branch
  • ⚠️ Electric Fly.io app
  • ⚠️ Streams Fly.io app

Thank you for your contribution! 🎉

@saddlepaddle saddlepaddle changed the title refactor(desktop-mcp): migrate from Express HTTP to CDP via puppeteer-core Superset Electron MCP built on Puppeteer Feb 14, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🤖 Fix all issues with AI agents
In `@apps/desktop/src/lib/electron-app/factories/app/setup.ts`:
- Around line 86-91: Replace the raw process.env usage with the validated env
import and update the comment: use env.DESKTOP_AUTOMATION_PORT (already
validated/coerced in env.shared) instead of process.env.DESKTOP_AUTOMATION_PORT
when building cdpPort in the setup block that calls
app.commandLine.appendSwitch, and change the comment text from "playwright-core"
to "puppeteer-core" to reflect the correct automation library.

In `@packages/desktop-mcp/src/mcp/connection/connection-manager.ts`:
- Around line 21-27: The getPage() method has a race where concurrent calls can
both detect no page and call connect() concurrently; introduce a connection
guard (e.g., this.connectingPromise) inside the ConnectionManager so that
getPage() sets this.connectingPromise = this.connect() when no existing promise
exists, awaits that single promise for all callers, and then clears the guard on
success or failure; ensure you still call this.focusLock.inject(this.page) after
the awaited promise resolves and clear the connectingPromise in a finally block
to avoid stalling future attempts.

In `@packages/desktop-mcp/src/mcp/console-capture/console-capture.ts`:
- Around line 33-48: getLogs currently assigns filtered = this.logs which
returns a direct reference to the internal array; change getLogs (the function
named getLogs returning ConsoleLogEntry[]) to always return a copy instead of
the original (e.g., initialize filtered with a shallow copy like
this.logs.slice() or [...this.logs]) so callers never receive a reference to the
internal this.logs; keep the existing filtering by level and slicing for limit
but ensure the initial copy is used.

In `@packages/desktop-mcp/src/mcp/dom-inspector/dom-inspector.ts`:
- Around line 45-47: The code builds cssSelector with the raw testId which can
break the attribute selector when testId contains quotes/backslashes; update the
branch that sets cssSelector (the else if handling testId) to use
CSS.escape(testId) the same way node.id is handled (i.e., build the selector
using '[data-testid="' + CSS.escape(testId) + '"]'). This ensures the generated
selector is properly escaped and robust.

In `@packages/desktop-mcp/src/mcp/focus-lock/focus-lock.ts`:
- Around line 57-65: The "load" event handler in attach (page.on("load", async
() => { ... })) can cause unhandled promise rejections because this.inject(page)
is async and may throw; update the handler to catch and handle errors from
inject(page) (either wrap the body in try/catch or call
this.inject(page).catch(...)) and log or swallow the error appropriately,
keeping the existing semantics of setting this.locked = false and re-injecting
when this.timeout is truthy; reference the attach method, the page.on("load",
...) handler, and the inject(page) call.

In `@packages/desktop-mcp/src/mcp/tools/click/click.ts`:
- Around line 14-16: The selector building in the branch using testId
interpolates the raw testId into a CSS selector
(document.querySelectorAll('[data-testid="' + testId + '"]')[index]) which
breaks for values containing quotes and other special characters; update that
branch in click.ts to escape testId using CSS.escape (or otherwise safely build
the attribute selector) before passing it to document.querySelectorAll so the
selector becomes robust for values like foo"bar, and keep using the same index
lookup logic.

In `@packages/desktop-mcp/src/mcp/tools/evaluate-js/evaluate-js.ts`:
- Around line 14-39: The handler that calls page.evaluate may set
content[0].text to the primitive undefined when the evaluated code returns
undefined; update the logic in the async (args) => { ... } block (the code that
computes result from page.evaluate) so content[0].text is always a string—e.g.
treat undefined explicitly (map undefined to the string "undefined"), otherwise
keep strings as-is and JSON.stringify non-string values; change the expression
that builds text (the ternary using typeof result and JSON.stringify) to first
check for result === undefined and convert it to "undefined" before falling back
to JSON.stringify for objects.

In `@packages/desktop-mcp/src/mcp/tools/navigate/navigate.ts`:
- Around line 21-22: The code passes args.url directly into page.goto causing
potential SSRF/unsafe schemes; update the navigate logic that uses args.url and
page.goto to validate and sanitize the URL first: parse args.url with the URL
constructor (or equivalent), ensure the protocol is either "http:" or "https:",
reject or throw/log and skip navigation for any other scheme (e.g., file:,
javascript:, data:), and only call page.goto when validation succeeds so
navigation is restricted to safe http(s) targets.

In `@packages/desktop-mcp/src/mcp/tools/type-text/type-text.ts`:
- Around line 29-33: The clear-first block uses the macOS-only "Meta" modifier
so selection fails on Windows/Linux; update the args.clearFirst logic to pick
the modifier based on platform (e.g., use 'Meta' when process.platform ===
"darwin" and 'Control' otherwise) and then call page.keyboard.down(modifierKey),
page.keyboard.press("a"), page.keyboard.up(modifierKey) using the chosen
modifier; ensure you reference the existing args.clearFirst check and the
existing page.keyboard.down/press/up calls when making the change.
🧹 Nitpick comments (10)
packages/desktop-mcp/src/mcp/tools/navigate/navigate.ts (1)

10-16: Input schema allows both url and path simultaneously — consider requiring exactly one.

Both fields are optional, and providing both silently ignores path. A Zod refinement (or z.union/z.discriminatedUnion) would make the intent explicit and prevent ambiguous inputs.

packages/desktop-mcp/src/mcp/tools/send-keys/send-keys.ts (2)

56-89: args.keys is typed as unknown — prefer narrowing over repeated as string[] casts.

The args parameter from the MCP SDK callback is loosely typed. Instead of casting args.keys as string[] in two places (Lines 58 and 85), parse/narrow once at the top of the handler and reuse the typed result throughout.

Proposed fix
 		async (args) => {
-			const page = await getPage();
-			const keys = (args.keys as string[]).map(normalizeKey);
+			const page = await getPage();
+			const rawKeys = args.keys as string[];
+			const keys = rawKeys.map(normalizeKey);
 
 			const modifiers = keys.filter((k) => MODIFIER_KEYS.has(k));
 			const nonModifiers = keys.filter((k) => !MODIFIER_KEYS.has(k));
 			...
 			return {
 				content: [
 					{
 						type: "text" as const,
-						text: `Sent keys: ${(args.keys as string[]).join("+")}`,
+						text: `Sent keys: ${rawKeys.join("+")}`,
 					},
 				],
 			};

77-78: Array.prototype.reverse() mutates in-place — safe here but consider .toReversed().

modifiers.reverse() mutates the modifiers array. It's safe because modifiers isn't used afterwards, but .toReversed() (ES2023) would make the intent clearer and prevent subtle bugs if code is later added below.

packages/desktop-mcp/src/mcp/console-capture/console-capture.ts (1)

18-31: Repeated attach() calls will stack duplicate listeners on the same ConsoleCapture instance.

Since ConsoleCapture is a singleton on ConnectionManager (see connection-manager.ts Line 18), each reconnect calls attach() again. While the old Page object is discarded, there's no guard against accidental double-attach on the same page. Consider adding a guard or removing the previous listener.

Also, this.logs.shift() on Line 29 is O(n) per insertion when at capacity. For a 500-item cap this is likely fine in practice, but if the cap ever increases, a circular buffer would be preferable.

Suggested guard against re-attachment
 export class ConsoleCapture {
 	private logs: ConsoleLogEntry[] = [];
 	private maxSize = 500;
+	private attached = false;
 
 	attach(page: Page) {
+		if (this.attached) return;
+		this.attached = true;
 		page.on("console", (msg: ConsoleMessage) => {

Actually, since reconnects produce new Page instances and you want to capture from the new page, a better approach might be to track the current page and re-wire:

Alternative: re-wire to new page
 export class ConsoleCapture {
 	private logs: ConsoleLogEntry[] = [];
 	private maxSize = 500;
+	private handler: ((msg: ConsoleMessage) => void) | null = null;
+	private currentPage: Page | null = null;

 	attach(page: Page) {
+		// Detach from previous page if any
+		if (this.currentPage && this.handler) {
+			this.currentPage.off("console", this.handler);
+		}
+		this.currentPage = page;
-		page.on("console", (msg: ConsoleMessage) => {
+		this.handler = (msg: ConsoleMessage) => {
 			const level = LEVEL_MAP[msg.type()] ?? 1;
 			...
-		});
+		};
+		page.on("console", this.handler);
 	}
packages/desktop-mcp/src/mcp/focus-lock/focus-lock.ts (1)

85-89: Silent catch — add a debug log per project conventions.

The catch block silently swallows errors. Per project guidelines, errors should at minimum be logged with context. This aids debugging when the page is unexpectedly destroyed.

Proposed fix
 		try {
 			await page.evaluate(UNLOCK_SCRIPT);
-		} catch {
-			// page may have navigated or been destroyed
+		} catch (err) {
+			console.debug("[focus-lock/unlock] page may have navigated or been destroyed", err);
 		}

Based on learnings: "Use prefixed console logging with consistent context pattern: [domain/operation] message for entry/exit of significant operations, external API calls, and error conditions"

packages/desktop-mcp/src/mcp/connection/connection-manager.ts (1)

29-58: Good structure for the connection lifecycle.

The lazy-connect + disconnect-handler pattern is clean. The page filtering (skipping chrome-extension://) and re-injection of ConsoleCapture/FocusLock on connect are well thought out.

Two minor observations:

  1. Magic numbers: 60_000 (Line 32) could be a named constant (e.g., PROTOCOL_TIMEOUT_MS) alongside the existing CDP_PORT for consistency.
  2. Error context on connect failure: If puppeteer.connect fails (Electron not running), the raw puppeteer error may be opaque. Consider wrapping with a [desktop-mcp/connect] prefixed message.
packages/desktop-mcp/src/zod.ts (1)

41-45: Type asymmetry between ConsoleLogsRequestSchema.level (string enum) and ConsoleLogEntrySchema.level (number).

The request schema uses z.enum(["log", "warn", "error", "info", "debug"]) for level, while ConsoleLogEntrySchema (Line 111) stores level as z.number(). The tool handler must map between these representations. This is workable, but could be a source of bugs if a new level string is added to the request enum without updating the numeric mapping in console-capture.ts.

Consider co-locating the level mapping (e.g., a shared LEVEL_MAP or a shared enum) so changes stay in sync.

packages/desktop-mcp/src/bin.ts (1)

5-7: Add error handling and graceful shutdown.

A bare top-level await server.connect(transport) will produce an unhandled rejection with no context if the connection fails. There's also no cleanup on process signals, so CDP connections may leak when the process is killed.

Suggested improvement
-const server = createMcpServer();
-const transport = new StdioServerTransport();
-await server.connect(transport);
+const server = createMcpServer();
+const transport = new StdioServerTransport();
+
+process.on("SIGINT", async () => {
+	await server.close();
+	process.exit(0);
+});
+process.on("SIGTERM", async () => {
+	await server.close();
+	process.exit(0);
+});
+
+await server.connect(transport).catch((err) => {
+	console.error("[desktop-mcp/bin] failed to start:", err);
+	process.exit(1);
+});

As per coding guidelines: "Never swallow errors silently; at minimum log errors with context before rethrowing or handling them explicitly" and "Use prefixed console logging with consistent context pattern: [domain/operation] message."

packages/desktop-mcp/src/mcp/tools/type-text/type-text.ts (1)

22-46: No error handling around page interactions.

Other tools like evaluate_js wrap page calls in try/catch and return isError: true on failure. This tool will throw unhandled if page.click(selector) fails (e.g., selector not found). Consider adding a try/catch for consistency with the other tools.

packages/desktop-mcp/src/mcp/tools/click/click.ts (1)

73-127: Missing error handling for the element-find-and-click path.

Unlike evaluate_js, this tool doesn't wrap page interactions in try/catch. If page.evaluate or page.mouse.click throws (e.g., navigation mid-click, detached frame), the error surfaces as an unhandled exception rather than a structured MCP error response. Consider wrapping in try/catch and returning isError: true for consistency.

Comment on lines +86 to +91
// Enable CDP for desktop automation MCP (playwright-core connects via this port)
if (env.NODE_ENV === "development") {
const cdpPort = String(process.env.DESKTOP_AUTOMATION_PORT || 9223);
app.commandLine.appendSwitch("remote-debugging-port", cdpPort);
app.commandLine.appendSwitch("remote-allow-origins", "*");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use the parsed env object instead of raw process.env.

Line 88 reads process.env.DESKTOP_AUTOMATION_PORT directly, bypassing the Zod-validated env already imported on line 7 and used elsewhere in this file. This duplicates the default 9223 (already in env.shared.ts) and skips coercion/validation. Also, the comment says "playwright-core" but this PR uses puppeteer-core.

Proposed fix
-// Enable CDP for desktop automation MCP (playwright-core connects via this port)
+// Enable CDP for desktop automation MCP (puppeteer-core connects via this port)
 if (env.NODE_ENV === "development") {
-	const cdpPort = String(process.env.DESKTOP_AUTOMATION_PORT || 9223);
+	const cdpPort = String(env.DESKTOP_AUTOMATION_PORT);
 	app.commandLine.appendSwitch("remote-debugging-port", cdpPort);
 	app.commandLine.appendSwitch("remote-allow-origins", "*");
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Enable CDP for desktop automation MCP (playwright-core connects via this port)
if (env.NODE_ENV === "development") {
const cdpPort = String(process.env.DESKTOP_AUTOMATION_PORT || 9223);
app.commandLine.appendSwitch("remote-debugging-port", cdpPort);
app.commandLine.appendSwitch("remote-allow-origins", "*");
}
// Enable CDP for desktop automation MCP (puppeteer-core connects via this port)
if (env.NODE_ENV === "development") {
const cdpPort = String(env.DESKTOP_AUTOMATION_PORT);
app.commandLine.appendSwitch("remote-debugging-port", cdpPort);
app.commandLine.appendSwitch("remote-allow-origins", "*");
}
🤖 Prompt for AI Agents
In `@apps/desktop/src/lib/electron-app/factories/app/setup.ts` around lines 86 -
91, Replace the raw process.env usage with the validated env import and update
the comment: use env.DESKTOP_AUTOMATION_PORT (already validated/coerced in
env.shared) instead of process.env.DESKTOP_AUTOMATION_PORT when building cdpPort
in the setup block that calls app.commandLine.appendSwitch, and change the
comment text from "playwright-core" to "puppeteer-core" to reflect the correct
automation library.

Comment on lines +21 to +27
async getPage(): Promise<Page> {
if (this.page && this.browser?.connected) {
await this.focusLock.inject(this.page);
return this.page;
}
return this.connect();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Race condition: concurrent getPage() calls can trigger multiple connect() invocations.

If two MCP tool calls arrive simultaneously while this.page is null, both will pass the check on Line 22 and call connect() concurrently. This results in duplicate CDP connections, with the second overwriting this.browser/this.page and orphaning the first connection's disconnect handler.

A simple connection promise guard resolves this:

Proposed fix
 export class ConnectionManager {
 	private browser: Browser | null = null;
 	private page: Page | null = null;
+	private connecting: Promise<Page> | null = null;

 	readonly consoleCapture = new ConsoleCapture();
 	readonly focusLock = new FocusLock();

 	async getPage(): Promise<Page> {
 		if (this.page && this.browser?.connected) {
 			await this.focusLock.inject(this.page);
 			return this.page;
 		}
-		return this.connect();
+		if (!this.connecting) {
+			this.connecting = this.connect().finally(() => {
+				this.connecting = null;
+			});
+		}
+		return this.connecting;
 	}
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/connection/connection-manager.ts` around lines
21 - 27, The getPage() method has a race where concurrent calls can both detect
no page and call connect() concurrently; introduce a connection guard (e.g.,
this.connectingPromise) inside the ConnectionManager so that getPage() sets
this.connectingPromise = this.connect() when no existing promise exists, awaits
that single promise for all callers, and then clears the guard on success or
failure; ensure you still call this.focusLock.inject(this.page) after the
awaited promise resolves and clear the connectingPromise in a finally block to
avoid stalling future attempts.

Comment on lines +33 to +48
getLogs({
level,
limit,
}: {
level?: number;
limit?: number;
}): ConsoleLogEntry[] {
let filtered = this.logs;
if (level !== undefined) {
filtered = filtered.filter((log) => log.level === level);
}
if (limit !== undefined) {
filtered = filtered.slice(-limit);
}
return filtered;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

getLogs returns a direct reference to the internal array when no filters are applied.

On Line 40, let filtered = this.logs — when neither level nor limit is provided, the caller receives a direct reference to the internal logs array. External mutations would corrupt internal state.

Proposed fix
 	getLogs({
 		level,
 		limit,
 	}: {
 		level?: number;
 		limit?: number;
 	}): ConsoleLogEntry[] {
-		let filtered = this.logs;
+		let filtered: ConsoleLogEntry[] = this.logs;
 		if (level !== undefined) {
 			filtered = filtered.filter((log) => log.level === level);
 		}
 		if (limit !== undefined) {
 			filtered = filtered.slice(-limit);
 		}
+		// Return a copy if no filter was applied to avoid leaking internal state
+		return filtered === this.logs ? [...filtered] : filtered;
-		return filtered;
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
getLogs({
level,
limit,
}: {
level?: number;
limit?: number;
}): ConsoleLogEntry[] {
let filtered = this.logs;
if (level !== undefined) {
filtered = filtered.filter((log) => log.level === level);
}
if (limit !== undefined) {
filtered = filtered.slice(-limit);
}
return filtered;
}
getLogs({
level,
limit,
}: {
level?: number;
limit?: number;
}): ConsoleLogEntry[] {
let filtered: ConsoleLogEntry[] = this.logs;
if (level !== undefined) {
filtered = filtered.filter((log) => log.level === level);
}
if (limit !== undefined) {
filtered = filtered.slice(-limit);
}
// Return a copy if no filter was applied to avoid leaking internal state
return filtered === this.logs ? [...filtered] : filtered;
}
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/console-capture/console-capture.ts` around lines
33 - 48, getLogs currently assigns filtered = this.logs which returns a direct
reference to the internal array; change getLogs (the function named getLogs
returning ConsoleLogEntry[]) to always return a copy instead of the original
(e.g., initialize filtered with a shallow copy like this.logs.slice() or
[...this.logs]) so callers never receive a reference to the internal this.logs;
keep the existing filtering by level and slicing for limit but ensure the
initial copy is used.

Comment on lines +45 to +47
} else if (testId) {
cssSelector = '[data-testid="' + testId + '"]';
} else {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

testId value is not escaped in the CSS selector string.

If data-testid contains a double-quote or backslash, the generated selector [data-testid="..."] will be malformed. Use CSS.escape() for consistency with how node.id is handled on Line 44.

Proposed fix
-        cssSelector = '[data-testid="' + testId + '"]';
+        cssSelector = '[data-testid="' + CSS.escape(testId) + '"]';
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/dom-inspector/dom-inspector.ts` around lines 45
- 47, The code builds cssSelector with the raw testId which can break the
attribute selector when testId contains quotes/backslashes; update the branch
that sets cssSelector (the else if handling testId) to use CSS.escape(testId)
the same way node.id is handled (i.e., build the selector using '[data-testid="'
+ CSS.escape(testId) + '"]'). This ensures the generated selector is properly
escaped and robust.

Comment on lines +57 to +65
attach(page: Page) {
page.on("load", async () => {
this.locked = false;
if (this.timeout) {
// Re-inject if we were still in an active automation session
await this.inject(page);
}
});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Unhandled promise rejection in attach load handler.

The async callback on Line 58 calls this.inject(page) which can throw (e.g., if the page is mid-navigation). Since event handlers don't propagate async errors, this would produce an unhandled promise rejection.

Proposed fix
 	attach(page: Page) {
-		page.on("load", async () => {
+		page.on("load", () => {
 			this.locked = false;
 			if (this.timeout) {
-				// Re-inject if we were still in an active automation session
-				await this.inject(page);
+				this.inject(page).catch(() => {
+					// page may have navigated before injection completed
+				});
 			}
 		});
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
attach(page: Page) {
page.on("load", async () => {
this.locked = false;
if (this.timeout) {
// Re-inject if we were still in an active automation session
await this.inject(page);
}
});
}
attach(page: Page) {
page.on("load", () => {
this.locked = false;
if (this.timeout) {
this.inject(page).catch(() => {
// page may have navigated before injection completed
});
}
});
}
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/focus-lock/focus-lock.ts` around lines 57 - 65,
The "load" event handler in attach (page.on("load", async () => { ... })) can
cause unhandled promise rejections because this.inject(page) is async and may
throw; update the handler to catch and handle errors from inject(page) (either
wrap the body in try/catch or call this.inject(page).catch(...)) and log or
swallow the error appropriately, keeping the existing semantics of setting
this.locked = false and re-injecting when this.timeout is truthy; reference the
attach method, the page.on("load", ...) handler, and the inject(page) call.

Comment on lines +14 to +16
} else if (testId) {
el = document.querySelectorAll('[data-testid="' + testId + '"]')[index];
} else if (text) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

testId value is interpolated unsafely into a CSS selector string.

If testId contains a double quote (e.g., foo"bar), the selector breaks and could produce unexpected results. Use CSS.escape or build the selector safely.

Proposed fix
 	} else if (testId) {
-		el = document.querySelectorAll('[data-testid="' + testId + '"]')[index];
+		el = document.querySelectorAll('[data-testid="' + CSS.escape(testId) + '"]')[index];
 	} else if (text) {
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/click/click.ts` around lines 14 - 16, The
selector building in the branch using testId interpolates the raw testId into a
CSS selector (document.querySelectorAll('[data-testid="' + testId +
'"]')[index]) which breaks for values containing quotes and other special
characters; update that branch in click.ts to escape testId using CSS.escape (or
otherwise safely build the attribute selector) before passing it to
document.querySelectorAll so the selector becomes robust for values like
foo"bar, and keep using the same index lookup logic.

Comment on lines +14 to +39
async (args) => {
const page = await getPage();
try {
const result = await page.evaluate(args.code as string);
return {
content: [
{
type: "text" as const,
text:
typeof result === "string"
? result
: JSON.stringify(result, null, 2),
},
],
};
} catch (error) {
return {
content: [
{
type: "text" as const,
text: `Error: ${String(error)}`,
},
],
isError: true,
};
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

undefined result from page.evaluate will produce text: undefined.

When the evaluated code returns undefined (common for void expressions like assignments or console.log()), JSON.stringify(undefined, null, 2) returns undefined (the JS primitive), not the string "undefined". This means text will be undefined, which may violate the MCP protocol's expectation of a string value.

Proposed fix
 			const result = await page.evaluate(args.code as string);
 			return {
 				content: [
 					{
 						type: "text" as const,
 						text:
 							typeof result === "string"
 								? result
-								: JSON.stringify(result, null, 2),
+								: JSON.stringify(result, null, 2) ?? "undefined",
 					},
 				],
 			};
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/evaluate-js/evaluate-js.ts` around lines
14 - 39, The handler that calls page.evaluate may set content[0].text to the
primitive undefined when the evaluated code returns undefined; update the logic
in the async (args) => { ... } block (the code that computes result from
page.evaluate) so content[0].text is always a string—e.g. treat undefined
explicitly (map undefined to the string "undefined"), otherwise keep strings
as-is and JSON.stringify non-string values; change the expression that builds
text (the ternary using typeof result and JSON.stringify) to first check for
result === undefined and convert it to "undefined" before falling back to
JSON.stringify for objects.

Comment on lines +21 to +22
if (args.url) {
await page.goto(args.url as string);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential SSRF / unsafe navigation — validate the URL scheme.

args.url is passed directly to page.goto with no validation. A caller could supply file:///etc/passwd or javascript:... URLs. Consider restricting to http(s) schemes.

🛡️ Proposed fix
 		async (args) => {
 			const page = await getPage();

 			if (args.url) {
-				await page.goto(args.url as string);
+				const url = new URL(args.url as string);
+				if (!["http:", "https:"].includes(url.protocol)) {
+					return {
+						content: [{ type: "text" as const, text: `Unsupported protocol: ${url.protocol}` }],
+						isError: true,
+					};
+				}
+				await page.goto(args.url as string);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (args.url) {
await page.goto(args.url as string);
if (args.url) {
const url = new URL(args.url as string);
if (!["http:", "https:"].includes(url.protocol)) {
return {
content: [{ type: "text" as const, text: `Unsupported protocol: ${url.protocol}` }],
isError: true,
};
}
await page.goto(args.url as string);
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/navigate/navigate.ts` around lines 21 -
22, The code passes args.url directly into page.goto causing potential
SSRF/unsafe schemes; update the navigate logic that uses args.url and page.goto
to validate and sanitize the URL first: parse args.url with the URL constructor
(or equivalent), ensure the protocol is either "http:" or "https:", reject or
throw/log and skip navigation for any other scheme (e.g., file:, javascript:,
data:), and only call page.goto when validation succeeds so navigation is
restricted to safe http(s) targets.

Comment on lines +29 to +33
if (args.clearFirst) {
// Select all then type to replace
await page.keyboard.down("Meta");
await page.keyboard.press("a");
await page.keyboard.up("Meta");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Meta+A only selects all on macOS; Windows/Linux need Control+A.

This will silently fail to clear content on non-macOS platforms. Detect the platform or use a cross-platform approach.

Proposed fix
 		if (args.clearFirst) {
-			// Select all then type to replace
-			await page.keyboard.down("Meta");
-			await page.keyboard.press("a");
-			await page.keyboard.up("Meta");
+			// Select all then type to replace (cross-platform)
+			const modifier = process.platform === "darwin" ? "Meta" : "Control";
+			await page.keyboard.down(modifier);
+			await page.keyboard.press("a");
+			await page.keyboard.up(modifier);
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (args.clearFirst) {
// Select all then type to replace
await page.keyboard.down("Meta");
await page.keyboard.press("a");
await page.keyboard.up("Meta");
if (args.clearFirst) {
// Select all then type to replace (cross-platform)
const modifier = process.platform === "darwin" ? "Meta" : "Control";
await page.keyboard.down(modifier);
await page.keyboard.press("a");
await page.keyboard.up(modifier);
}
🤖 Prompt for AI Agents
In `@packages/desktop-mcp/src/mcp/tools/type-text/type-text.ts` around lines 29 -
33, The clear-first block uses the macOS-only "Meta" modifier so selection fails
on Windows/Linux; update the args.clearFirst logic to pick the modifier based on
platform (e.g., use 'Meta' when process.platform === "darwin" and 'Control'
otherwise) and then call page.keyboard.down(modifierKey),
page.keyboard.press("a"), page.keyboard.up(modifierKey) using the chosen
modifier; ensure you reference the existing args.clearFirst check and the
existing page.keyboard.down/press/up calls when making the change.

@saddlepaddle saddlepaddle merged commit 679f32c into main Feb 14, 2026
14 of 15 checks passed
@Kitenite Kitenite deleted the remote-control-electron branch February 27, 2026 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant