Host Browser Proxy — Phase 2 (Chrome extension as CDP proxy) #24159

Merged
noanflaherty merged 35 commits into main from noanflaherty/host-browser-proxy-phase-2
Apr 8, 2026

@noanflaherty noanflaherty commented Apr 7, 2026

Summary

Phase 2 of the Browser Use Architecture TDD. Refactors the Chrome extension into a thin chrome.debugger CDP JSON-RPC proxy and ships both cloud-hosted (extension → Vellum gateway) and self-hosted (extension → local daemon via Chrome Native Messaging) transports. Bundles the Phase 1 quick follow-ups (Swift host_browser decoders + sibling host-proxy defensive backports).

The legacy ExtensionCommand protocol stays alive throughout Phase 2 behind `vellum.cdpProxyEnabled` in chrome.storage.local. Phase 3 will flip the default and migrate user-facing browser tools to call hostBrowserProxy.request().

PRs merged into feature branch

Wave 1 (foundations, parallel)

Wave 2 (integration, parallel)

Wave 3 (auth flows, parallel)

Wave 4 (E2E smoke tests, parallel)

Non-goals (deferred to Phase 3+)

  • Migrating browser-execution.ts to call hostBrowserProxy.request() (Phase 3)
  • Deleting browser-extension-relay/protocol.ts and server.ts (Phase 3)
  • Chrome 146+ chrome://inspect attach backend (Phase 4)
  • Headless Playwright backend (Phase 5)
  • Cross-browser BiDi
  • Capability tokens at /v1/browser-relay WebSocket layer (Phase 3 — currently only JWTs accepted)

Part of plan: `.private/plans/host-browser-proxy-phase-2.md`


noanflaherty and others added 17 commits April 7, 2026 17:10
The main-branch release commit (#24108) bumped assistant/package.json to
0.6.2 but did not regenerate the openapi spec. Regenerate it on the feature
branch so CI's OpenAPI Spec Check passes for Phase 2 PRs.
…cel messages (#24113)

* feat(clients/macos): decode host_browser_request and host_browser_cancel messages

* fix: type HostBrowserRequest.timeoutSeconds as Double?

Matches the daemon's number-typed wire contract and mirrors
HostBashRequest.timeoutSeconds, so fractional timeouts like 0.01s don't
throw a type-mismatch and drop the whole host_browser_request event.
…ion backend stub (#24110)

* feat(browser-session): add BrowserSessionManager scaffold with extension backend stub

* test(browser-session): import public API via index.ts to satisfy knip

Updates manager.test.ts to consume BrowserSessionManager, createExtensionBackend,
and types through the public ../index.js entry point instead of deep-importing
../manager.js and ../backends/extension.js. This keeps knip happy during the
scaffold phase: index.ts becomes a transitively-reachable entry point from
src/**/__tests__/**/*.ts before any production module consumes it.

* fix(browser-session): enforce session existence in BrowserSessionManager.send

Throws when the caller passes a sessionId that doesn't exist or has
been disposed. Still advisory for single-backend Phase 2, but makes
disposeSession() an actual enforcement boundary so commands can't run
against stale ids once Phase 4 adds multi-backend routing.
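
A minimal sketch of the enforcement described in the last commit above, assuming a much simpler manager than the real one. The `BrowserBackend` and `CdpCommand` shapes and the id format are hypothetical and only illustrate the idea: `send` refuses any id that was never created or has been disposed.

```typescript
// Hypothetical shapes for illustration; the real module's types may differ.
interface CdpCommand {
  method: string;
  params?: Record<string, unknown>;
}

interface BrowserBackend {
  send(command: CdpCommand): unknown;
}

class BrowserSessionManager {
  private readonly sessions = new Set<string>();
  private readonly backend: BrowserBackend;
  private nextId = 1;

  constructor(backend: BrowserBackend) {
    this.backend = backend;
  }

  createSession(): string {
    const id = `session-${this.nextId++}`;
    this.sessions.add(id);
    return id;
  }

  disposeSession(id: string): void {
    this.sessions.delete(id);
  }

  // Reject unknown or disposed ids before touching the backend.
  send(id: string, command: CdpCommand): unknown {
    if (!this.sessions.has(id)) {
      throw new Error(`Unknown or disposed browser session: ${id}`);
    }
    return this.backend.send(command);
  }
}
```

With this shape, `disposeSession()` becomes a real boundary: a later `send()` with the same id throws instead of reaching the backend.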
* feat(chrome-extension): add standalone CDP proxy module

* fix(chrome-extension): inject runtime.lastError and thread sessionId through CDP proxy

- Add runtime.lastError to ChromeDebuggerApi so mocked tests can surface errors
- Fold frame.sessionId into sendCommand params for flat-session routing
- Extract sessionId from event params when building CdpEventFrame
- Document flat-session handling in the module docstring

* fix(chrome-extension): route flat-session sessionId through DebuggerSession target

Chrome 125+ debugger.sendCommand takes sessionId on the target argument
(DebuggerSession), not inside commandParams. Switch back to passing
sessionId on the target. Same change on the onEvent listener — read
sessionId from 'source' rather than params, since flat-session events
surface it on the source.

Also clean up the module docstring to drop PR-level narrative per
clients/AGENTS.md's comment quality rule.
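
The flat-session routing described above can be sketched as follows. This is an illustration under assumptions, not the module's code: `CdpRequestFrame`, `DebuggerApi`, and the event frame shape are hypothetical, and `DebuggerSession` is a local mirror of the Chrome type so the sketch runs outside the browser.

```typescript
// Local mirror of the relevant fields of chrome.debugger's DebuggerSession.
interface DebuggerSession {
  tabId?: number;
  sessionId?: string;
}

// Hypothetical request frame shape.
interface CdpRequestFrame {
  tabId: number;
  method: string;
  params?: Record<string, unknown>;
  sessionId?: string;
}

// Injected, simplified subset of chrome.debugger (the real call is asynchronous).
interface DebuggerApi {
  sendCommand(target: DebuggerSession, method: string, params?: Record<string, unknown>): void;
}

function toTarget(frame: CdpRequestFrame): DebuggerSession {
  return frame.sessionId === undefined
    ? { tabId: frame.tabId }
    : { tabId: frame.tabId, sessionId: frame.sessionId };
}

function dispatch(api: DebuggerApi, frame: CdpRequestFrame): void {
  // sessionId rides on the target argument, never inside the command params.
  api.sendCommand(toTarget(frame), frame.method, frame.params);
}

// For events, the child session id is read from `source`, not from `params`.
function toEventFrame(source: DebuggerSession, method: string, params?: object) {
  return { type: "cdp_event" as const, method, params, sessionId: source.sessionId };
}
```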

* fix(chrome-extension): bind defaultChromeDebuggerApi methods to chrome.debugger

Returning methods from a Proxy via Reflect.get without binding causes
'Illegal invocation' at runtime because Chrome's native bindings check
this against the original chrome.debugger object. Replace the Proxy with
a plain object whose methods are explicitly bound.
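
To illustrate why the unbound Proxy failed, here is a small sketch that uses a stand-in object in place of `chrome.debugger`. The stand-in checks its receiver the way a native binding would; the names `nativeDebugger`, `leaky`, and `bound` are hypothetical.

```typescript
interface DebuggerLike {
  attach(tabId: number): string;
}

// Stand-in for a native binding that validates `this`.
const nativeDebugger: DebuggerLike = {
  attach(tabId: number): string {
    if (this !== nativeDebugger) throw new TypeError("Illegal invocation");
    return `attached:${tabId}`;
  },
};

// Returning the raw method means it is later called with `this` set to the proxy.
const leaky = new Proxy(nativeDebugger, {
  get: (target, prop, receiver) => Reflect.get(target, prop, receiver),
});

// A plain object with explicitly bound methods keeps the original receiver.
const bound: DebuggerLike = {
  attach: nativeDebugger.attach.bind(nativeDebugger),
};
```

Calling `leaky.attach(1)` throws because the receiver is the proxy, while `bound.attach(1)` succeeds because `bind` pins the receiver to the original object.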
…old (#24114)

* feat(chrome-extension-native-host): add native messaging helper scaffold

* fix(chrome-extension-native-host): robust port discovery, JSON error handling, and assistant terminology

- Add --assistant-port CLI arg so Chrome-spawned helpers can be pointed
  at a non-default port when the lockfile isn't present
- Surface malformed stdin JSON as a protocol-level error frame instead
  of a silent crash
- Rename user-facing 'daemon' to 'assistant' in error messages per
  AGENTS.md terminology rule

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(chrome-extension-native-host): finish daemon→assistant rename in client prose, vars, and smoke test

- README section header and prose use 'assistant' (per root AGENTS.md §139)
- DEFAULT_DAEMON_PORT → DEFAULT_ASSISTANT_PORT, resolveDaemonPort → resolveAssistantPort (per clients/AGENTS.md §403-404)
- Smoke test example uses dynamic import() instead of require() since the package is ESM

* fix(chrome-extension-native-host): flush stdout before exiting

Wait for process.stdout.write callback to fire before calling
process.exit(), so the native-messaging frame actually reaches Chrome
on pipe-backed stdout before the process terminates. Without this,
Chrome can see a disconnect instead of the intended token_response
or error frame under backpressure or larger payloads.
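
A rough sketch of the flush-then-exit pattern, with the writer and exit function injected so it can run anywhere. The framing assumes the Chrome native messaging format (UTF-8 JSON preceded by a 32-bit length in native byte order, taken as little-endian here); the `Write` type and this particular `writeFrameAndExit` signature are assumptions, not the helper's actual API.

```typescript
// Encode one native-messaging frame: 4-byte length prefix, then UTF-8 JSON.
function encodeFrame(message: unknown): Uint8Array {
  const body = new TextEncoder().encode(JSON.stringify(message));
  const frame = new Uint8Array(4 + body.length);
  new DataView(frame.buffer).setUint32(0, body.length, true); // little-endian assumed
  frame.set(body, 4);
  return frame;
}

type Write = (chunk: Uint8Array, done: (err?: Error | null) => void) => unknown;

// Exit only from the write callback, so the frame has been handed off first.
function writeFrameAndExit(
  message: unknown,
  write: Write,
  exit: (code: number) => void,
  code: number = 0,
): void {
  write(encodeFrame(message), (err) => exit(err ? 1 : code));
}
```

In a Node helper this could plausibly be wired as `writeFrameAndExit(msg, (chunk, done) => process.stdout.write(chunk, done), (c) => process.exit(c))`. Callers should also `return` right after invoking it so no later code runs while the write is pending.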

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(chrome-extension): add cloud OAuth sign-in skeleton

* fix(chrome-extension): run OAuth sign-in from service worker and validate guardianId

- Popup now sends a message to the background worker to initiate cloud
  sign-in instead of running launchWebAuthFlow directly. This avoids
  the MV3 popup teardown race where the awaited OAuth promise never
  resolves if the popup blurs during the auth window.
- Add guardianId type check to getStoredToken so malformed stored
  tokens can't leak 'Signed in as guardian:undefined' into the popup UI.
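
The guardianId check could look roughly like the type guard below. The `StoredToken` shape and the `signedInLabel` helper are hypothetical; only the `guardianId` field and the popup label text come from the commit message.

```typescript
// Hypothetical stored-token shape.
interface StoredToken {
  token: string;
  guardianId: string;
}

function isStoredToken(value: unknown): value is StoredToken {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const token = record["token"];
  const guardianId = record["guardianId"];
  return (
    typeof token === "string" && token.length > 0 &&
    typeof guardianId === "string" && guardianId.length > 0
  );
}

// Returns the popup label, or null when the stored value is unusable.
function signedInLabel(raw: unknown): string | null {
  return isStoredToken(raw) ? `Signed in as guardian:${raw.guardianId}` : null;
}
```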
…host proxy gating (#24111)

* feat(channels): add chrome-extension interface id and per-capability host proxy gating

* fix(channels): keep hostBrowserProxy available for non-interactive chrome-extension interfaces

updateClient/drain-queue paths used !isInteractive as a proxy for
hasNoClient, which incorrectly marks the chrome-extension's
hostBrowserProxy unavailable immediately after construction.
Decouple the flags: chrome-extension is non-interactive (no prompter
UI) but still has a connected client for host_browser_request events.

- conversation-routes.ts: derive hasNoClient as !(isInteractive || supportsHostProxy(sourceInterface, 'host_browser'))
- server.ts persistAndProcessMessage: same pattern so queued sends don't lose availability
- conversation-process.ts drain queue: add restore path via new Conversation.restoreBrowserProxyAvailability() helper
- conversation.ts: add restoreBrowserProxyAvailability() that re-enables only the browser proxy (gated on hasNoClient)
- channels/types.ts: clarify supportsHostProxy no-arg JSDoc to call out the desktop-only semantics
- conversation-confirmation-signals.test.ts: cover the new restore helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): targeted hostBrowserProxy enable without relaxing hasNoClient

Cycle 1 derived hasNoClient as !(isInteractive || supportsHostProxy(id, 'host_browser')) to
keep the chrome-extension's browser proxy available. That inadvertently made tool gating treat
the conversation as fully interactive (isInteractive derives from !ctx.hasNoClient), enabling
host_bash/host_file tools that chrome-extension can't service.

Revert to the literal hasNoClient = !isInteractive and instead call a targeted
restoreBrowserProxyAvailability() after updateClient. The helper now enables the browser
proxy regardless of hasNoClient so the single-proxy chrome-extension turn works without
leaking host_bash/host_file tool availability.

Part of JARVIS-1175

* fix(channels): drop 'historically' from JSDoc and tighten chrome-extension else-if in server.ts

- assistant/AGENTS.md: comments describe current state, not history
- server.ts: scope the non-interactive host-browser restore branch to interfaces that
  specifically only support host_browser (not macos, which hits the interactive branch)

* test: add restoreBrowserProxyAvailability to Conversation mocks

Two test files use object-literal mocks for Conversation that need the
new method so they don't throw TypeError at the new call site in
handleSendMessage.

* fix(routes): optional-chain restoreBrowserProxyAvailability for test mocks

* test: allowlist chrome-extension-native-host in gateway-only guard

The native messaging helper intentionally POSTs to the local daemon's
/v1/browser-extension-pair endpoint on 127.0.0.1 to mint capability
tokens for the extension; it's a bootstrap path that cannot and should
not go through the gateway. Add it to the guard-test allowlist.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on clients (#24129)

* feat(runtime): route host_browser_request to connected chrome-extension clients

* fix(runtime): gateway guardianId plumbing + queue-drain-safe chrome-extension sender

- handleBrowserRelayUpgrade now looks for x-guardian-id header/query param as a
  fallback when the JWT sub is a service token (gateway-forwarded case)
- Conversation exposes hostBrowserSenderOverride so restoreBrowserProxyAvailability
  preserves the registry-routed sender on drain-queue restores instead of clobbering
  it with the SSE hub sender
…proxy behind feature flag (#24125)

* feat(chrome-extension): dispatch host_browser_request frames via CDP proxy behind feature flag

* fix(chrome-extension): use camelCase wire format, tolerate re-attach, guard postResult catch

- Match daemon's actual host_browser_request envelope shape (requestId, cdpMethod,
  cdpParams, cdpSessionId — only timeout_seconds stays snake_case)
- POST /v1/host-browser-result with camelCase keys to match the runtime schema
- Track attached CDP targets and skip re-attach; dispose clears the set
- Wrap postResult calls inside the catch handler so a secondary failure is logged
  instead of becoming an unhandled rejection

* fix(chrome-extension): invalidate attachedTargets cache on debugger detach

Subscribe to CdpProxy.onDetach in the dispatcher and remove the
corresponding key from the attached-targets cache when Chrome notifies
us of a detach (tab close, navigation, infobar cancel, external
takeover). Without this, the cache held a stale entry forever and
subsequent commands skipped the re-attach, causing permanent CDP
failures.
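
A simplified sketch of the cache invalidation described above. The class name, the string target key, and the synchronous `attach` callback are assumptions made so the example stays small; the real attach call is asynchronous.

```typescript
class AttachedTargetCache {
  private readonly attached = new Set<string>();
  private readonly attach: (targetKey: string) => void;

  constructor(attach: (targetKey: string) => void) {
    this.attach = attach;
  }

  // Attaches only when the target is not already cached. Returns true if it attached.
  ensureAttached(targetKey: string): boolean {
    if (this.attached.has(targetKey)) return false;
    this.attach(targetKey);
    this.attached.add(targetKey);
    return true;
  }

  // Wire this to the proxy's detach notification so stale entries are evicted.
  handleDetach(targetKey: string): void {
    this.attached.delete(targetKey);
  }
}
```

Without `handleDetach`, the second `ensureAttached` call after a tab closes would skip the attach and every later command would fail.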
…nt (#24130)

* feat(runtime): add /v1/browser-extension-pair capability token endpoint

* fix(runtime): align pair endpoint with native helper contract + move secret out of workspace

- Accept extensionOrigin (preferred) and origin (legacy) in request body
- Return expiresAt as ISO 8601 string instead of numeric ms, matching what the
  chrome-extension-native-host helper validates
- Move capabilityTokenSecret out of workspace/data into protected storage alongside
  the actor-token-signing-key per AGENTS.md workspace-isolation rule
- Fix isLoopbackHostHeader to correctly parse IPv6 bracket notation

* fix(runtime): align pair allowlist with native helper + reject malformed bracketed Host headers

- ALLOWED_EXTENSION_ORIGINS now matches the chrome-extension-native-host
  placeholder so the dev pair flow works end-to-end
- parseHostHeader rejects inputs like '[::1]attacker.com' where content
  after the closing bracket is not an optional ':port'
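
As an illustration of the bracket handling, here is one way a Host header parser might reject trailing content after `]`. The function names match the ones mentioned above, but the bodies and the `ParsedHost` type are a sketch under assumptions, not the runtime's implementation.

```typescript
interface ParsedHost {
  host: string;
  port?: number;
}

function parseHostHeader(header: string): ParsedHost | null {
  const value = header.trim();
  if (value.startsWith("[")) {
    const close = value.indexOf("]");
    if (close === -1) return null;
    const host = value.slice(1, close);
    const rest = value.slice(close + 1);
    if (rest === "") return { host };
    // Anything after the closing bracket must be exactly ":port".
    if (!/^:\d{1,5}$/.test(rest)) return null;
    return { host, port: Number(rest.slice(1)) };
  }
  const match = /^([^:\[\]]+)(?::(\d{1,5}))?$/.exec(value);
  if (!match) return null;
  const [, host, port] = match;
  return port === undefined ? { host } : { host, port: Number(port) };
}

function isLoopbackHostHeader(header: string): boolean {
  const parsed = parseHostHeader(header);
  if (!parsed) return false;
  const host = parsed.host.toLowerCase();
  return host === "localhost" || host === "::1" || /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host);
}
```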
… install (#24128)

* feat(installer): write Chrome native messaging host manifest on macOS install

* fix(build): parenthesize native-host staleness check

Bash || and && are equal-precedence left-to-right, so the unparenthesized
condition incorrectly required bun.lock to also be newer for a package.json
update to trigger a rebuild. Group the bun.lock subexpression explicitly.

* fix(installer): conform InstallError to LocalizedError so localizedDescription is useful
…tive messaging (#24142)

* feat(chrome-extension): bootstrap self-hosted capability token via native messaging

* fix(chrome-extension): nativeMessaging permission, disconnect race, persistence fallback, popup->worker delegation

- Add nativeMessaging permission to manifest so Chrome actually allows
  chrome.runtime.connectNative('com.vellum.daemon')
- Set settled=true synchronously on token_response so a fast onDisconnect
  can't win the race and reject a valid pairing
- On chrome.storage.local.set failure, log and resolve with the in-memory
  token instead of discarding it (single-session fallback)
- Move the pair flow into the service worker via chrome.runtime.sendMessage
  so the popup teardown can't kill the awaited promise mid-flight
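
The disconnect race fix boils down to a settle-once guard. A minimal sketch, assuming hypothetical handler names (`onTokenResponse`, `onDisconnect`) wrapped around a promise's `resolve` and `reject`:

```typescript
interface PairHandlers<T> {
  onTokenResponse(value: T): void;
  onDisconnect(reason: string): void;
}

function createPairSettler<T>(
  resolve: (value: T) => void,
  reject: (error: Error) => void,
): PairHandlers<T> {
  let settled = false;
  return {
    onTokenResponse(value: T): void {
      if (settled) return;
      // Flip the flag before any further work so a racing disconnect is ignored.
      settled = true;
      resolve(value);
    },
    onDisconnect(reason: string): void {
      if (settled) return;
      settled = true;
      reject(new Error(`native host disconnected: ${reason}`));
    },
  };
}
```

Whichever handler fires first wins; a disconnect that arrives right after a valid `token_response` becomes a no-op.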
…ket (#24143)

* feat(chrome-extension): connect to cloud gateway browser-relay WebSocket

* fix(chrome-extension): surface missing-token connect failures and ignore stale socket close events

- Worker now returns an actionable error when the selected relay mode has
  no usable token (cloud not signed in, self-hosted not paired)
- RelayConnection's close listener ignores events from superseded sockets
  so a setMode mid-flight does not nuke the new socket reference
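
The stale-close guard can be sketched with a fake socket type. `SocketLike`, the injected `open` factory, and the method names are assumptions; the point is the identity check inside the close listener.

```typescript
interface SocketLike {
  onclose: (() => void) | null;
  close(): void;
}

class RelayConnection {
  private socket: SocketLike | null = null;
  private readonly open: (mode: string) => SocketLike;

  constructor(open: (mode: string) => SocketLike) {
    this.open = open;
  }

  connect(mode: string): void {
    const previous = this.socket;
    const socket = this.open(mode);
    this.socket = socket;
    socket.onclose = () => {
      // A close event from a superseded socket must not clear the new reference.
      if (this.socket !== socket) return;
      this.socket = null;
    };
    if (previous) previous.close();
  }

  isConnected(): boolean {
    return this.socket !== null;
  }
}
```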
…est round-trip (#24153)

* test(host-browser): e2e smoke test for cloud-hosted host_browser_request round-trip

* test(host-browser): exercise actual timeout path and clarify mock WS header support

- Disconnected test renamed/restructured to use a never-resolving CDP handler
  plus a short timeout_seconds, so the proxy's setTimeout path is actually
  covered
- Removed/implemented extraHandshakeHeaders on the mock fixture so the
  advertised API matches reality
@noanflaherty noanflaherty self-assigned this Apr 7, 2026
@noanflaherty
Contributor Author

Self-review is starting. Results will be posted here when complete.

devin-ai-integration[bot]

This comment was marked as resolved.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1f9190ff33

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +583 to +584
.then((stored: StoredLocalToken) => sendResponse({ ok: true, token: stored }))
.catch((err) => sendResponse({ ok: false, error: err instanceof Error ? err.message : String(err) }));

P1 Badge Reply to popup in self-hosted pair handler

The self-hosted-pair message path never calls the runtime message callback (sendResponseFn); it calls the relay socket helper sendResponse(...) instead. Because this branch returns true (async response expected), popup callers wait for a response that never arrives and end up with a runtime error, so local pairing appears to fail even when the native bootstrap succeeds.

Comment on lines +306 to +307
{ type: "token_response", token, expiresAt },
0,

P1 Badge Include guardianId in native token_response

The native host drops guardianId when it forwards /v1/browser-extension-pair output to Chrome, but bootstrapLocalToken() requires guardianId on token_response and rejects frames without it as malformed. In practice this means successful pair endpoint responses still fail the extension bootstrap path, so no local capability token is accepted/persisted.

Comment on lines +102 to +105
const [token, port] = await Promise.all([getBearerToken(), getRelayPort()]);
const headers: Record<string, string> = { 'content-type': 'application/json' };
if (token) headers.authorization = `Bearer ${token}`;
const resp = await fetch(`http://127.0.0.1:${port}/v1/host-browser-result`, {

P1 Badge Post host_browser results to active relay origin

Result delivery is hardcoded to http://127.0.0.1:<port>/v1/host-browser-result with the local bearer token, even when relay mode is cloud. In cloud mode, host_browser_request frames come from the gateway WebSocket, so posting responses to localhost causes delivery failures/timeouts (especially on machines without a local assistant), and the request never resolves server-side.

@noanflaherty
Contributor Author

Self-review complete

Result: GAPS FOUND (Pass 3 incomplete due to interruption)

Pass 1 — External reviewer feedback: GAPS FOUND

12 actionable items remain after fix cycles, including 4 P0/P1 bugs:

P0 / P1 (functional / security):

  1. PR 4 (feat(channels): add chrome-extension interface id and per-capability host proxy gating #24111) — host_bash/file/cu tools leak into chrome-extension sessions (Codex P1). isToolActiveForContext gates HOST_TOOL_NAMES on !ctx.hasNoClient alone — does not consult supportsHostProxy(transportInterface, toolName). Fix: extend tool gate to call per-capability supportsHostProxy for host tools.
  2. PR 7 (feat(chrome-extension-native-host): add native messaging helper scaffold #24114) — Unauthorized origin race in native helper (Devin P0 / Codex P1). The unauthorized-origin branch in clients/chrome-extension-native-host/src/index.ts calls writeFrameAndExit(...) but does not return; execution falls through to stdin listener setup. If Chrome has already written a request_token frame, the helper can POST it to /v1/browser-extension-pair despite the origin check. Fix: add explicit return and make writeFrameAndExit actually block (currently returns a never-awaited Promise).
  3. PR 13 (feat(chrome-extension): bootstrap self-hosted capability token via native messaging #24142) — Native host emits no guardianId so self-hosted pairing breaks (Codex P1). bootstrapLocalToken requires guardianId in the token_response frame, but the checked-in helper at clients/chrome-extension-native-host/src/index.ts only emits { type: "token_response", token, expiresAt }. Fix: source guardianId from the daemon's /v1/browser-extension-pair HTTP response and forward it in the native messaging frame.
  4. PR 14 (feat(chrome-extension): connect to cloud gateway browser-relay WebSocket #24143) — Popup pairing hangs because of sendResponse/sendResponseFn rename miss (Devin P0). The self-hosted-pair handler in clients/chrome-extension/background/worker.ts still calls the unshadowed module-level sendResponse (which resolves to the WebSocket-relay function) instead of the renamed Chrome runtime callback sendResponseFn. Fix: rename the two callsites in the self-hosted-pair handler.

P2 / quality:
  5. PR 6 (#24112) — cdp-proxy.ts send() Promise constructor catches sync targetToDebuggee throws as rejections; should resolve with error frame.
  6. PR 9 (#24125) — host-browser-dispatcher caches target on "already attached" errors but doesn't evict if subsequent send fails.
  7. PR 14 (#24143) — missingTokenMessage returns "Vellum daemon" in user-facing text; should be "Vellum assistant" per clients/AGENTS.md.
  8. PR 14 (#24143) — clients/chrome-extension/tsconfig.json excludes worker.ts and popup.ts from typecheck. Underlying infra issue that enabled #4 to escape review.
  9. PR 2 (#24115) — host-bash-proxy timeout test uses real timers with timeout_seconds: -2.99 hack; sibling tests use jest.useFakeTimers() for consistency.
  10. PR 16 (#24154) — Test file header narrates history via PR numbers (PR 7/11/13/15/16); violates assistant/AGENTS.md comment rule.
  11. PR 16 (#24154) — Uses test.skip for Phase 3 placeholder; root AGENTS.md requires test.todo.

Pass 2 — Plan faithfulness: GAPS FOUND (1 item)

  1. PR 6 (feat(chrome-extension): add standalone CDP proxy module #24112) — cdp-proxy.ts shipped without unit tests despite the plan's PR 6 acceptance criteria explicitly requiring 7 test cases. The rationale comment ("tests will be added once a test runner is configured") is now stale: PRs 8/9/13/14 all added bun:test tests in the same __tests__/ directory.

Pass 3 — Repo integration review: NOT COMPLETED

Interrupted before completion. The Pass 1 + Pass 2 findings above include the integration concerns that Pass 3 would have flagged most prominently (host tool leak in PR 4, wire format coordination in PR 9/10, popup-worker delegation miss in PR 14).

Recommendation

Address the 4 P0/P1 items before merging to main. The other 7 P2 items can land in a follow-up cleanup PR. Self-hosted pairing and the unauthorized-origin path will not work end-to-end without items 2, 3, and 4.

Feature branch is ready for your review. Because --auto-merge is not set, this PR is presented for manual merge.

@noanflaherty
Contributor Author

@codex review this PR again — 3 P1 regressions and 4 P2/P3 cleanups from the round-2 self-review have been addressed across 6 follow-up PRs (#24224, #24225, #24226, #24227, #24228, #24229). Latest commit: 98b776f

@noanflaherty
Contributor Author

@devin review this PR again — 3 P1 regressions and 4 P2/P3 cleanups from the round-2 self-review have been addressed across 6 follow-up PRs (#24224, #24225, #24226, #24227, #24228, #24229). Latest commit: 98b776f

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

await chrome.storage.local.set(storageUpdate);
chrome.runtime.sendMessage({ type: 'connect' }, (response: { ok: boolean; error?: string }) => {

P2 Badge Persist selected relay mode before sending connect

The popup computes relayMode from the radio state but never writes that value as part of the same connect flow before calling runtime.sendMessage({ type: 'connect' }). If the user toggles mode and immediately clicks Connect, the worker can still use its stale in-memory relayMode (updated only via async storage.onChanged) and attempt the wrong transport, producing incorrect self-hosted/cloud connection behavior.

) {
  const envelopeType = (parsed as { type: string }).type;
  if (envelopeType === 'host_browser_request') {
    if (!cdpProxyEnabled) return;

P1 Badge Reject disabled host_browser requests explicitly

When cdpProxyEnabled is false (the default when vellum.cdpProxyEnabled is unset), host_browser_request frames are dropped with an early return. The daemon still advertises host_browser for the chrome-extension transport, so these calls wait for the full proxy timeout and fail as intermittent tool hangs instead of an immediate, actionable error. This creates a production-facing regression for any extension session that has not enabled the beta toggle.
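
One possible shape for the explicit rejection, sketched as a pure routing function. The `requestId` and `cdpMethod` field names follow the envelope described earlier in this PR; the `Routed` union and the error text are hypothetical, and the sketch leaves out how the rejection is actually delivered back to the daemon.

```typescript
interface HostBrowserRequest {
  type: "host_browser_request";
  requestId: string;
  cdpMethod: string;
}

type Routed =
  | { kind: "dispatch"; request: HostBrowserRequest }
  | { kind: "reject"; requestId: string; error: string }
  | { kind: "ignore" };

function isHostBrowserRequest(value: unknown): value is HostBrowserRequest {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    record["type"] === "host_browser_request" &&
    typeof record["requestId"] === "string" &&
    typeof record["cdpMethod"] === "string"
  );
}

function routeEnvelope(parsed: unknown, cdpProxyEnabled: boolean): Routed {
  if (!isHostBrowserRequest(parsed)) return { kind: "ignore" };
  if (!cdpProxyEnabled) {
    // Answer immediately instead of letting the caller wait for the proxy timeout.
    return {
      kind: "reject",
      requestId: parsed.requestId,
      error: "CDP proxy is disabled (vellum.cdpProxyEnabled is not set)",
    };
  }
  return { kind: "dispatch", request: parsed };
}
```

The caller would post the `reject` outcome as an error result for that `requestId`, so the tool call fails fast with an actionable message.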

@noanflaherty
Contributor Author

Round 3 fix-ups complete

All 4 round-3 findings have been addressed. Feature branch tip is now 9996168c0.

Round-3 PRs merged

| # | PR | Addresses | Severity |
|---|----|-----------|----------|
| 1 | #24233 | Rewrite host-tool gating test comment in forward-looking voice | P1 (style) |
| 2 | #24234 | Tighten bun:test shim so only test.todo has optional callback | P2 |
| 3 | #24235 | Dedupe RelayConnection.mode accessor (keep getCurrentMode) | P3 |
| 4 | #24236 | Worker reads live relay mode from storage on connect() | P3 |

Cumulative totals

Total: 34 merged PRs on the feature branch.

Remaining deferred items (Phase 3 scope)

  1. Gateway service-token loses guardianId on upstream WebSocket relay — cloud mode is feature-flagged off in Phase 2
  2. Runtime's browser-extension-relay/server.ts::handleMessage doesn't yet discriminate host_browser_result envelopes — self-documented as Phase 3 work
  3. ALLOWED_EXTENSION_ORIGINS contains dev placeholder aaaa...aaaa in 3 locations — pre-release TODO
  4. x-guardian-id header trust in http-server.ts::handleBrowserRelayUpgrade — Phase 3 should derive from edge token

Feature branch status

Ready for manual review. The 3 P1 regressions introduced by round-1 fixes are resolved, verified, and pinned with regression tests. All in-scope P2/P3 cleanups have landed. Code quality on the feature branch is now significantly higher than when round-2 review started.

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.

Comment on lines 533 to +549
if (HOST_TOOL_NAMES.has(name)) {
  // Host tools require a connected client — without one, there is no human
  // to approve execution and the guardian auto-approve path would allow
  // unchecked host command execution on the daemon host.
  const capability = HOST_TOOL_TO_CAPABILITY.get(name);
  const transport = ctx.transportInterface;

  // Per-capability check is authoritative for structural support: if the
  // transport cannot service this capability, the tool is filtered out.
  if (transport && capability && !supportsHostProxy(transport, capability)) {
    return false;
  }

  // chrome-extension is its own executor — the extension's popup gates
  // commands via its own UI, and the transport does not use an SSE-level
  // interactive approval channel. hasNoClient is intentionally `true` for
  // chrome-extension turns (chrome-extension is not in INTERACTIVE_INTERFACES)
  // and must not gate host_browser. Trust the per-capability check.
  if (transport === "chrome-extension") {
    return true;
🚩 HOST_TOOL_NAMES and HOST_TOOL_TO_CAPABILITY sets must stay in sync for chrome-extension safety

The isToolActiveForContext function at assistant/src/daemon/conversation-tool-setup.ts:533-555 has a structural fragility: if a future tool is added to HOST_TOOL_NAMES (line 481-487) but NOT to HOST_TOOL_TO_CAPABILITY (line 499-505), the per-capability check at line 539 is skipped (because capability is undefined/falsy), and the transport === "chrome-extension" check at line 548 unconditionally returns true. This would silently grant chrome-extension access to a host tool it cannot service. Currently all five entries are mapped so this is not a live bug, but the two data structures have no compile-time or runtime enforcement that they stay in sync. A guard test or assertion would prevent this from regressing.
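
A guard along the lines suggested above might look like this. The tool and capability names are placeholders (the real sets live in conversation-tool-setup.ts and are not shown in this thread), and `findUnmappedHostTools` is a hypothetical helper.

```typescript
// Placeholder entries; substitute the real sets in an actual guard test.
const HOST_TOOL_NAMES: ReadonlySet<string> = new Set([
  "host_bash",
  "host_file_read",
  "host_browser",
]);

const HOST_TOOL_TO_CAPABILITY: ReadonlyMap<string, string> = new Map([
  ["host_bash", "host_bash"],
  ["host_file_read", "host_file"],
  ["host_browser", "host_browser"],
]);

// Returns every host tool that has no capability mapping.
function findUnmappedHostTools(
  names: ReadonlySet<string>,
  mapping: ReadonlyMap<string, string>,
): string[] {
  return Array.from(names).filter((name) => !mapping.has(name));
}

// Fails loudly at load time (or inside a test) if the two structures drift apart.
const unmapped = findUnmappedHostTools(HOST_TOOL_NAMES, HOST_TOOL_TO_CAPABILITY);
if (unmapped.length > 0) {
  throw new Error(`Host tools missing a capability mapping: ${unmapped.join(", ")}`);
}
```

An alternative is to make the gate itself fail closed, returning `false` when `capability` is undefined, so an unmapped tool is filtered out rather than granted.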

@noanflaherty noanflaherty merged commit 3031a18 into main Apr 8, 2026
13 of 14 checks passed
@noanflaherty noanflaherty deleted the noanflaherty/host-browser-proxy-phase-2 branch April 8, 2026 02:38