Skip to content

fix(daemon): gate host tools by per-capability supportsHostProxy#24195

Merged
noanflaherty merged 1 commit into
noanflaherty/host-browser-proxy-phase-2from
phase-2-fixes/pr-1-host-tool-capability-gate
Apr 8, 2026
Merged

fix(daemon): gate host tools by per-capability supportsHostProxy#24195
noanflaherty merged 1 commit into
noanflaherty/host-browser-proxy-phase-2from
phase-2-fixes/pr-1-host-tool-capability-gate

Conversation

@noanflaherty
Copy link
Copy Markdown
Contributor

@noanflaherty noanflaherty commented Apr 8, 2026

Summary

  • Adds host_browser to HOST_TOOL_NAMES and gates each host tool on supportsHostProxy(transport, capability)
  • Fixes Codex P1: host_bash/file leaked into chrome-extension sessions
  • Adds unit tests covering all transport × capability combinations

Addresses gap 1 from PR #24159 self-review.


Open with Devin

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4630809fb6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// to approve execution and the guardian auto-approve path would allow
// unchecked host command execution on the daemon host.
return !ctx.hasNoClient;
if (ctx.hasNoClient) return false;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Allow host_browser when non-interactive client is connected

This new host-tool branch still hard-fails on hasNoClient, which unintentionally removes host_browser for chrome-extension turns. In runtime routing, chrome-extension is intentionally marked non-interactive (hasNoClient = true) while still wiring a live browser proxy, so this condition now prevents the model from ever seeing host_browser even though supportsHostProxy("chrome-extension", "host_browser") is true. That regresses browser automation for extension sessions while fixing the bash/file leak.

Useful? React with 👍 / 👎.

@noanflaherty noanflaherty merged commit 8470237 into noanflaherty/host-browser-proxy-phase-2 Apr 8, 2026
12 checks passed
@noanflaherty noanflaherty deleted the phase-2-fixes/pr-1-host-tool-capability-gate branch April 8, 2026 01:02
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 potential issues.

View 2 additional findings in Devin Review.

Open in Devin Review

Comment on lines 534 to +551
// Host tools require a connected client — without one, there is no human
// to approve execution and the guardian auto-approve path would allow
// unchecked host command execution on the daemon host.
return !ctx.hasNoClient;
if (ctx.hasNoClient) return false;
// Then gate per-capability against the connected interface so that
// partial-capability transports (e.g. chrome-extension only supports
// `host_browser`) only see host tools their interface can service.
// Without this check, host_bash/host_file_* would surface for
// chrome-extension sessions and the model would attempt calls the
// transport cannot dispatch.
const capability = HOST_TOOL_TO_CAPABILITY.get(name);
const transport = ctx.transportInterface;
// Backwards-compat fallback: contexts that have not yet been plumbed
// with a transport interface (tests, callers that don't pass the new
// field) keep the coarse-grained behavior so we don't accidentally hide
// tools from them.
if (!transport || !capability) return true;
return supportsHostProxy(transport, capability);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Adding host_browser to HOST_TOOL_NAMES breaks chrome-extension sessions where hasNoClient === true

Before this PR, host_browser was not in HOST_TOOL_NAMES, so isToolActiveForContext("host_browser", ctx) fell through to return true — always active regardless of hasNoClient. This PR adds host_browser to HOST_TOOL_NAMES (line 486), which subjects it to the if (ctx.hasNoClient) return false gate at line 537.

For chrome-extension sessions arriving through the SSE sendMessageToConversation path, hasNoClient is intentionally kept true. The comment at assistant/src/daemon/server.ts:1226-1229 explains: "We do NOT call updateClient(onEvent, false) for that case, because flipping hasNoClient false would also enable host_bash/host_file/host_cu tool gating." And assistant/src/daemon/conversation.ts:569-570 confirms: "The chrome-extension interface is non-interactive (so hasNoClient === true), but it DOES have a connected client that can service host_browser_request events."

With hasNoClient === true and host_browser now in HOST_TOOL_NAMES, the early return false fires before the new per-capability supportsHostProxy check is ever reached. The test at line 66-76 masks this by using hasNoClient: false, which doesn't match the actual runtime state for chrome-extension sessions on the SSE path.

Affected code path in isToolActiveForContext
if (HOST_TOOL_NAMES.has(name)) {
    if (ctx.hasNoClient) return false; // ← blocks host_browser for chrome-extension
    // per-capability check never reached
    const capability = HOST_TOOL_TO_CAPABILITY.get(name);
    const transport = ctx.transportInterface;
    if (!transport || !capability) return true;
    return supportsHostProxy(transport, capability);
}

(Refers to lines 486-551)

Prompt for agents
The problem is that host_browser was added to HOST_TOOL_NAMES, which gates it behind the hasNoClient check. But for chrome-extension sessions on the SSE sendMessageToConversation path (server.ts:1232-1240), hasNoClient is intentionally kept true — the code deliberately avoids calling updateClient(onEvent, false) to prevent other host tools from leaking.

Now that per-capability gating via supportsHostProxy(transport, capability) is in place, there are two possible approaches:

1. In isToolActiveForContext, restructure the HOST_TOOL_NAMES branch so that when transportInterface is available, the per-capability check runs BEFORE (or instead of) the hasNoClient check. The hasNoClient check could be the fallback only when no transport is known. This way chrome-extension with hasNoClient=true but transportInterface='chrome-extension' would still pass host_browser through supportsHostProxy.

2. Update server.ts:1232-1240 (sendMessageToConversation) and conversation-process.ts drain queue logic to call updateClient(onEvent, false) for chrome-extension now that per-capability gating handles the host_bash/host_file differentiation. This would set hasNoClient=false for chrome-extension, matching the test expectation. But this has broader implications for isInteractive and other hasNoClient-gated behavior.

Approach 1 is safer and more localized. The test at conversation-tool-setup.test.ts:66-76 should also be updated to cover the actual runtime state (hasNoClient: true with transportInterface: 'chrome-extension').
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +66 to +76
test("host_browser is active for chrome-extension transport", () => {
expect(
isToolActiveForContext(
"host_browser",
makeCtx({
hasNoClient: false,
transportInterface: "chrome-extension",
}),
),
).toBe(true);
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Test uses hasNoClient: false for chrome-extension, masking the real runtime state

The test at line 66-76 (host_browser is active for chrome-extension transport) uses hasNoClient: false, but the actual runtime state for chrome-extension sessions through the SSE sendMessageToConversation path keeps hasNoClient === true (see assistant/src/daemon/server.ts:1226-1229 and assistant/src/daemon/conversation.ts:569-570). This means the test passes but doesn't cover the real-world scenario. A test case with { hasNoClient: true, transportInterface: "chrome-extension" } expecting host_browser to be true would catch the reported bug.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines 534 to +551
// Host tools require a connected client — without one, there is no human
// to approve execution and the guardian auto-approve path would allow
// unchecked host command execution on the daemon host.
return !ctx.hasNoClient;
if (ctx.hasNoClient) return false;
// Then gate per-capability against the connected interface so that
// partial-capability transports (e.g. chrome-extension only supports
// `host_browser`) only see host tools their interface can service.
// Without this check, host_bash/host_file_* would surface for
// chrome-extension sessions and the model would attempt calls the
// transport cannot dispatch.
const capability = HOST_TOOL_TO_CAPABILITY.get(name);
const transport = ctx.transportInterface;
// Backwards-compat fallback: contexts that have not yet been plumbed
// with a transport interface (tests, callers that don't pass the new
// field) keep the coarse-grained behavior so we don't accidentally hide
// tools from them.
if (!transport || !capability) return true;
return supportsHostProxy(transport, capability);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 server.ts hasNoClient workaround for chrome-extension is now potentially obsolete

The code at assistant/src/daemon/server.ts:1221-1240 deliberately avoids calling updateClient(onEvent, false) for chrome-extension to prevent host_bash/host_file tool gating from triggering. Now that per-capability gating via supportsHostProxy(transport, capability) is in place, this workaround may be unnecessary — the per-capability check would correctly block host_bash/host_file for chrome-extension even if hasNoClient were false. Cleaning this up (and setting hasNoClient = false for chrome-extension) would fix the reported bug and simplify the state model. However, there may be other hasNoClient-gated behaviors (e.g. isInteractive at conversation-tool-setup.ts:221) that need auditing first.

(Refers to lines 533-551)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

noanflaherty added a commit that referenced this pull request Apr 8, 2026
* chore: regenerate openapi.yaml for version 0.6.2 bump

The main-branch release commit (#24108) bumped assistant/package.json to
0.6.2 but did not regenerate the openapi spec. Regenerate it on the feature
branch so CI's OpenAPI Spec Check passes for Phase 2 PRs.

* fix(daemon): backport host-browser-proxy defensive guards to host-bash/file/cu proxies (#24115)

* docs(browser): document chrome.debugger infobar decision (#24106)

* feat(clients/macos): decode host_browser_request and host_browser_cancel messages (#24113)

* feat(clients/macos): decode host_browser_request and host_browser_cancel messages

* fix: type HostBrowserRequest.timeoutSeconds as Double?

Matches the daemon's number-typed wire contract and mirrors
HostBashRequest.timeoutSeconds, so fractional timeouts like 0.01s don't
throw a type-mismatch and drop the whole host_browser_request event.

* feat(browser-session): add BrowserSessionManager scaffold with extension backend stub (#24110)

* feat(browser-session): add BrowserSessionManager scaffold with extension backend stub

* test(browser-session): import public API via index.ts to satisfy knip

Updates manager.test.ts to consume BrowserSessionManager, createExtensionBackend,
and types through the public ../index.js entry point instead of deep-importing
../manager.js and ../backends/extension.js. This keeps knip happy during the
scaffold phase: index.ts becomes a transitively-reachable entry point from
src/**/__tests__/**/*.ts before any production module consumes it.

* fix(browser-session): enforce session existence in BrowserSessionManager.send

Throws when the caller passes a sessionId that doesn't exist or has
been disposed. Still advisory for single-backend Phase 2, but makes
disposeSession() an actual enforcement boundary so commands can't run
against stale ids once Phase 4 adds multi-backend routing.

* feat(chrome-extension): add standalone CDP proxy module (#24112)

* feat(chrome-extension): add standalone CDP proxy module

* fix(chrome-extension): inject runtime.lastError and thread sessionId through CDP proxy

- Add runtime.lastError to ChromeDebuggerApi so mocked tests can surface errors
- Fold frame.sessionId into sendCommand params for flat-session routing
- Extract sessionId from event params when building CdpEventFrame
- Document flat-session handling in the module docstring

* fix(chrome-extension): route flat-session sessionId through DebuggerSession target

Chrome 125+ debugger.sendCommand takes sessionId on the target argument
(DebuggerSession), not inside commandParams. Switch back to passing
sessionId on the target. Same change on the onEvent listener — read
sessionId from 'source' rather than params, since flat-session events
surface it on the source.

Also clean up the module docstring to drop PR-level narrative per
clients/AGENTS.md's comment quality rule.

* fix(chrome-extension): bind defaultChromeDebuggerApi methods to chrome.debugger

Returning methods from a Proxy via Reflect.get without binding causes
'Illegal invocation' at runtime because Chrome's native bindings check
this against the original chrome.debugger object. Replace the Proxy with
a plain object whose methods are explicitly bound.

* feat(chrome-extension-native-host): add native messaging helper scaffold (#24114)

* feat(chrome-extension-native-host): add native messaging helper scaffold

* fix(chrome-extension-native-host): robust port discovery, JSON error handling, and assistant terminology

- Add --assistant-port CLI arg so Chrome-spawned helpers can be pointed
  at a non-default port when the lockfile isn't present
- Surface malformed stdin JSON as a protocol-level error frame instead
  of a silent crash
- Rename user-facing 'daemon' to 'assistant' in error messages per
  AGENTS.md terminology rule

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(chrome-extension-native-host): finish daemon→assistant rename in client prose, vars, and smoke test

- README section header and prose use 'assistant' (per root AGENTS.md §139)
- DEFAULT_DAEMON_PORT → DEFAULT_ASSISTANT_PORT, resolveDaemonPort → resolveAssistantPort (per clients/AGENTS.md §403-404)
- Smoke test example uses dynamic import() instead of require() since the package is ESM

* fix(chrome-extension-native-host): flush stdout before exiting

Wait for process.stdout.write callback to fire before calling
process.exit(), so the native-messaging frame actually reaches Chrome
on pipe-backed stdout before the process terminates. Without this,
Chrome can see a disconnect instead of the intended token_response
or error frame under backpressure or larger payloads.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(chrome-extension): add cloud OAuth sign-in skeleton (#24117)

* feat(chrome-extension): add cloud OAuth sign-in skeleton

* fix(chrome-extension): run OAuth sign-in from service worker and validate guardianId

- Popup now sends a message to the background worker to initiate cloud
  sign-in instead of running launchWebAuthFlow directly. This avoids
  the MV3 popup teardown race where the awaited OAuth promise never
  resolves if the popup blurs during the auth window.
- Add guardianId type check to getStoredToken so malformed stored
  tokens can't leak 'Signed in as guardian:undefined' into the popup UI.

* feat(channels): add chrome-extension interface id and per-capability host proxy gating (#24111)

* feat(channels): add chrome-extension interface id and per-capability host proxy gating

* fix(channels): keep hostBrowserProxy available for non-interactive chrome-extension interfaces

updateClient/drain-queue paths used !isInteractive as a proxy for
hasNoClient, which incorrectly marks the chrome-extension's
hostBrowserProxy unavailable immediately after construction.
Decouple the flags: chrome-extension is non-interactive (no prompter
UI) but still has a connected client for host_browser_request events.

- conversation-routes.ts: derive hasNoClient as !(isInteractive || supportsHostProxy(sourceInterface, 'host_browser'))
- server.ts persistAndProcessMessage: same pattern so queued sends don't lose availability
- conversation-process.ts drain queue: add restore path via new Conversation.restoreBrowserProxyAvailability() helper
- conversation.ts: add restoreBrowserProxyAvailability() that re-enables only the browser proxy (gated on hasNoClient)
- channels/types.ts: clarify supportsHostProxy no-arg JSDoc to call out the desktop-only semantics
- conversation-confirmation-signals.test.ts: cover the new restore helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): targeted hostBrowserProxy enable without relaxing hasNoClient

Cycle 1 derived hasNoClient as !(isInteractive || supportsHostProxy(id, 'host_browser')) to
keep the chrome-extension's browser proxy available. That inadvertently made tool gating treat
the conversation as fully interactive (isInteractive derives from !ctx.hasNoClient), enabling
host_bash/host_file tools that chrome-extension can't service.

Revert to the literal hasNoClient = !isInteractive and instead call a targeted
restoreBrowserProxyAvailability() after updateClient. The helper now enables the browser
proxy regardless of hasNoClient so the single-proxy chrome-extension turn works without
leaking host_bash/host_file tool availability.

Part of JARVIS-1175

* fix(channels): drop 'historically' from JSDoc and tighten chrome-extension else-if in server.ts

- assistant/AGENTS.md: comments describe current state, not history
- server.ts: scope the non-interactive host-browser restore branch to interfaces that
  specifically only support host_browser (not macos, which hits the interactive branch)

* test: add restoreBrowserProxyAvailability to Conversation mocks

Two test files use object-literal mocks for Conversation that need the
new method so they don't throw TypeError at the new call site in
handleSendMessage.

* fix(routes): optional-chain restoreBrowserProxyAvailability for test mocks

* test: allowlist chrome-extension-native-host in gateway-only guard

The native messaging helper intentionally POSTs to the local daemon's
/v1/browser-extension-pair endpoint on 127.0.0.1 to mint capability
tokens for the extension; it's a bootstrap path that cannot and should
not go through the gateway. Add it to the guard-test allowlist.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(runtime): route host_browser_request to connected chrome-extension clients (#24129)

* feat(runtime): route host_browser_request to connected chrome-extension clients

* fix(runtime): gateway guardianId plumbing + queue-drain-safe chrome-extension sender

- handleBrowserRelayUpgrade now looks for x-guardian-id header/query param as a
  fallback when the JWT sub is a service token (gateway-forwarded case)
- Conversation exposes hostBrowserSenderOverride so restoreBrowserProxyAvailability
  preserves the registry-routed sender on drain-queue restores instead of clobbering
  it with the SSE hub sender

* feat(chrome-extension): dispatch host_browser_request frames via CDP proxy behind feature flag (#24125)

* feat(chrome-extension): dispatch host_browser_request frames via CDP proxy behind feature flag

* fix(chrome-extension): use camelCase wire format, tolerate re-attach, guard postResult catch

- Match daemon's actual host_browser_request envelope shape (requestId, cdpMethod,
  cdpParams, cdpSessionId — only timeout_seconds stays snake_case)
- POST /v1/host-browser-result with camelCase keys to match the runtime schema
- Track attached CDP targets and skip re-attach; dispose clears the set
- Wrap postResult calls inside the catch handler so a secondary failure is logged
  instead of becoming an unhandled rejection

* fix(chrome-extension): invalidate attachedTargets cache on debugger detach

Subscribe to CdpProxy.onDetach in the dispatcher and remove the
corresponding key from the attached-targets cache when Chrome notifies
us of a detach (tab close, navigation, infobar cancel, external
takeover). Without this, the cache held a stale entry forever and
subsequent commands skipped the re-attach, causing permanent CDP
failures.

* feat(runtime): add /v1/browser-extension-pair capability token endpoint (#24130)

* feat(runtime): add /v1/browser-extension-pair capability token endpoint

* fix(runtime): align pair endpoint with native helper contract + move secret out of workspace

- Accept extensionOrigin (preferred) and origin (legacy) in request body
- Return expiresAt as ISO 8601 string instead of numeric ms, matching what the
  chrome-extension-native-host helper validates
- Move capabilityTokenSecret out of workspace/data into protected storage alongside
  the actor-token-signing-key per AGENTS.md workspace-isolation rule
- Fix isLoopbackHostHeader to correctly parse IPv6 bracket notation

* fix(runtime): align pair allowlist with native helper + reject malformed bracketed Host headers

- ALLOWED_EXTENSION_ORIGINS now matches the chrome-extension-native-host
  placeholder so the dev pair flow works end-to-end
- parseHostHeader rejects inputs like '[::1]attacker.com' where content
  after the closing bracket is not an optional ':port'

* feat(installer): write Chrome native messaging host manifest on macOS install (#24128)

* feat(installer): write Chrome native messaging host manifest on macOS install

* fix(build): parenthesize native-host staleness check

Bash || and && are equal-precedence left-to-right, so the unparenthesized
condition incorrectly required bun.lock to also be newer for a package.json
update to trigger a rebuild. Group the bun.lock subexpression explicitly.

* fix(installer): conform InstallError to LocalizedError so localizedDescription is useful

* feat(chrome-extension): bootstrap self-hosted capability token via native messaging (#24142)

* feat(chrome-extension): bootstrap self-hosted capability token via native messaging

* fix(chrome-extension): nativeMessaging permission, disconnect race, persistence fallback, popup->worker delegation

- Add nativeMessaging permission to manifest so Chrome actually allows
  chrome.runtime.connectNative('com.vellum.daemon')
- Set settled=true synchronously on token_response so a fast onDisconnect
  can't win the race and reject a valid pairing
- On chrome.storage.local.set failure, log and resolve with the in-memory
  token instead of discarding it (single-session fallback)
- Move the pair flow into the service worker via chrome.runtime.sendMessage
  so the popup teardown can't kill the awaited promise mid-flight

* feat(chrome-extension): connect to cloud gateway browser-relay WebSocket (#24143)

* feat(chrome-extension): connect to cloud gateway browser-relay WebSocket

* fix(chrome-extension): surface missing-token connect failures and ignore stale socket close events

- Worker now returns an actionable error when the selected relay mode has
  no usable token (cloud not signed in, self-hosted not paired)
- RelayConnection's close listener ignores events from superseded sockets
  so a setMode mid-flight does not nuke the new socket reference

* test(host-browser): e2e smoke test for self-hosted native-messaging capability bootstrap (#24154)

* test(host-browser): e2e smoke test for cloud-hosted host_browser_request round-trip (#24153)

* test(host-browser): e2e smoke test for cloud-hosted host_browser_request round-trip

* test(host-browser): exercise actual timeout path and clarify mock WS header support

- Disconnected test renamed/restructured to use a never-resolving CDP handler
  plus a short timeout_seconds, so the proxy's setTimeout path is actually
  covered
- Removed/implemented extraHandshakeHeaders on the mock fixture so the
  advertised API matches reality

* test(cdp-proxy): add unit tests and fix sync targetToDebuggee throw (#24187)

* fix(chrome-extension): evict attached-target cache on CDP send failure (#24188)

* test(host-browser-e2e): rewrite header and convert test.skip to test.todo (#24190)

* test(host-bash-proxy): use bun:test fake timers for timeout regression test (#24189)

* fix(chrome-extension): popup pairing reply + relay-aware host_browser result POST (#24194)

* fix(chrome-extension-native-host): halt unauthorized origins and forward guardianId (#24192)

* fix(daemon): gate host tools by per-capability supportsHostProxy (#24195)

* chore(chrome-extension): typecheck worker.ts + popup.ts and use "assistant" terminology (#24199)

* fix(chrome-extension): popup connect handler honors selected relay mode (#24225)

* chore(chrome-extension): extend bun:test ambient shim with common symbols (#24226)

* fix(daemon): preserve host_browser for chrome-extension in per-capability tool gate (#24224)

* fix(chrome-extension): read live relay mode per request + defensive worker cleanups (#24227)

* chore(chrome-extension): remove stale cdp-proxy declarations and outdated comment (#24228)

* chore(chrome-extension-native-host): split writeFrameAndExit + rewrite history-narrating docstrings (#24229)

* chore(chrome-extension): tighten bun:test shim so only test.todo has optional callback (#24234)

* chore(daemon): rewrite host-tool gating test comment in forward-looking voice (#24233)

* chore(chrome-extension): dedupe RelayConnection.mode accessor (keep getCurrentMode) (#24235)

* fix(chrome-extension): worker reads live relay mode from storage on connect (#24236)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant