
fix(chrome-extension): popup pairing reply + relay-aware host_browser result POST #24194

Merged
noanflaherty merged 2 commits into noanflaherty/host-browser-proxy-phase-2 from phase-2-fixes/pr-3-worker-p1-fixes
Apr 8, 2026

Conversation

@noanflaherty
Contributor

@noanflaherty noanflaherty commented Apr 8, 2026

Summary

  • Fixes Devin P0 / Codex P1: self-hosted-pair branch now calls sendResponseFn (the chrome.runtime callback) instead of the module-level WebSocket relay sendResponse, so the popup pairing promise actually resolves
  • Fixes Devin/Codex P1: postHostBrowserResult is now relay-aware. In cloud mode it sends results over the WebSocket (or no-ops if unsupported by the gateway today). In self-hosted mode it continues POSTing to the local daemon
  • Adds unit tests for both modes

Addresses gaps 4 and 5 from PR #24159 self-review.
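
The relay-aware behavior described in the second bullet can be sketched roughly as below. This is a minimal illustration, not the code from this PR: the `RelayMode` shape, the `'self-hosted'` kind name, `baseUrl`, the `content` field, and the injected `httpPost` helper are all assumptions made so the example runs on its own. Only the `/v1/host-browser-result` path, the `host_browser_result` frame type, and the warning text come from the thread.

```typescript
// Hypothetical shapes for illustration; the real types live in relay-connection.ts.
type RelayMode =
  | { kind: 'cloud'; token: string }
  | { kind: 'self-hosted'; token: string; baseUrl: string };

interface RelaySocket {
  isOpen(): boolean;
  send(data: string): void;
}

// Only requestId is confirmed by the PR thread; `content` is a placeholder field.
interface HostBrowserResult {
  requestId: string;
  content: string;
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Injected so the sketch runs without a network; a real caller would wrap fetch().
type HttpPost = (request: HttpRequest) => Promise<{ ok: boolean; status: number }>;

type Outcome = 'sent-ws' | 'sent-http' | 'dropped';

async function postResultSketch(
  mode: RelayMode,
  connection: RelaySocket | null,
  result: HostBrowserResult,
  httpPost: HttpPost,
): Promise<Outcome> {
  if (mode.kind === 'cloud') {
    // Cloud mode: the result travels over the already-open relay WebSocket.
    if (!connection || !connection.isOpen()) {
      console.warn('[vellum-relay] host-browser-result dropped: cloud relay not connected');
      return 'dropped';
    }
    connection.send(JSON.stringify({ type: 'host_browser_result', ...result }));
    return 'sent-ws';
  }

  // Self-hosted mode: POST to the local daemon with the bearer token.
  const response = await httpPost({
    url: `${mode.baseUrl}/v1/host-browser-result`,
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${mode.token}`,
    },
    body: JSON.stringify(result),
  });
  if (!response.ok) {
    console.warn(`[vellum-relay] host-browser-result POST failed: ${response.status}`);
    return 'dropped';
  }
  return 'sent-http';
}
```

Returning an outcome string instead of `void` is a choice made for this sketch so callers and tests can tell a dropped result from a delivered one.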



@noanflaherty noanflaherty merged commit be9c866 into noanflaherty/host-browser-proxy-phase-2 Apr 8, 2026
1 check passed
@noanflaherty noanflaherty deleted the phase-2-fixes/pr-3-worker-p1-fixes branch April 8, 2026 00:58

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6410fa29f5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +126 to +127
if (activeRelayMode) {
  return postHostBrowserResult(activeRelayMode, relayConnection, result);

P1: Refresh host-browser POST token after reconnect

dispatchHostBrowserResult sends results with the activeRelayMode snapshot captured in connect(), but that snapshot is never updated when the self-hosted socket reconnects and rotates credentials. In the reconnect path, RelayConnection can replace its internal mode.token after onReconnect returns a fresh token, yet this call still posts with the stale bearer token from the old snapshot, so /v1/host-browser-result can start returning 401s after any token refresh/abnormal reconnect cycle. This regresses the previous behavior that read current token state per request.


Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 2 potential issues.


Comment on lines +126 to +127
if (activeRelayMode) {
  return postHostBrowserResult(activeRelayMode, relayConnection, result);

🔴 activeRelayMode token becomes stale after WebSocket reconnection with token refresh

activeRelayMode is captured once at connect() time (worker.ts:280) and never updated when the RelayConnection internally refreshes the token during reconnection. When the WebSocket drops with a non-normal close code, scheduleReconnectWithRefresh (relay-connection.ts:233-256) calls the onReconnect callback (worker.ts:226-237), which runs refreshToken() to store a new token in chrome.storage.local and returns it. The RelayConnection updates its internal deps.mode.token, but activeRelayMode in worker.ts retains the original token.

When dispatchHostBrowserResult (worker.ts:126-127) is subsequently called, it passes the stale activeRelayMode to postHostBrowserResult, which uses mode.token for the HTTP Authorization: Bearer header in self-hosted mode (relay-connection.ts:319). The old token may have been invalidated by the daemon, causing silent 401/403 failures on the /v1/host-browser-result POST — meaning CDP results are silently dropped.

This is a behavioral regression: the old postHostBrowserResult (worker.ts:101-114, deleted in this PR) called getBearerToken() on every invocation, always fetching the freshest token from chrome.storage.local.

Prompt for agents
The `activeRelayMode` variable is set once at connect time and never updated when the RelayConnection internally refreshes the token during reconnection. This means `dispatchHostBrowserResult` uses a stale token for the self-hosted HTTP POST Authorization header.

Two possible approaches:

1. In `dispatchHostBrowserResult`, instead of using `activeRelayMode` directly, read the live token from `relayConnection.mode` (which IS updated by the reconnect handler) and construct the RelayMode from that. Something like: `const liveMode = relayConnection?.mode ?? activeRelayMode`. The `RelayConnection.mode` getter returns `this.deps.mode` which is kept up-to-date by `scheduleReconnectWithRefresh`.

2. Subscribe to token changes: have the `onReconnect` callback in `createRelayConnection` also update `activeRelayMode` when a new token is returned. This could be done by making the callback update the module-level variable, e.g. adding `if (refreshed) activeRelayMode = { ...activeRelayMode!, token: refreshed };` after the refresh succeeds.

Approach 1 is simpler and more robust since it always reads the latest state from the connection itself.
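
A minimal sketch of approach 1, under stated assumptions: `FakeRelayConnection`, `applyRefreshedToken`, `resolveLiveMode`, and `authorizationHeader` are hypothetical names invented for this example, and the `RelayMode` shape is simplified. It only illustrates why reading the mode from the connection at call time avoids the stale snapshot.

```typescript
// Simplified, hypothetical RelayMode; the real type may carry more fields.
type RelayMode = { kind: 'self-hosted' | 'cloud'; token: string };

// Stand-in for RelayConnection: `mode` always reflects the latest token.
class FakeRelayConnection {
  private current: RelayMode;

  constructor(initial: RelayMode) {
    this.current = initial;
  }

  get mode(): RelayMode {
    return this.current;
  }

  // Mimics the internal token swap the review attributes to the reconnect path.
  applyRefreshedToken(token: string): void {
    this.current = { ...this.current, token };
  }
}

// Prefer the connection's live mode; fall back to the connect-time snapshot.
function resolveLiveMode(
  connection: FakeRelayConnection | null,
  snapshot: RelayMode | null,
): RelayMode | null {
  return connection?.mode ?? snapshot;
}

function authorizationHeader(mode: RelayMode): string {
  return `Bearer ${mode.token}`;
}

// Usage: the snapshot still holds the old token, but the header uses the new one.
const connectTimeSnapshot: RelayMode = { kind: 'self-hosted', token: 'old-token' };
const relay = new FakeRelayConnection(connectTimeSnapshot);
relay.applyRefreshedToken('new-token');
const liveMode = resolveLiveMode(relay, connectTimeSnapshot);
const header = liveMode ? authorizationHeader(liveMode) : null;
```

Resolving the mode inside the dispatch function, rather than caching it in a module-level variable, keeps a single source of truth for the token.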

Comment on lines +304 to +312
if (mode.kind === 'cloud') {
  if (!connection || !connection.isOpen()) {
    console.warn(
      '[vellum-relay] host-browser-result dropped: cloud relay not connected',
    );
    return;
  }
  connection.send(JSON.stringify({ type: 'host_browser_result', ...result }));
  return;

🚩 Cloud mode result posting silently drops results when connection is temporarily down

In cloud mode, if the WebSocket is temporarily disconnected (e.g., during reconnection backoff), postHostBrowserResult at relay-connection.ts:305-309 logs a warning and returns without delivering the result. Unlike self-hosted mode which has an HTTP fallback, there's no retry or queuing mechanism for cloud mode. This means CDP results generated during a brief reconnection window are permanently lost. The docstring explicitly acknowledges this as intentional for Phase 2 (the cloud CDP path is feature-flagged off), but it's worth noting for Phase 3 when the cloud path goes live — a queue-and-retry or at minimum a more prominent error propagation may be needed.
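
One possible shape for the queue-and-retry idea mentioned above is a small bounded buffer that is flushed when the socket reopens. This is a hedged sketch only: `PendingResultQueue`, its method names, and the default bound of 32 are hypothetical and are not part of this PR or of any Phase 3 plan stated in the thread.

```typescript
interface RelaySocket {
  isOpen(): boolean;
  send(data: string): void;
}

// Hypothetical bounded buffer for result frames produced while the relay is down.
class PendingResultQueue {
  private readonly pending: string[] = [];
  private readonly maxSize: number;

  constructor(maxSize = 32) {
    this.maxSize = maxSize;
  }

  /** Sends immediately when the socket is open; otherwise buffers the frame. */
  enqueueOrSend(socket: RelaySocket | null, frame: string): 'sent' | 'queued' {
    if (socket && socket.isOpen()) {
      socket.send(frame);
      return 'sent';
    }
    if (this.pending.length >= this.maxSize) {
      this.pending.shift(); // Bound memory by dropping the oldest frame.
    }
    this.pending.push(frame);
    return 'queued';
  }

  /** Intended to be called from the socket's open/reconnect handler. */
  flush(socket: RelaySocket): number {
    let flushed = 0;
    while (this.pending.length > 0 && socket.isOpen()) {
      socket.send(this.pending.shift() as string);
      flushed++;
    }
    return flushed;
  }

  get size(): number {
    return this.pending.length;
  }
}
```

A real implementation would probably also need to expire entries, since a result delivered after the requester has already timed out is of no use to it.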


noanflaherty added a commit that referenced this pull request Apr 8, 2026
* chore: regenerate openapi.yaml for version 0.6.2 bump

The main-branch release commit (#24108) bumped assistant/package.json to
0.6.2 but did not regenerate the openapi spec. Regenerate it on the feature
branch so CI's OpenAPI Spec Check passes for Phase 2 PRs.

* fix(daemon): backport host-browser-proxy defensive guards to host-bash/file/cu proxies (#24115)

* docs(browser): document chrome.debugger infobar decision (#24106)

* feat(clients/macos): decode host_browser_request and host_browser_cancel messages (#24113)

* feat(clients/macos): decode host_browser_request and host_browser_cancel messages

* fix: type HostBrowserRequest.timeoutSeconds as Double?

Matches the daemon's number-typed wire contract and mirrors
HostBashRequest.timeoutSeconds, so fractional timeouts like 0.01s don't
throw a type-mismatch and drop the whole host_browser_request event.

* feat(browser-session): add BrowserSessionManager scaffold with extension backend stub (#24110)

* feat(browser-session): add BrowserSessionManager scaffold with extension backend stub

* test(browser-session): import public API via index.ts to satisfy knip

Updates manager.test.ts to consume BrowserSessionManager, createExtensionBackend,
and types through the public ../index.js entry point instead of deep-importing
../manager.js and ../backends/extension.js. This keeps knip happy during the
scaffold phase: index.ts becomes a transitively-reachable entry point from
src/**/__tests__/**/*.ts before any production module consumes it.

* fix(browser-session): enforce session existence in BrowserSessionManager.send

Throws when the caller passes a sessionId that doesn't exist or has
been disposed. Still advisory for single-backend Phase 2, but makes
disposeSession() an actual enforcement boundary so commands can't run
against stale ids once Phase 4 adds multi-backend routing.

* feat(chrome-extension): add standalone CDP proxy module (#24112)

* feat(chrome-extension): add standalone CDP proxy module

* fix(chrome-extension): inject runtime.lastError and thread sessionId through CDP proxy

- Add runtime.lastError to ChromeDebuggerApi so mocked tests can surface errors
- Fold frame.sessionId into sendCommand params for flat-session routing
- Extract sessionId from event params when building CdpEventFrame
- Document flat-session handling in the module docstring

* fix(chrome-extension): route flat-session sessionId through DebuggerSession target

Chrome 125+ debugger.sendCommand takes sessionId on the target argument
(DebuggerSession), not inside commandParams. Switch back to passing
sessionId on the target. Same change on the onEvent listener — read
sessionId from 'source' rather than params, since flat-session events
surface it on the source.

Also clean up the module docstring to drop PR-level narrative per
clients/AGENTS.md's comment quality rule.

* fix(chrome-extension): bind defaultChromeDebuggerApi methods to chrome.debugger

Returning methods from a Proxy via Reflect.get without binding causes
'Illegal invocation' at runtime because Chrome's native bindings check
this against the original chrome.debugger object. Replace the Proxy with
a plain object whose methods are explicitly bound.
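
The failure mode in this commit can be reproduced outside Chrome with an analogy. In the sketch below, a class with a private field stands in for a native binding that checks its receiver. `NativeDebuggerLike` and `tryAttach` are hypothetical names, and the error raised is a plain `TypeError` rather than Chrome's 'Illegal invocation', so treat it as an illustration of the mechanism only.

```typescript
// Stand-in for a native binding: the private field acts like a receiver check.
class NativeDebuggerLike {
  #attached = new Set<number>();

  attach(tabId: number): string {
    this.#attached.add(tabId); // Throws if `this` is not a real instance.
    return `attached:${tabId}`;
  }
}

const nativeApi = new NativeDebuggerLike();

// Proxy that hands back the raw method; the call then runs with `this` = proxy.
const proxiedApi = new Proxy(nativeApi, {
  get(target, prop, receiver) {
    return Reflect.get(target, prop, receiver);
  },
});

// Plain object with explicitly bound methods; `this` is always the original.
const boundApi = {
  attach: nativeApi.attach.bind(nativeApi),
};

function tryAttach(api: { attach(tabId: number): string }, tabId: number): string {
  try {
    return api.attach(tabId);
  } catch (error) {
    return error instanceof TypeError ? 'TypeError' : 'other error';
  }
}

const viaProxy = tryAttach(proxiedApi, 1); // Receiver is the proxy, so the check fails.
const viaBound = tryAttach(boundApi, 2); // Receiver is the original object.
```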

* feat(chrome-extension-native-host): add native messaging helper scaffold (#24114)

* feat(chrome-extension-native-host): add native messaging helper scaffold

* fix(chrome-extension-native-host): robust port discovery, JSON error handling, and assistant terminology

- Add --assistant-port CLI arg so Chrome-spawned helpers can be pointed
  at a non-default port when the lockfile isn't present
- Surface malformed stdin JSON as a protocol-level error frame instead
  of a silent crash
- Rename user-facing 'daemon' to 'assistant' in error messages per
  AGENTS.md terminology rule

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(chrome-extension-native-host): finish daemon→assistant rename in client prose, vars, and smoke test

- README section header and prose use 'assistant' (per root AGENTS.md §139)
- DEFAULT_DAEMON_PORT → DEFAULT_ASSISTANT_PORT, resolveDaemonPort → resolveAssistantPort (per clients/AGENTS.md §403-404)
- Smoke test example uses dynamic import() instead of require() since the package is ESM

* fix(chrome-extension-native-host): flush stdout before exiting

Wait for process.stdout.write callback to fire before calling
process.exit(), so the native-messaging frame actually reaches Chrome
on pipe-backed stdout before the process terminates. Without this,
Chrome can see a disconnect instead of the intended token_response
or error frame under backpressure or larger payloads.
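
A rough sketch of the flush-then-exit pattern, assuming Chrome's native messaging framing (a 32-bit length prefix in native byte order, little-endian on common platforms, followed by UTF-8 JSON). `encodeNativeMessage`, `writeFrameThenExitSketch`, and `WritableLike` are hypothetical names; the output stream and exit function are injected so the example can run without terminating the process.

```typescript
// Encodes one native-messaging frame: 4-byte little-endian length + JSON body.
function encodeNativeMessage(message: unknown): Uint8Array {
  const body = new TextEncoder().encode(JSON.stringify(message));
  const frame = new Uint8Array(4 + body.length);
  new DataView(frame.buffer).setUint32(0, body.length, true);
  frame.set(body, 4);
  return frame;
}

// Minimal subset of a Node writable stream such as process.stdout.
interface WritableLike {
  write(chunk: Uint8Array, callback: (error?: Error | null) => void): boolean;
}

// Exits only after the write callback fires, so the frame is handed to the pipe
// before the process terminates.
function writeFrameThenExitSketch(
  out: WritableLike,
  message: unknown,
  exit: (code: number) => void,
  code = 0,
): void {
  out.write(encodeNativeMessage(message), () => exit(code));
}

// In a real helper this might be called as:
//   writeFrameThenExitSketch(process.stdout, { type: 'token_response' }, process.exit);
```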

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(chrome-extension): add cloud OAuth sign-in skeleton (#24117)

* feat(chrome-extension): add cloud OAuth sign-in skeleton

* fix(chrome-extension): run OAuth sign-in from service worker and validate guardianId

- Popup now sends a message to the background worker to initiate cloud
  sign-in instead of running launchWebAuthFlow directly. This avoids
  the MV3 popup teardown race where the awaited OAuth promise never
  resolves if the popup blurs during the auth window.
- Add guardianId type check to getStoredToken so malformed stored
  tokens can't leak 'Signed in as guardian:undefined' into the popup UI.

* feat(channels): add chrome-extension interface id and per-capability host proxy gating (#24111)

* feat(channels): add chrome-extension interface id and per-capability host proxy gating

* fix(channels): keep hostBrowserProxy available for non-interactive chrome-extension interfaces

updateClient/drain-queue paths used !isInteractive as a proxy for
hasNoClient, which incorrectly marks the chrome-extension's
hostBrowserProxy unavailable immediately after construction.
Decouple the flags: chrome-extension is non-interactive (no prompter
UI) but still has a connected client for host_browser_request events.

- conversation-routes.ts: derive hasNoClient as !(isInteractive || supportsHostProxy(sourceInterface, 'host_browser'))
- server.ts persistAndProcessMessage: same pattern so queued sends don't lose availability
- conversation-process.ts drain queue: add restore path via new Conversation.restoreBrowserProxyAvailability() helper
- conversation.ts: add restoreBrowserProxyAvailability() that re-enables only the browser proxy (gated on hasNoClient)
- channels/types.ts: clarify supportsHostProxy no-arg JSDoc to call out the desktop-only semantics
- conversation-confirmation-signals.test.ts: cover the new restore helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): targeted hostBrowserProxy enable without relaxing hasNoClient

Cycle 1 derived hasNoClient as !(isInteractive || supportsHostProxy(id, 'host_browser')) to
keep the chrome-extension's browser proxy available. That inadvertently made tool gating treat
the conversation as fully interactive (isInteractive derives from !ctx.hasNoClient), enabling
host_bash/host_file tools that chrome-extension can't service.

Revert to the literal hasNoClient = !isInteractive and instead call a targeted
restoreBrowserProxyAvailability() after updateClient. The helper now enables the browser
proxy regardless of hasNoClient so the single-proxy chrome-extension turn works without
leaking host_bash/host_file tool availability.

Part of JARVIS-1175

* fix(channels): drop 'historically' from JSDoc and tighten chrome-extension else-if in server.ts

- assistant/AGENTS.md: comments describe current state, not history
- server.ts: scope the non-interactive host-browser restore branch to interfaces that
  specifically only support host_browser (not macos, which hits the interactive branch)

* test: add restoreBrowserProxyAvailability to Conversation mocks

Two test files use object-literal mocks for Conversation that need the
new method so they don't throw TypeError at the new call site in
handleSendMessage.

* fix(routes): optional-chain restoreBrowserProxyAvailability for test mocks

* test: allowlist chrome-extension-native-host in gateway-only guard

The native messaging helper intentionally POSTs to the local daemon's
/v1/browser-extension-pair endpoint on 127.0.0.1 to mint capability
tokens for the extension; it's a bootstrap path that cannot and should
not go through the gateway. Add it to the guard-test allowlist.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(runtime): route host_browser_request to connected chrome-extension clients (#24129)

* feat(runtime): route host_browser_request to connected chrome-extension clients

* fix(runtime): gateway guardianId plumbing + queue-drain-safe chrome-extension sender

- handleBrowserRelayUpgrade now looks for x-guardian-id header/query param as a
  fallback when the JWT sub is a service token (gateway-forwarded case)
- Conversation exposes hostBrowserSenderOverride so restoreBrowserProxyAvailability
  preserves the registry-routed sender on drain-queue restores instead of clobbering
  it with the SSE hub sender

* feat(chrome-extension): dispatch host_browser_request frames via CDP proxy behind feature flag (#24125)

* feat(chrome-extension): dispatch host_browser_request frames via CDP proxy behind feature flag

* fix(chrome-extension): use camelCase wire format, tolerate re-attach, guard postResult catch

- Match daemon's actual host_browser_request envelope shape (requestId, cdpMethod,
  cdpParams, cdpSessionId — only timeout_seconds stays snake_case)
- POST /v1/host-browser-result with camelCase keys to match the runtime schema
- Track attached CDP targets and skip re-attach; dispose clears the set
- Wrap postResult calls inside the catch handler so a secondary failure is logged
  instead of becoming an unhandled rejection

* fix(chrome-extension): invalidate attachedTargets cache on debugger detach

Subscribe to CdpProxy.onDetach in the dispatcher and remove the
corresponding key from the attached-targets cache when Chrome notifies
us of a detach (tab close, navigation, infobar cancel, external
takeover). Without this, the cache held a stale entry forever and
subsequent commands skipped the re-attach, causing permanent CDP
failures.
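
The cache invalidation described here can be sketched as follows. `FakeCdpProxy`, `DispatcherSketch`, and the string target keys are hypothetical stand-ins; only the idea of subscribing to an `onDetach` notification and evicting the cached target comes from the commit message.

```typescript
type DetachListener = (targetKey: string) => void;

// Hypothetical stand-in for the CDP proxy; records attach calls for inspection.
class FakeCdpProxy {
  readonly attachCalls: string[] = [];
  private readonly listeners: DetachListener[] = [];

  onDetach(listener: DetachListener): void {
    this.listeners.push(listener);
  }

  attach(targetKey: string): void {
    this.attachCalls.push(targetKey);
  }

  // Simulates Chrome reporting a detach (tab close, navigation, and so on).
  emitDetach(targetKey: string): void {
    for (const listener of this.listeners) listener(targetKey);
  }
}

class DispatcherSketch {
  private readonly attached = new Set<string>();
  private readonly proxy: FakeCdpProxy;

  constructor(proxy: FakeCdpProxy) {
    this.proxy = proxy;
    // Evict the cache entry as soon as the debugger detaches from the target.
    proxy.onDetach((targetKey) => this.attached.delete(targetKey));
  }

  ensureAttached(targetKey: string): void {
    if (this.attached.has(targetKey)) return; // Skip redundant re-attach.
    this.proxy.attach(targetKey);
    this.attached.add(targetKey);
  }
}
```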

* feat(runtime): add /v1/browser-extension-pair capability token endpoint (#24130)

* feat(runtime): add /v1/browser-extension-pair capability token endpoint

* fix(runtime): align pair endpoint with native helper contract + move secret out of workspace

- Accept extensionOrigin (preferred) and origin (legacy) in request body
- Return expiresAt as ISO 8601 string instead of numeric ms, matching what the
  chrome-extension-native-host helper validates
- Move capabilityTokenSecret out of workspace/data into protected storage alongside
  the actor-token-signing-key per AGENTS.md workspace-isolation rule
- Fix isLoopbackHostHeader to correctly parse IPv6 bracket notation

* fix(runtime): align pair allowlist with native helper + reject malformed bracketed Host headers

- ALLOWED_EXTENSION_ORIGINS now matches the chrome-extension-native-host
  placeholder so the dev pair flow works end-to-end
- parseHostHeader rejects inputs like '[::1]attacker.com' where content
  after the closing bracket is not an optional ':port'
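
As an illustration of the bracket check, here is a minimal Host-header parser sketch. `parseHostHeaderSketch` and `isLoopbackHostHeaderSketch` are hypothetical re-implementations written for this example; the accepted loopback names (`localhost`, `127.0.0.1`, `::1`) are an assumption, not the runtime's actual allowlist.

```typescript
interface ParsedHost {
  hostname: string;
  port?: number;
}

// Returns null for anything that is not "host", "host:port", "[v6]" or "[v6]:port".
function parseHostHeaderSketch(host: string): ParsedHost | null {
  const value = host.trim();
  if (value.startsWith('[')) {
    const close = value.indexOf(']');
    if (close === -1) return null;
    const hostname = value.slice(1, close);
    const rest = value.slice(close + 1);
    if (rest === '') return { hostname };
    // Anything after ']' must be exactly ":port"; '[::1]attacker.com' fails here.
    if (!/^:\d{1,5}$/.test(rest)) return null;
    return { hostname, port: Number(rest.slice(1)) };
  }
  const match = /^([^:]+)(?::(\d{1,5}))?$/.exec(value);
  if (!match) return null;
  return match[2] ? { hostname: match[1], port: Number(match[2]) } : { hostname: match[1] };
}

function isLoopbackHostHeaderSketch(host: string): boolean {
  const parsed = parseHostHeaderSketch(host);
  if (!parsed) return false;
  const hostname = parsed.hostname.toLowerCase();
  return hostname === 'localhost' || hostname === '127.0.0.1' || hostname === '::1';
}
```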

* feat(installer): write Chrome native messaging host manifest on macOS install (#24128)

* feat(installer): write Chrome native messaging host manifest on macOS install

* fix(build): parenthesize native-host staleness check

Bash || and && are equal-precedence left-to-right, so the unparenthesized
condition incorrectly required bun.lock to also be newer for a package.json
update to trigger a rebuild. Group the bun.lock subexpression explicitly.

* fix(installer): conform InstallError to LocalizedError so localizedDescription is useful

* feat(chrome-extension): bootstrap self-hosted capability token via native messaging (#24142)

* feat(chrome-extension): bootstrap self-hosted capability token via native messaging

* fix(chrome-extension): nativeMessaging permission, disconnect race, persistence fallback, popup->worker delegation

- Add nativeMessaging permission to manifest so Chrome actually allows
  chrome.runtime.connectNative('com.vellum.daemon')
- Set settled=true synchronously on token_response so a fast onDisconnect
  can't win the race and reject a valid pairing
- On chrome.storage.local.set failure, log and resolve with the in-memory
  token instead of discarding it (single-session fallback)
- Move the pair flow into the service worker via chrome.runtime.sendMessage
  so the popup teardown can't kill the awaited promise mid-flight

* feat(chrome-extension): connect to cloud gateway browser-relay WebSocket (#24143)

* feat(chrome-extension): connect to cloud gateway browser-relay WebSocket

* fix(chrome-extension): surface missing-token connect failures and ignore stale socket close events

- Worker now returns an actionable error when the selected relay mode has
  no usable token (cloud not signed in, self-hosted not paired)
- RelayConnection's close listener ignores events from superseded sockets
  so a setMode mid-flight does not nuke the new socket reference
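
The stale-close guard in the second bullet can be sketched with an identity check inside the close handler. `SocketLike` and `ConnectionSketch` are hypothetical; the real `RelayConnection` presumably does more (backoff, token refresh), which is omitted here.

```typescript
// Minimal socket shape: only the close hook matters for this illustration.
interface SocketLike {
  onclose: (() => void) | null;
}

class ConnectionSketch {
  reconnectsScheduled = 0;
  private socket: SocketLike | null = null;

  // Opens a new socket, superseding any previous one (as a mode switch would).
  open(create: () => SocketLike): SocketLike {
    const socket = create();
    this.socket = socket;
    socket.onclose = () => {
      // A close event from a superseded socket must not touch current state.
      if (this.socket !== socket) return;
      this.socket = null;
      this.reconnectsScheduled++;
    };
    return socket;
  }

  get hasSocket(): boolean {
    return this.socket !== null;
  }
}
```

Capturing the socket in the closure and comparing it against the current reference avoids needing generation counters or explicit listener removal.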

* test(host-browser): e2e smoke test for self-hosted native-messaging capability bootstrap (#24154)

* test(host-browser): e2e smoke test for cloud-hosted host_browser_request round-trip (#24153)

* test(host-browser): e2e smoke test for cloud-hosted host_browser_request round-trip

* test(host-browser): exercise actual timeout path and clarify mock WS header support

- Disconnected test renamed/restructured to use a never-resolving CDP handler
  plus a short timeout_seconds, so the proxy's setTimeout path is actually
  covered
- Removed/implemented extraHandshakeHeaders on the mock fixture so the
  advertised API matches reality

* test(cdp-proxy): add unit tests and fix sync targetToDebuggee throw (#24187)

* fix(chrome-extension): evict attached-target cache on CDP send failure (#24188)

* test(host-browser-e2e): rewrite header and convert test.skip to test.todo (#24190)

* test(host-bash-proxy): use bun:test fake timers for timeout regression test (#24189)

* fix(chrome-extension): popup pairing reply + relay-aware host_browser result POST (#24194)

* fix(chrome-extension-native-host): halt unauthorized origins and forward guardianId (#24192)

* fix(daemon): gate host tools by per-capability supportsHostProxy (#24195)

* chore(chrome-extension): typecheck worker.ts + popup.ts and use "assistant" terminology (#24199)

* fix(chrome-extension): popup connect handler honors selected relay mode (#24225)

* chore(chrome-extension): extend bun:test ambient shim with common symbols (#24226)

* fix(daemon): preserve host_browser for chrome-extension in per-capability tool gate (#24224)

* fix(chrome-extension): read live relay mode per request + defensive worker cleanups (#24227)

* chore(chrome-extension): remove stale cdp-proxy declarations and outdated comment (#24228)

* chore(chrome-extension-native-host): split writeFrameAndExit + rewrite history-narrating docstrings (#24229)

* chore(chrome-extension): tighten bun:test shim so only test.todo has optional callback (#24234)

* chore(daemon): rewrite host-tool gating test comment in forward-looking voice (#24233)

* chore(chrome-extension): dedupe RelayConnection.mode accessor (keep getCurrentMode) (#24235)

* fix(chrome-extension): worker reads live relay mode from storage on connect (#24236)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 participant