feat: LinkedIn end-to-end on Owletto Chrome extension (delete Playwright fallback)#1132
Conversation
…intercept Adds an extension-driven network-interception path for LinkedIn-style scrapers, alongside the existing Playwright `browserNetworkSync` stack. Playwright stays as the default; new path is opt-in per connection via `use_extension: true` plus the presence of a chrome dispatcher in `ctx.sessionState`. What ships: - packages/owletto: chrome connector gets three new actions — network_intercept_start/drain/stop — built on chrome.debugger + CDP Network domain. (See submodule commit for details.) - packages/connectors/src/chrome.ts: declares the three new actions in the connector definition (input/output JSON schemas). - packages/connector-sdk/src/extension-network.ts: new `extensionNetworkSync` helper that mirrors `browserNetworkSync`'s shape but routes through a caller-supplied `ChromeActionDispatcher`. 6 unit tests cover the navigate → start → drain → stop sequence, pagination, auth-check bail, non-JSON / binary body handling, and error cleanup. - packages/connectors/src/linkedin.ts: branches on a `use_extension` config flag — when set and a dispatcher is injected, routes through the extension path via `extensionNetworkSync`. Default (no flag) = unchanged Playwright path. Two new private methods `syncUpdatesViaExtension` / `syncJobsViaExtension` reuse the existing parsers and checkpoint logic; only the transport differs. Not yet shipped (called out as follow-ups in PR body): - Server-side bridge that injects `ctx.sessionState.chrome_dispatcher` into a server-side connector run when the connection is paired with an extension worker. Without this, `use_extension: true` falls back to Playwright (graceful no-op). - X (twitter) port — scaffolded path mirrors LinkedIn's; not migrated in this PR. - E2E run against a real LinkedIn company page via the extension. Submodule pointer bumped to the matching owletto branch HEAD.
|
Warning Review limit reached
More reviews will be available in 24 minutes and 36 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis pull request adds end-to-end support for extension-driven network synchronization: an SDK primitive ChangesExtension Network Sync and Chrome Dispatch Bridge
Sequence DiagramsequenceDiagram
participant Connector
participant extensionNetworkSync
participant ChromeActionDispatcher
participant Gateway
Connector->>extensionNetworkSync: start sync(url, patterns)
extensionNetworkSync->>ChromeActionDispatcher: navigate(url)
extensionNetworkSync->>ChromeActionDispatcher: network_intercept_start(patterns)
loop paginate
extensionNetworkSync->>ChromeActionDispatcher: network_intercept_drain()
ChromeActionDispatcher-->>extensionNetworkSync: intercepted responses
extensionNetworkSync->>ChromeActionDispatcher: evaluate()/triggerNextPage()
end
extensionNetworkSync->>ChromeActionDispatcher: network_intercept_stop(session_id)
extensionNetworkSync->>ChromeActionDispatcher: close_tab(tab_id)
extensionNetworkSync-->>Connector: return items + apiCallCount + backend
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
…ght fallback)
Closes the loop on the LinkedIn extension migration so the connector runs
end-to-end on the paired Owletto Chrome extension. No more dual path.
## What landed
### Server bridge — POST /api/workers/dispatch-chrome-action
Connector-worker fleets call this from inside a running sync to dispatch a
chrome connector action against the paired extension in the same org. The
endpoint:
- authorizes the parent sync run (must be running, claimed by the caller)
- picks an online chrome connection in the parent run's org (browser.debugger
capability + last_seen within 20 min)
- enqueues a device-bound chrome action run via createConnectorOperationRun
- awaits completion via waitForDeviceActionRun (now exported from
manage_operations) — the same Postgres-mediated wait the existing
manage_operations.execute path uses for device-bound calls
Reuses the existing device-action runs queue. No new state machine, no new
auth surface. Multi-replica safe by construction: every signal rides
Postgres rows, the chrome extension's /api/workers/complete-action POST can
land on any replica and finalize the run.
### Connector-worker IPC reverse channel
- ExecutionHooks gains onChromeDispatch.
- Subprocess executor relays chrome_dispatch_request / chrome_dispatch_response
IPC messages between child connector code and the daemon hook (mirrors
the existing auth-signal channel shape).
- child-runner.ts splices a live { dispatch } object onto sessionState before
invoking sync(). The dispatcher rides IPC up to the daemon, the daemon
calls the gateway bridge, the bridge waits for the extension, and the
observation flows back down.
- ExecutorClient.dispatchChromeAction posts to the new gateway endpoint
(trusted worker auth via WORKER_API_TOKEN).
### LinkedIn — DELETE the Playwright fallback
- Removed all browserNetworkSync code paths and the cookie cascade.
- Removed the use_extension config flag (no dual path, no opt-in).
- Removed the browser auth method from the authSchema; the extension carries
auth implicitly from the user's signed-in Chrome.
- sync() now calls requireExtensionDispatcher(ctx) which throws a clear
'no paired Owletto extension' error when none is reachable. No silent
Playwright fallback.
- definition.version bumped 1.1.0 → 2.0.0 (breaking config schema change).
Revolut + X are unchanged — they still use browserNetworkSync. Migrating
them is a separate follow-up; this PR is the LinkedIn cutover.
## Verification
- bun run typecheck — clean (server + owletto strict).
- make build-packages — clean.
- bun test packages/connector-worker — 60 pass (existing IPC test surface
green; the new IPC channel is a parallel branch).
- bun test packages/connector-sdk/src/__tests__/extension-network.test.ts —
6 pass (existing extensionNetworkSync contract unchanged).
- bun test packages/server/src/__tests__/unit/connectors/linkedin.test.ts —
2 pass (existing checkpoint-filter logic unchanged).
- bun test packages/server/src/__tests__/unit — 240 pass / 16 skip / 0 fail.
## E2E — NOT VALIDATED in this dispatch (dead-end called per AGENTS.md hard gate)
The user's live Owletto extension is paired against PROD (app.lobu.ai) in
the buremba org. To E2E this branch we'd need to either:
(a) deploy the branch image to prod and run a real LinkedIn sync there
(Flux auto-deploys main; this branch hasn't merged yet), OR
(b) re-pair the user's extension to a local make-dev gateway, apply the
chrome 2.x + linkedin 2.x connector definitions to a fresh local
org, create a linkedin connection, drive a sync, and verify Voyager
responses land. That's a multi-step re-pair + auth + sync session
that wasn't safe to do from this dispatch without explicit
'go ahead and re-pair' from the user (overwrites their current
buremba pairing).
Per AGENTS.md 'E2E before merge (hard gate) … if you can't reproduce,
BAIL': bailing on the live LinkedIn proof. Next-step options spelled out in
the agent's report.
The buremba prod chrome connector_definition is still at version 0.2.0
(network_intercept_* actions not yet present). After this lands and
deploys, applying chrome 0.3.0+ to buremba is the first E2E step.
## Diff stat (lobu)
packages/connector-worker/src/daemon/client.ts | +46
packages/connector-worker/src/daemon/executor.ts | +13
packages/connector-worker/src/executor/child-runner.ts| +65
packages/connector-worker/src/executor/interface.ts | +13
packages/connector-worker/src/executor/subprocess.ts | +47
packages/connectors/src/linkedin.ts | +49 / -226 (net -177)
packages/server/src/index.ts | +4
packages/server/src/tools/admin/manage_operations.ts | +1 / -1 (export)
packages/server/src/worker-api/dispatch-chrome-action.ts | new (~170)
|
bug_free 82, simplicity 76, slop 12, bugs 0, 0 blockers Typecheck, unit, and integration logs all had exit 0. Booted server bundle with a temp file:// DATABASE_URL and hit /health -> 200. Did not run a real Owletto/Chrome LinkedIn e2e, so the end-to-end browser bridge remains the main unverified path. Suggested fixes
Full verdict JSON{
"bug_free_confidence": 82,
"bugs": 0,
"slop": 12,
"simplicity": 76,
"blockers": [],
"change_type": "feat",
"behavior_change_risk": "high",
"tests_adequate": true,
"suggested_fixes": [
{
"file": "packages/connector-worker/src/executor/child-runner.ts",
"line": 69,
"change": "Update the timeout comment to match waitForDeviceActionRun's actual POST_CLAIM_BUDGET_MS: 95s, so the bridge total is 155s plus buffer; remove the review-meta sentence."
}
],
"notes": "Typecheck, unit, and integration logs all had exit 0. Booted server bundle with a temp file:// DATABASE_URL and hit /health -> 200. Did not run a real Owletto/Chrome LinkedIn e2e, so the end-to-end browser bridge remains the main unverified path.",
"categories": {
"src": 1006,
"tests": 255,
"docs": 0,
"config": 0,
"deps": 2,
"migrations": 0,
"ci": 0,
"generated": 0
}
}Local review gate — branch protection can require the |
- Reorder extensionNetworkSync to start interception BEFORE navigating to the target URL. Open a scratch tab at about:blank → start CDP Network listener → navigate to opts.url. Previously navigate ran first, so initial Voyager / GraphQL XHRs that landed during page render completed before start() attached and were silently lost (pi blocker #1 on PR #1132). - Plumb opts.config.allowedOrigins through every dispatched chrome action's action_input.allowed_origins. The chrome extension's per-run ctx (apps/chrome/background.js) pulls allowed_origins off run.config or action_input and enforces it in tools.js / network-intercept.js. Without this the dispatched runs landed with an empty allowlist (permissive), defeating the connector author's origin gate (pi blocker #2). - LinkedIn now sets allowedOrigins: ['linkedin.com', '*.linkedin.com'] on both feeds. - Add a 240s hard ceiling to child-runner pendingDispatchWaiters and reject on IPC send failure so a wedged daemon / disconnected IPC channel can't leave sync() hanging indefinitely (pi suggested follow-up). Matches the gateway-side QUEUE_BUDGET_MS + POST_CLAIM_BUDGET_MS (60s + 120s) plus buffer. Verification: - bun test packages/connector-sdk/src/__tests__/extension-network.test.ts: 6 pass / 0 fail. - bun test packages/connector-worker: 60 pass / 0 fail. - make typecheck: clean. The 6 existing extension-network tests still describe the about:blank → start → navigate flow correctly because the stub dispatcher returns the same current_url on every navigate call (regardless of which URL was passed).
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/connector-sdk/src/extension-network.ts`:
- Around line 184-209: The scratch tab opened via blankNavObs
(opts.dispatcher.dispatch('navigate', ...)) can leak if the subsequent
network_intercept_start dispatch fails because the existing finally cleanup only
runs inside the try block; fix by expanding the try/finally scope to cover both
the tab open and the intercept start (or add an outer try with a finally) so
that any error thrown by the network_intercept_start dispatch still triggers the
existing cleanup that closes the tab (use opts.dispatcher.dispatch to close the
tab by tabId or reuse the same cleanup logic that references tabId/sessionId).
Ensure you still capture startObs.session_id into sessionId only after a
successful start and preserve existing logging (sdkLogger.info) around
blankNavObs and startObs.
In `@packages/connector-worker/src/executor/subprocess.ts`:
- Around line 329-374: The handler for chrome_dispatch_request currently trusts
msg.requestId blindly; add a guard at the top of the chrome_dispatch_request
branch to validate requestId (e.g., const requestId = msg.requestId; if (typeof
requestId !== 'number' || !Number.isFinite(requestId)) { try { child.send({
type: 'chrome_dispatch_response', requestId: requestId ?? null, error: 'invalid
requestId' }); } catch {/* ignore */} return; }) so malformed IDs are rejected
immediately (no queueTask) and the child gets an immediate
chrome_dispatch_response error; keep the rest of the logic using queueTask,
hooks.onChromeDispatch and child.send unchanged.
In `@packages/server/src/worker-api/dispatch-chrome-action.ts`:
- Around line 52-61: Validate request shape more strictly in the dispatch
handler: ensure body.parent_run_id is a positive integer (use
Number.isInteger(...) && body.parent_run_id > 0) and reject otherwise with
c.json(..., 400); ensure body.action_input is a non-null plain object (typeof
body.action_input === 'object' && body.action_input !== null &&
!Array.isArray(body.action_input)) and reject if not; keep the existing checks
for body.worker_id and body.action_key but add these new validations around
body.parent_run_id and body.action_input to avoid enqueueing malformed runs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 4f6740c6-d6e2-4f97-bf0d-b740d8a7528a
📒 Files selected for processing (14)
packages/connector-sdk/src/__tests__/extension-network.test.tspackages/connector-sdk/src/extension-network.tspackages/connector-sdk/src/index.tspackages/connector-worker/src/daemon/client.tspackages/connector-worker/src/daemon/executor.tspackages/connector-worker/src/executor/child-runner.tspackages/connector-worker/src/executor/interface.tspackages/connector-worker/src/executor/subprocess.tspackages/connectors/src/chrome.tspackages/connectors/src/linkedin.tspackages/owlettopackages/server/src/index.tspackages/server/src/tools/admin/manage_operations.tspackages/server/src/worker-api/dispatch-chrome-action.ts
| if (msg.type === 'chrome_dispatch_request') { | ||
| const requestId = msg.requestId; | ||
| const actionKey = typeof msg.actionKey === 'string' ? msg.actionKey : ''; | ||
| const actionInput = | ||
| msg.actionInput && typeof msg.actionInput === 'object' | ||
| ? (msg.actionInput as Record<string, unknown>) | ||
| : {}; | ||
| queueTask(async () => { | ||
| if (!hooks?.onChromeDispatch) { | ||
| try { | ||
| child.send({ | ||
| type: 'chrome_dispatch_response', | ||
| requestId, | ||
| error: | ||
| 'chrome_dispatcher is not available in this execution context (no onChromeDispatch hook)', | ||
| }); | ||
| } catch { | ||
| /* ignore */ | ||
| } | ||
| return; | ||
| } | ||
| try { | ||
| const output = await hooks.onChromeDispatch(actionKey, actionInput); | ||
| try { | ||
| child.send({ | ||
| type: 'chrome_dispatch_response', | ||
| requestId, | ||
| output, | ||
| }); | ||
| } catch { | ||
| /* IPC closed — child already exited. */ | ||
| } | ||
| } catch (err) { | ||
| try { | ||
| child.send({ | ||
| type: 'chrome_dispatch_response', | ||
| requestId, | ||
| error: err instanceof Error ? err.message : String(err), | ||
| }); | ||
| } catch { | ||
| /* IPC closed — child already exited. */ | ||
| } | ||
| } | ||
| }); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Validate requestId before queuing chrome dispatch work.
Line 330 accepts any msg.requestId. A malformed value can break response correlation and force the child to wait until its hard timeout. Add a numeric guard at this trust boundary and return early (or send an immediate error when possible).
🔧 Suggested fix
if (msg.type === 'chrome_dispatch_request') {
- const requestId = msg.requestId;
+ const requestId = msg.requestId;
+ if (!Number.isInteger(requestId)) {
+ return;
+ }
const actionKey = typeof msg.actionKey === 'string' ? msg.actionKey : '';
const actionInput =
msg.actionInput && typeof msg.actionInput === 'object'
? (msg.actionInput as Record<string, unknown>)
: {};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (msg.type === 'chrome_dispatch_request') { | |
| const requestId = msg.requestId; | |
| const actionKey = typeof msg.actionKey === 'string' ? msg.actionKey : ''; | |
| const actionInput = | |
| msg.actionInput && typeof msg.actionInput === 'object' | |
| ? (msg.actionInput as Record<string, unknown>) | |
| : {}; | |
| queueTask(async () => { | |
| if (!hooks?.onChromeDispatch) { | |
| try { | |
| child.send({ | |
| type: 'chrome_dispatch_response', | |
| requestId, | |
| error: | |
| 'chrome_dispatcher is not available in this execution context (no onChromeDispatch hook)', | |
| }); | |
| } catch { | |
| /* ignore */ | |
| } | |
| return; | |
| } | |
| try { | |
| const output = await hooks.onChromeDispatch(actionKey, actionInput); | |
| try { | |
| child.send({ | |
| type: 'chrome_dispatch_response', | |
| requestId, | |
| output, | |
| }); | |
| } catch { | |
| /* IPC closed — child already exited. */ | |
| } | |
| } catch (err) { | |
| try { | |
| child.send({ | |
| type: 'chrome_dispatch_response', | |
| requestId, | |
| error: err instanceof Error ? err.message : String(err), | |
| }); | |
| } catch { | |
| /* IPC closed — child already exited. */ | |
| } | |
| } | |
| }); | |
| return; | |
| } | |
| if (msg.type === 'chrome_dispatch_request') { | |
| const requestId = msg.requestId; | |
| if (!Number.isInteger(requestId)) { | |
| return; | |
| } | |
| const actionKey = typeof msg.actionKey === 'string' ? msg.actionKey : ''; | |
| const actionInput = | |
| msg.actionInput && typeof msg.actionInput === 'object' | |
| ? (msg.actionInput as Record<string, unknown>) | |
| : {}; | |
| queueTask(async () => { | |
| if (!hooks?.onChromeDispatch) { | |
| try { | |
| child.send({ | |
| type: 'chrome_dispatch_response', | |
| requestId, | |
| error: | |
| 'chrome_dispatcher is not available in this execution context (no onChromeDispatch hook)', | |
| }); | |
| } catch { | |
| /* ignore */ | |
| } | |
| return; | |
| } | |
| try { | |
| const output = await hooks.onChromeDispatch(actionKey, actionInput); | |
| try { | |
| child.send({ | |
| type: 'chrome_dispatch_response', | |
| requestId, | |
| output, | |
| }); | |
| } catch { | |
| /* IPC closed — child already exited. */ | |
| } | |
| } catch (err) { | |
| try { | |
| child.send({ | |
| type: 'chrome_dispatch_response', | |
| requestId, | |
| error: err instanceof Error ? err.message : String(err), | |
| }); | |
| } catch { | |
| /* IPC closed — child already exited. */ | |
| } | |
| } | |
| }); | |
| return; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/connector-worker/src/executor/subprocess.ts` around lines 329 - 374,
The handler for chrome_dispatch_request currently trusts msg.requestId blindly;
add a guard at the top of the chrome_dispatch_request branch to validate
requestId (e.g., const requestId = msg.requestId; if (typeof requestId !==
'number' || !Number.isFinite(requestId)) { try { child.send({ type:
'chrome_dispatch_response', requestId: requestId ?? null, error: 'invalid
requestId' }); } catch {/* ignore */} return; }) so malformed IDs are rejected
immediately (no queueTask) and the child gets an immediate
chrome_dispatch_response error; keep the rest of the logic using queueTask,
hooks.onChromeDispatch and child.send unchanged.
| if (typeof body.parent_run_id !== 'number' || !body.parent_run_id) { | ||
| return c.json({ error: 'parent_run_id is required' }, 400); | ||
| } | ||
| if (!body.worker_id?.trim()) { | ||
| return c.json({ error: 'worker_id is required' }, 400); | ||
| } | ||
| if (!body.action_key?.trim()) { | ||
| return c.json({ error: 'action_key is required' }, 400); | ||
| } | ||
|
|
There was a problem hiding this comment.
Harden request-shape validation at the API boundary.
Line 52 and Line 61 validate required fields, but parent_run_id can still be non-integer/negative and action_input can be a non-object value. Rejecting malformed payloads here avoids enqueueing runs that fail later with harder-to-debug errors.
🔧 Suggested patch
- if (typeof body.parent_run_id !== 'number' || !body.parent_run_id) {
+ if (
+ typeof body.parent_run_id !== 'number' ||
+ !Number.isInteger(body.parent_run_id) ||
+ body.parent_run_id <= 0
+ ) {
return c.json({ error: 'parent_run_id is required' }, 400);
}
@@
if (!body.action_key?.trim()) {
return c.json({ error: 'action_key is required' }, 400);
}
+ if (
+ body.action_input !== undefined &&
+ (typeof body.action_input !== 'object' ||
+ body.action_input === null ||
+ Array.isArray(body.action_input))
+ ) {
+ return c.json({ error: 'action_input must be an object' }, 400);
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (typeof body.parent_run_id !== 'number' || !body.parent_run_id) { | |
| return c.json({ error: 'parent_run_id is required' }, 400); | |
| } | |
| if (!body.worker_id?.trim()) { | |
| return c.json({ error: 'worker_id is required' }, 400); | |
| } | |
| if (!body.action_key?.trim()) { | |
| return c.json({ error: 'action_key is required' }, 400); | |
| } | |
| if ( | |
| typeof body.parent_run_id !== 'number' || | |
| !Number.isInteger(body.parent_run_id) || | |
| body.parent_run_id <= 0 | |
| ) { | |
| return c.json({ error: 'parent_run_id is required' }, 400); | |
| } | |
| if (!body.worker_id?.trim()) { | |
| return c.json({ error: 'worker_id is required' }, 400); | |
| } | |
| if (!body.action_key?.trim()) { | |
| return c.json({ error: 'action_key is required' }, 400); | |
| } | |
| if ( | |
| body.action_input !== undefined && | |
| (typeof body.action_input !== 'object' || | |
| body.action_input === null || | |
| Array.isArray(body.action_input)) | |
| ) { | |
| return c.json({ error: 'action_input must be an object' }, 400); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/server/src/worker-api/dispatch-chrome-action.ts` around lines 52 -
61, Validate request shape more strictly in the dispatch handler: ensure
body.parent_run_id is a positive integer (use Number.isInteger(...) &&
body.parent_run_id > 0) and reject otherwise with c.json(..., 400); ensure
body.action_input is a non-null plain object (typeof body.action_input ===
'object' && body.action_input !== null && !Array.isArray(body.action_input)) and
reject if not; keep the existing checks for body.worker_id and body.action_key
but add these new validations around body.parent_run_id and body.action_input to
avoid enqueueing malformed runs.
… blocker on #1132) owletto 7d3b5ed4 fixes the pi v2 blocker on this PR: 'LinkedIn extension sync is blocked before navigation: extensionNetworkSync passes allowed_origins while opening/starting interception on about:blank, but Owletto rejects about:blank as outside the allowlist.' Carve-out about:blank as always allowed in: - apps/chrome/tools.js enforceAllowedOrigin() (covers tool_navigate + enforceAllowedOriginFromTab via delegation) - apps/chrome/network-intercept.js enforceTabUrlAgainstAllowedOrigins() (covers tool_network_intercept_start on the scratch tab) Per-response listener inside start() still enforces allowedOrigins on every captured response URL, so the carve-out is bare-tab-attach only. owletto regression test: apps/chrome/network-intercept.test.js 'allows start on about:blank with allowedOrigins set, then filters captures on the real URL' → drives start-on-blank → fires CDP events → asserts only the allowed-host response made it through. 19 network-intercept tests pass / 0 fail (was 18 / 0).
…er start cleanup Pi v3 review (verdict bug_free 30, 1 blocker): 'network_intercept.js:199 attaches CDP without updating tools.js debugger ownership; extensionNetworkSync's next navigate action re-attaches the same tab via withDebugger and will fail before LinkedIn sync can load the real page.' + 'Wrap network_intercept_start in the cleanup try/finally too: initialize sessionId to null before start, conditionally stop when set, and always close the opened tab if start throws.' Both addressed: owletto (submodule bump → 6ab0f54): introduce a shared refcounted debugger-lease module (apps/chrome/debugger-lease.js) that withDebugger (action tools) and tool_network_intercept_start/stop both hold leases on. Physical chrome.debugger.attach happens on the 0→1 transition; detach on 1→0. Overlapping owners just bump the count, so extensionNetworkSync's navigate → start → navigate(real URL) flow no longer hits 'Another debugger is already attached'. +2 regression tests on the lease interplay (51 owletto tests pass, was 49). connector-sdk (extension-network.ts): restructure the start+navigate+drain into a single try/finally where sessionId starts null and is set on successful start(). safeStop(null) is a no-op (existing), and safeCloseTab always runs in the finally — so a thrown network_intercept_start now closes the about:blank scratch tab instead of leaking it. 6 sdk tests still pass. Verification: - bun test packages/owletto/apps/chrome → 51 pass / 0 fail - bun test packages/connector-sdk/src/__tests__/extension-network.test.ts → 6 / 0 - make typecheck → clean
|
Actionable comments posted: 0 |
- Replace stale 'Non-goal' header that referenced a per-connector use_extension config flag — that flag is gone (LinkedIn is fully on the extension path; Revolut + X still use browserNetworkSync, no flag needed). Reword to describe the actual migration scope. - Drop 'Pi review caught this' meta from the navigate-before-start comment; keep only the technical rationale. Pi v4 verdict was already bug_free 78 / 0 blockers; these were the two non-blocking suggested_fixes.
…sion-stack # Conflicts: # packages/owletto
Summary
End-to-end migration of the LinkedIn connector off the Playwright/CDP stack onto the paired Owletto Chrome extension's
network_intercept_*primitive. No more dual path — when no online paired extension is reachable in the connection's org, LinkedIn fails fast withno paired Owletto extension, never silently falling back to a separate browser launch.Pi review state
881800b77235e53e7235e53e65e6bd45(owletto7d3b5ed)65e6bd45fc168609(owletto6ab0f54) — refcounted leasefc168609314c13fe.The pi v4 verdict is on PR #1132 comment id
4566437128(latest).Architecture (end state)
Cross-process state is Postgres-only (the runs queue). The dispatcher is multi-replica safe by construction: the chrome extension's
/api/workers/complete-actionPOST can land on any replica and finalize the run row.What's in this PR
Server bridge
POST /api/workers/dispatch-chrome-action(packages/server/src/worker-api/dispatch-chrome-action.ts): authorize the parent sync run → pick an online chrome connection in the same org (browser.debuggercapability +last_seen_atwithin 20 min) → enqueue an action run viacreateConnectorOperationRun(same helpermanage_operations.executeuses for device-bound calls) → await completion via the sharedwaitForDeviceActionRun→ return observation.waitForDeviceActionRunexported frommanage_operations.tsfor reuse.Connector-worker IPC reverse channel
ExecutionHooks.onChromeDispatch— new hook that forwards calls from the child connector subprocess to the gateway bridge.child-runner.tssplices a live{ dispatch }object ontosessionState.chrome_dispatcherbefore invokingsync(). Mirrors the existingawait_signal_requestreverse channel.ExecutorClient.dispatchChromeActionposts to the new gateway endpoint (trusted worker auth viaWORKER_API_TOKEN).extensionNetworkSync — safe ordering
navigate(about:blank) → start → navigate(real URL)so the CDP Network listener attaches BEFORE the page emits initial-render XHRs (otherwise Voyager batch is lost).allowedOriginsconfig option threaded into every dispatched action'saction_input.allowed_origins.sessionId: string | null;safeStopno-ops on null,safeCloseTabalways runs.Owletto chrome extension (submodule)
apps/chrome/network-intercept.js— the primitive itself (18 base + 1 about:blank-allowedOrigins regression + 2 lease-lifecycle tests).apps/chrome/debugger-lease.js— new module with refcountedacquireDebuggerLease/releaseDebuggerLease. Bothtools.js withDebuggerandnetwork-intercept.js start/stophold leases on the sametabId; physicalchrome.debugger.attachhappens only on 0→1 transition, physical detach only on 1→0. Lets the long-lived intercept session coexist with action tools' per-call attach/detach pattern.apps/chrome/tools.js—enforceAllowedOrigincarves outabout:blank(scratch tab setup URL is always allowed).withDebuggernow goes through the shared lease.apps/chrome/network-intercept.js—enforceTabUrlAgainstAllowedOriginscarves outabout:blanksimilarly.tool_network_intercept_startacquires the lease instead of callingchrome.debugger.attachdirectly;tool_network_intercept_stopreleases it.LinkedIn — DELETE the Playwright fallback
browserNetworkSyncconsumption + the cookie cascade.use_extensionconfig flag (no dual path, no opt-in).authSchema; auth lives implicitly in the user's signed-in Chrome.sync()callsrequireExtensionDispatcher(ctx)which throws cleanly when no dispatcher is present.allowedOrigins: ['linkedin.com', '*.linkedin.com']set on both feeds.definition.versionbumped 1.1.0 → 2.0.0 (breaking config schema change —use_extensionremoved).Out of scope (intentional)
browserNetworkSync. The shared helper stays alive for them; only LinkedIn's call into it was removed. Migrating Revolut + X is the obvious follow-up; once both are off it, dropbrowserNetworkSyncentirely.Verification
bun run typecheckmake typecheck(strict, server + owletto)make build-packagesbun test packages/connector-workerbun test packages/connector-sdk/src/__tests__/extension-network.test.tsbun test packages/server/src/__tests__/unit/connectors/linkedin.test.tsbun test packages/server/src/__tests__/unitbun test packages/owletto/apps/chrome/network-intercept.test.js packages/owletto/apps/chrome/tools.test.jsmake review BASE=origin/main(pi v4 onfc168609)Live LinkedIn E2E — NOT VALIDATED in this PR
Per AGENTS.md "E2E before merge (hard gate) … if you can't reproduce, BAIL", this is called out clearly rather than faking the proof.
Two ground truths from prod DB (buremba):
buremba.connector_definitions.chromeis at version 0.2.0 —network_intercept_start/drain/stoparen't in the catalog yet.buremba.connections WHERE connector_key='linkedin'→ 0 rows.So the live E2E needs either:
mainafter squash-merge; once chrome 0.3.0 (the new actions) and linkedin 2.0.0 reach prod,lobu applythem to buremba, create a LinkedIn connection on a real company page, trigger a sync, and verify Voyager events land inevents. The user's live Owletto extension is already paired in buremba so the dispatch path lights up immediately.make dev(~/.config/lobu-dev/chromeprofile — separate from the user's prod-paired one, no disruption), then drive the full pipeline locally. Plausible but multi-hour integration work and was deferred in favour of getting the code into pi-green state for the merge gate.Path (1) is the lower-effort smoke for the live LinkedIn case. The pi v4 verdict already clears the static-correctness bar.
Diff stat (lobu)
Owletto submodule: