fix(server): server-side agent must skip device-pinned watcher runs by buremba · Pull Request #808 · lobu-ai/lobu

buremba · 2026-05-17T04:01:59Z

Summary

Closes #802. Prereq for #798.

claimWatcherRun in packages/server/src/watchers/automation.ts previously grabbed every pending run_type='watcher' row, ignoring whether the run was pinned to a specific device worker. Once the dispatcher in #798 starts setting approved_input.device_worker_id, the user's Mac (polling /api/workers/poll, which already honors device pinning) will race the in-process server-side agent for the same row.

This is the exact failure mode behind the historical "owletto-worker daemon claims run_type='watcher' and silently marks success" bug — different surface, same shape (two claimants on one row, one of them eats it without doing the work).

Change

Add a JSONB-level guard inside the WITH next_run AS (...) CTE so the server-side dispatcher's FOR UPDATE SKIP LOCKED selection ignores any run whose approved_input->>'device_worker_id' is non-null and non-empty. The pin lives in approved_input JSONB for now; issue #799 will add a proper column, at which point this guard plus a column predicate is a single-line follow-up.

No schema change. /api/workers/poll is untouched (already has device pinning).

Test plan

New test in automation-contract.test.ts: a watcher run with approved_input.device_worker_id = 'mac-device-abc' stays in pending (with claimed_by/claimed_at still NULL) after both the global dispatchPendingWatcherRuns({} as Env, { db }) scan AND the explicit runIds: [queued.runId] branch.
make typecheck — clean for my files; the only failure is the pre-existing @electric-sql/pglite-postgis module-not-found in pglite-backend.ts, which exists on origin/main unchanged.
Integration test run gated on Postgres in CI.

Summary by CodeRabbit

Bug Fixes
- Improved watcher run scheduling to prevent race conditions when runs are assigned to device workers, ensuring proper coordination between dispatcher and device worker assignments.
Tests
- Added test coverage for scheduled watcher runs pinned to device workers.

The server-side watcher claim query (claimWatcherRun in packages/server/src/watchers/automation.ts) previously grabbed every pending run_type='watcher' row regardless of pinning. Once #798's dispatcher starts setting approved_input.device_worker_id, the user's Mac (or other device worker polling /api/workers/poll) will race the server-side agent for the same row — the exact failure mode that masked the watcher-run silent-success bug in the past. Add a JSONB-level guard to the CTE so rows with a non-empty device_worker_id are invisible to the server-side claim path. The pin currently lives in approved_input (issue #799 will add a proper column); the guard handles both NULL and empty-string explicitly so the existing test fixtures (no device_worker_id) keep matching. Test: a watcher run with approved_input.device_worker_id set stays pending after dispatchPendingWatcherRuns runs, both the global-scan and explicit-runIds branches. Closes #802 Prereq for #798

coderabbitai · 2026-05-17T04:02:10Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 739aad0b-d880-4c1f-8889-9c9509c246a6

📥 Commits

Reviewing files that changed from the base of the PR and between 32920c8 and b6d55c1.

📒 Files selected for processing (2)

packages/server/src/__tests__/integration/watchers/automation-contract.test.ts
packages/server/src/watchers/automation.ts

📝 Walkthrough

Walkthrough

The PR implements server-side filtering to prevent the dispatcher from claiming watcher runs pinned to a device worker. The claimWatcherRun query now excludes pending runs where approved_input.device_worker_id is set, and an integration test verifies both default and explicit dispatch paths skip such runs.

Changes

Device-pinned watcher run exclusion filter

Layer / File(s)	Summary
Device-pinned run exclusion filter and validation `packages/server/src/watchers/automation.ts`, `packages/server/src/__tests__/integration/watchers/automation-contract.test.ts`	`claimWatcherRun` SQL query filters out pending runs with `approved_input.device_worker_id` set (not NULL, not empty); integration test covers both default and explicit `runIds` dispatch paths, asserting claimed/dispatched counts remain 0 and run status/provenance stay unchanged.

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly Related Issues

#798 — Device-pinning epic for watchers; this PR implements the required server-side exclusion filter (WHERE clause on approved_input.device_worker_id) that prevents the server dispatcher from racing device workers for the same run.

Poem

A rabbit hops through the watcher flow,
Checking if a device claims the show,
"Skip this one!" the filter does say,
Let the worker have its way. 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title clearly and concisely describes the main change: adding a guard to skip device-pinned watcher runs in the server-side agent.
Description check	✅ Passed	The PR description includes all required sections: Summary (with linked issues), detailed Change explanation, and Test plan with checkmarks showing completion status.
Linked Issues check	✅ Passed	The changes fully address issue `#802`'s requirement: a JSONB-level guard was added to skip runs where approved_input->>'device_worker_id' is non-null/non-empty, and an integration test verifies this behavior for both dispatch paths.
Out of Scope Changes check	✅ Passed	All changes are in scope: the automation.ts fix implements the required device-pinning filter, and the test file validates the exact behavior specified in issue `#802`.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/server-skip-device-pinned-watchers

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov-commenter · 2026-05-17T04:03:27Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b6d55c1753

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-17T04:03:36Z

+            r.approved_input->>'device_worker_id' IS NULL
+            OR r.approved_input->>'device_worker_id' = ''


Keep a claimant for device-pinned watcher runs

When a watcher run has approved_input.device_worker_id, this predicate makes dispatchPendingWatcherRuns skip it, but the worker polling path does not currently claim watcher runs: in packages/server/src/worker-api.ts the /api/workers/poll SQL allow-list is r.run_type IN ('sync', 'action', 'embed_backfill', 'auth'). In the device-pinned watcher scenario described by this change, the row therefore remains pending indefinitely instead of being picked up by the device worker; add the watcher lane to the device claim path (with the same pin/scope checks) before excluding it here.

Useful? React with 👍 / 👎.

* feat(watchers): device-pinned watcher runs end-to-end (#798 PR-1) The schema work in #811 added watchers.device_worker_id + agent_kind + notification + cooldown columns, and #808 made the server-side dispatcher skip runs already pinned to a device. This PR wires the pinning + claim + completion sides so a user's Lobu Mac app can actually execute the run via a local CLI agent (Claude Code, etc.). Server-side: 1. `materializeDueWatcherRuns` now reads `watchers.device_worker_id` and `watchers.agent_kind` and persists both into the run's `approved_input` JSONB. The existing #802 dispatcher exclusion keys off `approved_input->>'device_worker_id'`, so device-pinned rows stay `pending` on the server side. 2. `/api/workers/poll` gains a parallel CTE branch for `run_type='watcher'` AND `approved_input->>'device_worker_id' = <this device's id>`. The poll response is short-circuited for watchers — no connector code, no credentials, no compiled_code lookup. It returns a watcher payload envelope `{ watcher, event, context }` that the device-side dispatcher uses to build a CLI prompt. 3. New endpoint `POST /api/workers/me/runs/:runId/complete-watcher`: authorizes via the existing claim-ownership gate (`authorizeRunForWorker`), writes a `watcher_windows` row with `model_used='device-cli'` and `extracted_data.kind='device_cli_output'`, updates `runs.status` to completed/failed, advances `watchers.last_fired_at`, and emits a `watcher:updated` lifecycle event so dashboard metric_series picks up the run. 4. The new path is whitelisted in the user-scoped worker route gate (was previously 403 for `/api/workers/me/...` outside auth-profiles/feeds). Tests: - materializeDueWatcherRuns persists device_worker_id + agent_kind. - End-to-end complete-watcher → runs.completed, watcher_windows row with the CLI output, last_fired_at advanced. - Failure path: error supplied → runs.failed, no window row. - 409 for non-watcher run types, 404 for unknown run ids. PR-2 (Owletto Mac app dispatcher + ClaudeCodeExecutor) consumes this contract. * fix(watchers): close pi review on device-side complete-watcher (#814) 5 issues from pi's review of #814's WatcherDispatcher / completeWatcherRun that all materialise on real device traffic: 1. (BLOCKER) Schedule advancement mismatch — completing a device watcher moved `last_fired_at` forward but never bumped `next_run_at`, so the scheduler tick re-materialised the same watcher every minute forever. Extract `advanceWatcherSchedule(sql|tx, watcherId)` from automation.ts (was `advanceWatcherScheduleAfterTerminalFailure`) and call it from both manage_watchers(action="complete_window") and the device complete-watcher endpoint after the in-transaction completion writes. 2. Device-identity binding — `claimed_by === body.worker_id` alone is spoofable across devices that share a user OAuth token: any other device with the token could claim a run under any worker_id, then a third caller with the same token could complete it. For user-scoped workers we now resolve the caller's `device_workers.id` from `(workerUserId, body.worker_id)` and require it to equal `approved_input.device_worker_id` (which materializeDueWatcherRuns already snapshotted from the watcher pin). Mismatch → 403. 3. Completion race — two concurrent POSTs both passed the unlocked `authorizeRunForWorker` status read, both opened a tx, both INSERTed a watcher_windows row; the loser's run-UPDATE saw 0 rows and the tx committed a phantom window. Now lock `SELECT ... FOR UPDATE` on the run row at the top of the tx; if status is no longer 'running', return 200 idempotently with `{idempotent: true}`. 4. Window-id allocation — drop the inline `COALESCE(MAX(id), 0) + 1` in favour of the codebase's shared `getNextNumericId(tx, 'watcher_windows')` helper (whitelisted, same pattern used by manage_watchers). 5. Malformed payload no longer stranded the run — validation that throws inside the tx rolls back to `running`, and there's no stale-run sweep today. Validate `approved_input.window_start` / `window_end` BEFORE the transaction; on bad input mark the run `failed` (so next_run_at advances normally) and return 400. Tests: three integration tests covering #1, #3, and #5. The user-scoped spoof test (#2) needs OAuth device-flow setup that isn't wired up in the current test fixtures — leaving it for a follow-up that lands the helper alongside other user-scoped worker tests. * fix(watchers): pi round-2 — bind workerId from auth, race-free id alloc, no double-advance A) Bound-worker check now reads `c.var.mcpAuthInfo?.workerId` (PAT/OAuth token binding), not body.worker_id. Same-user attacker can't complete as another registered worker by lying in the payload. B) `getNextNumericId` takes a per-table `pg_advisory_xact_lock` (`hashtext('<table>_id_alloc')`). Inside `completeWatcherRun`'s tx the lock is held until commit, so two concurrent completions on different watcher runs serialize on allocation instead of racing on MAX(id)+1. C) Validation-failure path only advances the schedule when the `UPDATE … WHERE status='running'` actually matched a row (RETURNING-gated). Two concurrent malformed POSTs no longer double-tick `next_run_at`. Tests: - device spoof (Fix A): token bound to worker A, run pinned to worker B, body posts worker-B — asserts 403 and run stays running. - concurrent allocation (Fix B): two completions on different watchers fired in parallel — both 200, distinct watcher_windows.id. - double-advance (Fix C): two malformed POSTs against the same run — next_run_at advances only once.

chatgpt-codex-connector Bot reviewed May 17, 2026

View reviewed changes

buremba merged commit a88d840 into main May 17, 2026
23 of 24 checks passed

buremba deleted the fix/server-skip-device-pinned-watchers branch May 17, 2026 04:05

buremba mentioned this pull request May 17, 2026

chore(main): release lobu 7.1.0 #724

Merged

coderabbitai Bot mentioned this pull request May 17, 2026

feat(schema): watchers — device_worker_id, agent_kind, notification, cooldown columns #811

Merged

buremba mentioned this pull request May 17, 2026

feat(watchers): device-pinned watcher runs end-to-end (#798 PR-1) #814

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(server): server-side agent must skip device-pinned watcher runs#808

fix(server): server-side agent must skip device-pinned watcher runs#808
buremba merged 1 commit into
mainfrom
fix/server-skip-device-pinned-watchers

buremba commented May 17, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 17, 2026 •

edited

Loading

Walkthrough

Changes

Estimated Code Review Effort

Possibly Related Issues

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov-commenter commented May 17, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		r.approved_input->>'device_worker_id' IS NULL
		OR r.approved_input->>'device_worker_id' = ''

Conversation

buremba commented May 17, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Change

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated Code Review Effort

Possibly Related Issues

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov-commenter commented May 17, 2026

Codecov Report

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

buremba commented May 17, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 17, 2026 •

edited

Loading