feat(schema): watchers — device_worker_id, agent_kind, notification, cooldown columns#811
Conversation
…cooldown columns Adds the columns needed for the device-aware watcher dispatcher (#798): * watchers.device_worker_id (uuid → device_workers.id) * watchers.agent_kind (text, nullable) * watchers.notification_channel (text NOT NULL DEFAULT 'canvas', CHECK in (canvas, notification, both)) * watchers.notification_priority (text NOT NULL DEFAULT 'normal', CHECK in (low, normal, high)) * watchers.min_cooldown_seconds (integer NOT NULL DEFAULT 0, CHECK >= 0) * watchers.last_fired_at (timestamptz, nullable) * device_workers.notification_budget_per_day (integer NOT NULL DEFAULT 10, CHECK >= 0) for the global per-device interrupt budget * idx_watchers_device_worker_id partial index Ships as both a dbmate migration (db/migrations/20260517060000_*) and the matching idempotent embedded-schema patch (watcher-schema-additions) so fresh PGlite installs and already-initialised embedded databases converge to the same shape. manage_watchers accepts and persists the new fields, with enum/non-negative validation enforced via TypeBox at the tool layer. Closes #799
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughThis PR extends the watchers table with device-worker routing and notification configuration. It adds a ChangesWatcher notification and device-worker routing
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98378bddd4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| device_worker_id: Type.Optional( | ||
| Type.Union([Type.String(), Type.Null()], { | ||
| description: | ||
| '[create/update] Optional device worker UUID to pin this watcher to (when its inputs live on that device). Null clears the pin.', |
There was a problem hiding this comment.
Propagate watcher device pins into queued runs
When callers set device_worker_id on a watcher, the pin is only stored on watchers; createWatcherRun() still builds runs.approved_input without that field, while the server dispatcher only skips device-bound watcher runs by checking approved_input->>'device_worker_id'. As a result, scheduled or manually triggered runs for a pinned watcher are still claimed and executed by the server-side dispatcher instead of being left for the user's device worker.
Useful? React with 👍 / 👎.
* feat(watchers): device-pinned watcher runs end-to-end (#798 PR-1) The schema work in #811 added watchers.device_worker_id + agent_kind + notification + cooldown columns, and #808 made the server-side dispatcher skip runs already pinned to a device. This PR wires the pinning + claim + completion sides so a user's Lobu Mac app can actually execute the run via a local CLI agent (Claude Code, etc.). Server-side: 1. `materializeDueWatcherRuns` now reads `watchers.device_worker_id` and `watchers.agent_kind` and persists both into the run's `approved_input` JSONB. The existing #802 dispatcher exclusion keys off `approved_input->>'device_worker_id'`, so device-pinned rows stay `pending` on the server side. 2. `/api/workers/poll` gains a parallel CTE branch for `run_type='watcher'` AND `approved_input->>'device_worker_id' = <this device's id>`. The poll response is short-circuited for watchers — no connector code, no credentials, no compiled_code lookup. It returns a watcher payload envelope `{ watcher, event, context }` that the device-side dispatcher uses to build a CLI prompt. 3. New endpoint `POST /api/workers/me/runs/:runId/complete-watcher`: authorizes via the existing claim-ownership gate (`authorizeRunForWorker`), writes a `watcher_windows` row with `model_used='device-cli'` and `extracted_data.kind='device_cli_output'`, updates `runs.status` to completed/failed, advances `watchers.last_fired_at`, and emits a `watcher:updated` lifecycle event so dashboard metric_series picks up the run. 4. The new path is whitelisted in the user-scoped worker route gate (was previously 403 for `/api/workers/me/...` outside auth-profiles/feeds). Tests: - materializeDueWatcherRuns persists device_worker_id + agent_kind. - End-to-end complete-watcher → runs.completed, watcher_windows row with the CLI output, last_fired_at advanced. - Failure path: error supplied → runs.failed, no window row. - 409 for non-watcher run types, 404 for unknown run ids. PR-2 (Owletto Mac app dispatcher + ClaudeCodeExecutor) consumes this contract. * fix(watchers): close pi review on device-side complete-watcher (#814) 5 issues from pi's review of #814's WatcherDispatcher / completeWatcherRun that all materialise on real device traffic: 1. (BLOCKER) Schedule advancement mismatch — completing a device watcher moved `last_fired_at` forward but never bumped `next_run_at`, so the scheduler tick re-materialised the same watcher every minute forever. Extract `advanceWatcherSchedule(sql|tx, watcherId)` from automation.ts (was `advanceWatcherScheduleAfterTerminalFailure`) and call it from both manage_watchers(action="complete_window") and the device complete-watcher endpoint after the in-transaction completion writes. 2. Device-identity binding — `claimed_by === body.worker_id` alone is spoofable across devices that share a user OAuth token: any other device with the token could claim a run under any worker_id, then a third caller with the same token could complete it. For user-scoped workers we now resolve the caller's `device_workers.id` from `(workerUserId, body.worker_id)` and require it to equal `approved_input.device_worker_id` (which materializeDueWatcherRuns already snapshotted from the watcher pin). Mismatch → 403. 3. Completion race — two concurrent POSTs both passed the unlocked `authorizeRunForWorker` status read, both opened a tx, both INSERTed a watcher_windows row; the loser's run-UPDATE saw 0 rows and the tx committed a phantom window. Now lock `SELECT ... FOR UPDATE` on the run row at the top of the tx; if status is no longer 'running', return 200 idempotently with `{idempotent: true}`. 4. Window-id allocation — drop the inline `COALESCE(MAX(id), 0) + 1` in favour of the codebase's shared `getNextNumericId(tx, 'watcher_windows')` helper (whitelisted, same pattern used by manage_watchers). 5. Malformed payload no longer stranded the run — validation that throws inside the tx rolls back to `running`, and there's no stale-run sweep today. Validate `approved_input.window_start` / `window_end` BEFORE the transaction; on bad input mark the run `failed` (so next_run_at advances normally) and return 400. Tests: three integration tests covering #1, #3, and #5. The user-scoped spoof test (#2) needs OAuth device-flow setup that isn't wired up in the current test fixtures — leaving it for a follow-up that lands the helper alongside other user-scoped worker tests. * fix(watchers): pi round-2 — bind workerId from auth, race-free id alloc, no double-advance A) Bound-worker check now reads `c.var.mcpAuthInfo?.workerId` (PAT/OAuth token binding), not body.worker_id. Same-user attacker can't complete as another registered worker by lying in the payload. B) `getNextNumericId` takes a per-table `pg_advisory_xact_lock` (`hashtext('<table>_id_alloc')`). Inside `completeWatcherRun`'s tx the lock is held until commit, so two concurrent completions on different watcher runs serialize on allocation instead of racing on MAX(id)+1. C) Validation-failure path only advances the schedule when the `UPDATE … WHERE status='running'` actually matched a row (RETURNING-gated). Two concurrent malformed POSTs no longer double-tick `next_run_at`. Tests: - device spoof (Fix A): token bound to worker A, run pinned to worker B, body posts worker-B — asserts 403 and run stays running. - concurrent allocation (Fix B): two completions on different watchers fired in parallel — both 200, distinct watcher_windows.id. - double-advance (Fix C): two malformed POSTs against the same run — next_run_at advances only once.
#823) Goals (added in #813) had no behavior of their own — just a nullable FK from watchers.goal_id into a parallel `goals` table plus a parallel CRUD surface. Agents already encapsulate the watcher-grouping use case via watchers.agent_id; goals were the redundant layer. The Mac app stopped using the primitive in lobu-ai/owletto#151, and the primitive never shipped in a release (v7.0.0 doesn't include it; v7.1.0 is still open). Removed: - db/migrations/20260517160000_drop_goals_primitive.sql (drops the watchers.goal_id column and the goals table; reversible) - packages/server/src/db/embedded-schema-patches.ts: drop the `goals-primitive` patch entry - packages/server/src/tools/admin/manage_goals.ts and its registration - packages/server/src/sandbox/namespaces/goals.ts (client.goals SDK) and its method-metadata entries - goal_id from manage_watchers.ts (create/update/list), get_watchers.ts, and WatcherMetadata - manage_goals from auth/tool-access scope tables - goals-crud integration test Untouched: the dispatcher (#814), the watcher schema additions from #811, and packages/owletto (separate submodule).
Summary
Adds the columns needed for the device-aware watcher dispatcher work in #798, per the schema spec in #799.
watchersadditionsdevice_worker_iduuid→device_workers(id)NULLON DELETE NO ACTION— the dispatcher must decide whether to clear or reroute orphaned pins.agent_kindtextNULLbackground,notifier, …).notification_channeltext NOT NULL'canvas'CHECK IN ('canvas','notification','both').notification_prioritytext NOT NULL'normal'CHECK IN ('low','normal','high').min_cooldown_secondsinteger NOT NULL0CHECK >= 0.last_fired_attimestamptzNULLPlus
CREATE INDEX idx_watchers_device_worker_id ON watchers (device_worker_id) WHERE device_worker_id IS NOT NULLso the dispatcher's "watchers pinned to device X" query stays cheap.device_workersadditionsnotification_budget_per_day integer NOT NULL DEFAULT 10withCHECK >= 0— the per-device daily global interrupt budget. 10/day is a placeholder default; tune once the dispatcher exists.Where it ships
db/migrations/20260517060000_watcher_schema_additions.sql— forward + down for prod / existing dev databases.EMBEDDED_SCHEMA_PATCHES['watcher-schema-additions']— idempotent mirror for already-initialised PGlite installs (duplicate_column-tolerant +IF NOT EXISTSeverywhere).db/schema.sqlregenerated viadbmate dump+scripts/normalize-schema.sh;schema_migrations.versionlength restored tovarchar(128)(local pg_dump strips it).manage_watchersaccepts the new fields with TypeBox enum / non-negative validation on the create + update paths, and the INSERT/UPDATE SQL persists them.create_from_versioninherits DB defaults — fine for the spec's "existing CRUD continues to work when new columns are at defaults".Verification
make build-packages+bun run typecheckclean.dbmate upran clean on a new local Postgres;dbmate rollbackthendbmate upround-trips cleanly.start-local.bundle.mjswithLOBU_ALLOW_EPHEMERAL_ENCRYPTION_KEY=1, isolatedLOBU_DATA_DIR): comes up clean, columns + constraints + index present (verified by queryinginformation_schema.columnsandpg_constraint).watchers+device_workersskeleton — no errors, final shape stable.scheduled-jobsembedded patch onmain(it re-adds a single-column FK after the per-org PK swap dropped the unique constraint that backed it). Reproduces onmain@a88d8401without these changes, so it's out of scope for this PR.Follow-ups
notification_budget_per_day = 10is the right default or whether it should be plan-derived.Closes #799
Summary by CodeRabbit