feat(watchers): edit propagates across the group; snapshot version_id on runs#485
Conversation
… on runs A watcher "group" (sibling rows linked by `watcher_group_id`) was already supposed to share its versioned config — prompt, schema, template, classifiers, reaction script — but nothing in the write path enforced that. Each new assignment created via `create_from_version` got its own duplicate `watcher_versions` row, so editing one assignment silently forked it from the rest. This PR fixes the write contract: - `manage_watchers.create_from_version` no longer inserts a fresh `watcher_versions` row. The new assignment shares the source's `current_version_id` directly. `reaction_script` is copied from the source watcher (it lives on the watchers row, not on the version row, so it has to be propagated explicitly). - `manage_watchers.create_version` now writes the new version row owned by the group root (`watcher_id = watcher_group_id`) and bulk-updates every assignment in the group: `current_version_id`, `version`, and `name` cascade group-wide. Per-assignment fields (`sources`, `schedule`, `scheduler_client_id`) still update only the row the caller named. - `manage_watchers.set_reaction_script` cascades across `watcher_group_id`. Resetting the script on any assignment clears it for the whole group. - `get_watchers` reads versions through the group root: legacy `template_version` now resolves via `sv.watcher_id = i.watcher_group_id`, and a new `template_version_id` arg lets the worker run loop request an exact `watcher_versions.id` (used to read the snapshotted version, not the freshly-edited one). The cascade introduces a race: an extraction in flight can have its schema validated against a newly-edited version and silently corrupt the window. The fix: - `WatcherRunPayload` gains a `version_id` field. `createWatcherRun` snapshots `watchers.current_version_id` at INSERT time. - `complete_window` reads `version_id` from the run row and joins the version by id, ignoring `watchers.current_version_id`. The snapshot is also written into `watcher_windows.version_id` (column existed, was never populated until now). Entity hard-delete (`utils/entity-management.ts`) hard-deletes watchers and the `watcher_versions.watcher_id` FK has `ON DELETE CASCADE`. In the new shared-chain world that would wipe out the version row used by surviving siblings — so before each `DELETE FROM watchers` call we transfer ownership: if the row being deleted is a group root with surviving siblings, repoint `watcher_versions.watcher_id` and `watchers.watcher_group_id` to the smallest surviving sibling. Tests: 5 new cases under `__tests__/integration/watchers/group-edit.test.ts` (reuse, script copy, group cascade, reaction cascade, run snapshot holding through a mid-run group edit). Full watcher integration suite green on real Postgres (13/13). Frontend (Edit button on `watcher-group-detail.tsx`) ships in owletto-web PR #57. The submodule pointer bump will follow once that PR merges.
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4b413bf7d7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| id, name, slug, organization_id, entity_ids, | ||
| schedule, next_run_at, agent_id, scheduler_client_id, model_config, sources, version, | ||
| current_version_id, tags, status, created_by, created_at, updated_at, | ||
| watcher_group_id, source_watcher_id, | ||
| reaction_script, reaction_script_compiled |
There was a problem hiding this comment.
Preserve per-watcher version history on assignment
create_from_version now creates a watcher that points at a shared current_version_id but no longer creates any watcher_versions rows for the new watcher id. The rest of this file still has actions (upgrade, get_versions, and get_version_details when version is provided) that resolve versions with WHERE watcher_id = args.watcher_id, so newly assigned non-root watchers regress to "version not found" for those operations immediately after assignment.
Useful? React with 👍 / 👎.
| const previousVersion = Number(watcherRows[0].version); | ||
| const nextVersion = previousVersion + 1; |
There was a problem hiding this comment.
Derive next version from group history, not assignment row
In group mode, version rows are written under watcher_group_id, but nextVersion is still computed from the specific assignment's watchers.version. If that assignment was created from an older template version, nextVersion can be lower than the group's latest version, causing duplicate (watcher_id, version) inserts on watcher_versions (or non-monotonic numbering) when editing that assignment.
Useful? React with 👍 / 👎.
| const runRows = await sql` | ||
| SELECT (approved_input->>'version_id')::bigint AS version_id | ||
| FROM runs WHERE id = ${watcherRunId} LIMIT 1 | ||
| `; |
There was a problem hiding this comment.
Scope run-version snapshot lookup to the same watcher
The snapshot lookup trusts run_metadata.watcher_run_id and fetches approved_input.version_id by run id alone. If a stale or incorrect run id is passed, complete_window will validate against another watcher's schema and persist a foreign version_id into watcher_windows, which can corrupt window/version attribution; the run lookup should include the current watcher_id constraint.
Useful? React with 👍 / 👎.
The previous form
LEFT JOIN watcher_versions sv
ON ($3::bigint IS NOT NULL AND sv.id = $3::bigint)
OR ($3::bigint IS NULL
AND sv.watcher_id = i.watcher_group_id
AND sv.version = COALESCE($2::int, i.version))
caused 10s+ page-load timeouts in dev. Postgres' planner refuses to
use an index when a JOIN's ON clause is a top-level OR across
disjoint conditions; it falls back to seq scan on watcher_versions.
The replacement collapses the two cases into a single equality on
sv.id. When template_version_id is given, COALESCE returns it
directly (PK lookup). Otherwise it evaluates the inner subquery
which uses idx_watcher_versions_watcher_version (UNIQUE on
watcher_id, version). Both paths are now indexed.
EXPLAIN ANALYZE on the live group page query:
- JOIN slice: 0.105 ms
- Full Q-meta query: 0.296 ms
Watcher integration tests: 13/13 pass on real Postgres.
|
Triage decision: Reasons:
Next: Review blocking P1 issues identified in automated comments, verify test coverage for edge cases, and merge manually after final review Blocking comments: |
Pi found 6 issues in the group-edit refactor; this commit addresses
them all.
1. **Version snapshot was not actually plumbed through the worker.**
parseWatcherRunPayload (watchers/automation.ts) didn't include
version_id, so the new payload field was silently dropped before
reaching the agent prompt. The agent's read_knowledge call had no
version pin, and get_content joined watcher_versions on
i.current_version_id — so a v1 extraction validated against v2's
schema after a mid-run group edit. The race the previous PR
description claimed to fix was actually still wide open.
Fix:
- parseWatcherRunPayload extracts version_id (tolerating legacy
runs that have no field; falls back to current_version_id).
- The agent prompt instructs read_knowledge AND complete_window
with template_version_id when the snapshot is non-null.
- get_content accepts template_version_id and joins
`cv.id = COALESCE($pin, current_version_id) AND cv.watcher_id =
watcher_group_id` so it reads the snapshotted version, with an
ownership check that prevents pinning to another group's version.
- manage_watchers complete_window adds a template_version_id arg,
reads the run row's snapshot scoped by watcher_id, and joins
`wv.watcher_id = i.watcher_group_id` (ownership check).
2. **complete_window read/updated runs by id alone.** A wrong/stale
watcher_run_id in run_metadata could mark another watcher's run
completed against this watcher's window. Both queries now
AND watcher_id = ${watcherId}.
3. **get_watcher template_version_id had no ownership check.** Any
watcher_versions.id could be joined with no group/org constraint,
showing the wrong template/schema. JOIN now requires
`sv.watcher_id = i.watcher_group_id`.
4. **handleCreateVersion wrote version_sources from the seed
watcher's sources into the shared version row.** get_content
prefers version_sources over watchers.sources, so editing
one assignment's sources via the group sheet leaked into every
assignment's extraction context. Drop the version_sources write
(NULL it). get_content already falls through to watchers.sources
when version_sources is empty. Default for omitted args.sources
now reads from the seed watcher's watchers.sources instead of the
prior version row's vestigial version_sources.
5. **Concurrent create_version on the same group raced.**
nextVersion = previousVersion + 1 was computed before the tx;
two simultaneous edits both wrote v=N+1 and the unique
(watcher_id, version) index aborted one with a 500. Add a tx-
scoped advisory lock keyed by group id and re-read MAX(version)
under the lock; both edits now serialize cleanly to N+1, N+2.
6. **Ownership transfer on entity hard-delete could collide.**
Pre-refactor groups have parallel version chains; transferring
the deleted root's chain to a sibling that already has its own
chain would hit (watcher_id, version) unique. Add a NOT EXISTS
guard so transfer only happens when the chosen sibling has no
chain. Otherwise the root's chain cascade-deletes harmlessly
(siblings keep using their own current_version_id).
Tests: 4 new cases in group-edit.test.ts (legacy payload tolerance,
parsed version_id, run-scope-by-watcher-id, concurrent create_version
serialization). 17/17 watcher integration tests pass on real Postgres.
…leanup, #57) Picks up the UI side of the watcher group-edit refactor: - Edit button on the watcher group page that opens CreateWatcherSheet. - Removes the per-assignment Edit affordances (watcher-detail page pencil button, watchers-list row dropdown, entity-page dropdown menu item) so Edit is unambiguously a group-level operation. Pairs with the backend changes in this PR (cascade across watcher_group_id; version_id snapshot through the run lifecycle; ownership checks on get_watcher and complete_window).
|
Triage decision: Reasons:
Next: Human review required for submodule changes and P1 architectural concerns Blocking comments: |
Summary
Editing one watcher in a group (linked via
watcher_group_id) used to silently fork it from siblings:manage_watchers.create_from_versionduplicated thewatcher_versionsrow, and every per-assignment edit only touched its own chain. This PR makes the write paths group-aware:create_from_version: no longer inserts a freshwatcher_versionsrow. The new assignment shares the source'scurrent_version_id.reaction_scriptis now copied from the source watcher (it lives on the watcher row, not the version row).create_version: new version row is owned by the group root (watcher_id = watcher_group_id);current_version_id,version,namecascade across every assignment inwatcher_group_id. Per-assignment fields (sources,schedule,scheduler_client_id) still update only the named row.set_reaction_script: cascades acrosswatcher_group_id.get_watchers: reads versions through the group root; newtemplate_version_idarg lets workers request a specificwatcher_versions.id.The cascade introduced a race — a mid-flight extraction could have its output validated against a newly-edited schema. Closed with a
version_idsnapshot:WatcherRunPayloadgains aversion_idfield;createWatcherRunsnapshotswatchers.current_version_idat INSERT time.complete_windowreadsversion_idfrom the run row and joins the version by id, ignoringwatchers.current_version_id. The snapshot is also written intowatcher_windows.version_id(column existed, never populated until now).Entity hard-delete cascade: before each
DELETE FROM watcherssite inutils/entity-management.ts, version-chain ownership is transferred to a surviving sibling so deleting an entity that owns the group root can't wipe the version chain out from under the group.Coupled to
a4c6c8f); a follow-up commit will bump it once Refactor: Split large slack-thread-processor.ts into focused modules #57 merges, per the AGENTS.md two-PR rule.Test plan
bun run typecheckcleanmake build-packagescleanvitest run src/__tests__/integration/watchers/— 13/13 pass on real Postgres (5 new tests ingroup-edit.test.tscover: shared-version reuse, reaction-script copy on assign, group-cascade for create_version, group-cascade for set_reaction_script, run-time version_id snapshot surviving mid-run group edit)/<owner>/watchers/<id>, click Edit, change prompt → all assignments reflect the new prompt and version