feat(owletto-backend): multi-org execute + search MCP tools#348
Merged
Conversation
…arch Lands PR-1 of docs/plans/mcp-multi-org-and-execute.md: the in-process SDK surface, per-namespace delegations, membership cache, and isolated-vm runner skeleton. No new MCP tools yet — execute and search land in PR-2. - `src/sandbox/client-sdk.ts` — buildClientSDK + .org() accessor with per-call membership re-validation. AccessDenied/OrgNotFound errors surface the exact org and reason. - `src/sandbox/membership-cache.ts` — small LRU (cap 128, TTL 30s) keyed on (userId, slug-or-id). Catches revocations within the TTL while keeping cross-org walks cheap. - `src/sandbox/namespaces/*.ts` — 11 namespaces (entities, entitySchema, connections, feeds, authProfiles, operations, watchers, classifiers, viewTemplates, knowledge, organizations). Thin delegations over existing admin handlers; per-call authz runs inside the handlers. - `src/sandbox/method-metadata.ts` — summary/access/example keyed by SDK path. Seeded for PR-1; PR-2 adds CI coverage check. - `src/sandbox/typebox-to-signature.ts` — TypeBox → inline TS signature formatter used by the `search` tool. - `src/sandbox/run-script.ts` — isolated-vm loader + limits shape. Native module is an optionalDependency so bun install succeeds on platforms without a prebuild; Dockerfile adds python3 + build-essential so production builds work. Tests: 25 unit tests (bun:test) cover membership cache semantics, metadata shape, typebox formatter, and run-script loader fallback. Integration test (vitest) for .org() exercises real org/member tables with AccessDenied/OrgNotFound/public-org/LRU-shortcircuit/revocation paths.
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
…lidate cache
Addresses PR-1 review from Pi and my own gap analysis.
- Namespace payload alignment against real handler TypeBox schemas:
* watchers.delete → watcher_ids[]; setReactionScript → reaction_script;
completeWindow → window_token. watcher_id now string throughout.
* entitySchema.* uses plain `slug`; addRule takes
source/target_entity_type_slug + relationship_type_slug.
* operations.execute.watcher_source.window_id is a number.
* authProfiles keyed by slug, not numeric id; connections.connect takes
auth_profile_slug; feeds.create requires feed_key.
* viewTemplates.resource_id accepts string | number; classifiers uses
classifier_id for CRUD, classifier_slug for classify.
- Slug-vs-id resolution now tries slug first, falls back to id on miss.
Drops the ID_RE regex that false-positived 20+ char slugs.
- Consolidated on MultiTenantProvider's memberRoleCache (explicit
invalidation on member-table writes). Separate sandbox LRU removed.
Exposes getCachedMembershipRole / getCachedOrgBySlug / getOrgById.
- client.query() drops the dead `params` arg — validateAndScopeQuery
rejects user positional parameters and the old wrapper's merge was
unsafe. Metadata example updated.
- AccessDeniedError + OrgNotFoundError share a SdkError base.
- method-metadata.ts covers every namespace method; the unit test
asserts coverage so drift shows up in CI.
- New integration test that dispatches real read-path methods through
every namespace (catches the class of bug that static casts hid).
- vitest.config.ts + migrations-dir walk-up so `OWLETTO_TEST_BACKEND=pglite`
actually runs the integration suite. PGlite-incompatible handler queries
(sql template-fragment composition, prepared-statement mismatches)
run under real Postgres via a small pgOnlyIt helper.
…egration runs
Second Pi review caught four more wrapper lies:
- authProfiles.create/update now mirror the handler: `profile_kind` (not
`auth_type`), `display_name` required on create, `credentials` +
`auth_data` instead of a generic `config`.
- viewTemplates.set exposes a single `json_template` (which may nest
`data_sources`) instead of two separate fields; rollback takes
`version` (a number) not `version_id`.
- connections.updateConnectorAuth uses `auth_values` (the handler
contract), not `auth_config`.
- connections.installConnector drops the required `connector_key` —
handler picks from `source_url` / `source_uri` / `source_code` /
`mcp_url` and accepts optional `auth_values`.
- Added app_auth_profile_slug + config to connections.{create, connect, update}
for parity with the handler.
vitest.config.ts: force singleFork pool so the two integration files
can't stomp each other's fixtures via cleanupTestDatabase when run
in parallel.
…ile_slug`)
Third-pass Pi review caught the asymmetry: the handler's create_auth_profile
action reads `args.slug` to set the new profile's stable identifier, while
get/test/update/delete read `auth_profile_slug` to identify an existing
profile. Wrapper now mirrors the handler verbatim:
- create: takes optional `slug` + `display_name` + `profile_kind` +
`credentials`/`auth_data`/`requested_scopes`.
- update: takes `auth_profile_slug` (required, identifies the row) + all
the mutable fields, including optional new `slug` for renames.
- get/test/delete: positional `auth_profile_slug`.
- list: adds connector_key/provider/profile_kind filters for parity.
Without this fix, `client.authProfiles.create({ slug: 'primary' })` would
silently auto-derive the slug from display_name, and follow-up
get('primary') would miss. Typed contract now matches runtime.
Pi final-pass nit. Per the repo's convention (AGENTS.md §"When fixing unused-parameter errors, delete the parameter rather than prefixing with `_`"), drop `_env` from buildOrganizationsNamespace. Call site in client-sdk.ts adjusted.
…folding Plan content was originally on docs/mcp-multi-org-plan (PR #346) but that PR was overhead for a solo land — folding the doc into the same PR as its first implementation step. PRs 2–5 will reference this file.
4 tasks
No callers anywhere — was added speculatively for the search tool's preamble and the web console's header chip (PR-2/PR-5 territory). Drop it now; it'll come back when there's an actual call site to motivate its return shape. Also: tried collapsing the per-namespace XxxNamespace interfaces into ReturnType<typeof buildXxx> in client-sdk.ts to save ~250 lines, but TypeScript strict mode requires those interfaces to type the inline arrow params inside each builder. The interfaces are load-bearing, not ceremonial. Keeping them.
…itialized CI's `bun install --frozen-lockfile` step failed because the lockfile was committed without the submodule deps. Re-runs with the submodule checked out produce 543 added lock entries (mostly owletto-web's React/Tailwind tree). No source changes.
…ral output CodeQL high-severity flag on src/sandbox/typebox-to-signature.ts:104: formatLiteral escaped single quotes but not backslashes. A string like \`path\\to\\file\` would emit \`'path\\to\\file'\`, which is an invalid TS string literal (\\t is a tab, \\f is a form feed). A trailing backslash produced an unterminated literal \`'\\'\`. Fix order: escape backslashes first (so the \\\\ they become don't collide with the \\' produced by the next pass), then escape single quotes. Two regression tests cover backslash + quote paths.
…s to ClientSDK
PR-2 of docs/plans/mcp-multi-org-and-execute.md. Folded into the same
PR as PR-1 (the SDK scaffolding) since neither delivers user-visible
functionality alone.
External MCP surface — what changes for clients
-----------------------------------------------
- `execute(script)`: run TypeScript over the typed `ClientSDK`. The
`client` global exposes namespaces (entities, watchers, knowledge,
organizations, …) plus `client.org(slug)` for cross-org calls,
`client.query(sql)` for read-only SQL, `client.log(...)`. Per-method
authz fires inside the SDK; entry-point gate is write-tier.
- `search(query)`: discover SDK methods. Namespace listing for "watchers",
drill-down with signature/throws/example for "watchers.create".
- `list_organizations` + `switch_organization`: now exposed on both
unscoped /mcp and scoped /mcp/{slug} endpoints.
- Removed from external surface: the 14 `manage_*` tools,
`list_watchers`, `get_watcher`, `read_knowledge`, `join_organization`.
Their handlers are kept and re-registered as `internal: true` so the
in-process test harness can still dispatch them by name without forcing
every test to migrate to script-based calls. Production agents go
through the SDK.
Implementation
--------------
- `src/sandbox/run-script.ts`: replaces the PR-1 stub with the real
`isolated-vm` bridge. Compiles TS via the existing esbuild pipeline,
builds a single async dispatch reference, reconstructs `client` guest-
side via Proxy, captures structured logs + return value, enforces
memory/CPU/quota/output limits, maps errors to TimeoutError /
QuotaExceeded / OutOfMemory / ScriptError / CompileError.
- `src/sandbox/method-metadata.ts` is the data source for `search`.
- `src/types/isolated-vm.d.ts`: minimal type shim so TS compiles when the
optional native module isn't installed (Node 25 etc.). Production
Docker installs the real module.
- `src/watchers/reaction-executor.ts`: dropped QuickJS + the flat
ReactionSDK env-bridge. Reactions now go through `buildClientSDK` +
`runScript`. Stored reaction scripts that export `react(ctx, sdk)`
keep working — the runner accepts both `default` and `react` exports.
- Dropped deps: `@sebastianwessel/quickjs`,
`@jitl/quickjs-ng-wasmfile-release-asyncify`.
- `src/tools/execute.ts`: dropped scopedToOrg gating on org tools, dropped
the `join_organization` carve-out.
- `src/auth/tool-access.ts`: kept the per-action policy table — it
applies to internal-tool dispatches the same way it did to external.
Added `execute` (write), `search` (public-readable).
Test status
-----------
- Unit (bun:test): 27/27 sandbox + 32/32 tool-access pass.
- Integration: tests that go through the test harness against legacy
`manage_*` names keep working via the `internal: true` registration.
Other long-standing integration failures (~190) are pre-existing
pglite/worktree environment issues, not from this PR. CI will run
against real Postgres.
- Removed: `__tests__/integration/mcp/public-org-join.test.ts` —
exclusively tested the dropped `join_organization` MCP tool.
…` fallback
User direction: the new MCP surface should be consistent everywhere; no
back-compat needed since stored data (reaction scripts, agent prompts)
will be migrated manually in production.
- Removed `internal: true` re-registrations of the 14 manage_* tools +
list_watchers / get_watcher / read_knowledge. Deleted
`src/tools/admin/index.ts` (the ADMIN_TOOLS aggregator) entirely.
- Removed the `module.exports.react` fallback in run-script.ts. Stored
reaction scripts must export `default async (ctx, client, params?)`;
legacy ones need a one-time DB migration.
- Updated workspace-instructions to point agents at `execute` /
`client.<ns>.<method>` patterns instead of legacy tool names.
- Updated openclaw plugin's "already authenticated" hint text.
- auth.test.ts: removed the "hides org switching on scoped routes"
assertion (org tools are now exposed everywhere), updated tool-list
assertion to expect the new clean surface.
- tool-access.test.ts: replaced one assertion that referenced
manage_connections (now non-existent) with query_sql which is the
canonical admin-only tool on the new surface.
Frontend (`packages/owletto-web`) ships ~50+ `apiCall('manage_*', ...)`
calls through REST — those break with this change. Tracked as a separate
submodule PR to migrate them to `apiCall('execute', { script: ... })`.
Integration tests that still use `mcpToolsCall('manage_*', ...)` will
fail with `Tool not found` — those require per-test migration to either
direct handler imports or `execute`-script form, which is out of scope
for this PR. The single-test-file migration pattern is intentional so
each test owner can review the new SDK call shape.
Self-review and Pi senior-engineer pass surfaced a privilege-escalation
gap and a handful of stale-after-cold-drop issues. Fixes here only — the
deploy-time prerequisites (reaction-script DB migration, owletto-web
caller migration) remain tracked as follow-ups in the PR description.
Security
- `client.query()` now enforces owner/admin role to match the `query_sql`
MCP tool. Without this, write-tier users could bypass the admin gate by
calling `execute({script: "client.query(...)"})` and read audit/event
tables exposed by the validated allowlist.
- `list_organizations` / `switch_organization` now require the `mcp:read`
scope; previously the org-agnostic branch returned before the scope
check, letting profile-only OAuth tokens call them.
- `execute` MCP tool annotated `destructiveHint: true` so host clients
prompt for approval — scripts can delete entities, drop watchers, and
trigger external operations.
Sandbox hardening (run-script.ts)
- Guest-side proxy short-circuits `then`, `catch`, `finally`, Symbol keys,
`constructor`, `__proto__`, `inspect` — accidental `await client.foo`
no longer turns into a host SDK call against `foo.then`.
- Host dispatcher now requires the namespace and method to be own
enumerable keys of the resolved SDK target. Belt-and-braces with the
guest-side filter; closes the avenue where `client.constructor.assign`
would resolve to `Object.assign` on the host.
- Script return value (`returnJson`) is now charged to `outputBytes`, so
scripts can't bypass the 256KB output cap by returning a large object.
Stale references after the cold-drop
- reaction-executor.ts header no longer claims it accepts the legacy
`react` export — that fallback was removed in 9dfd372.
- tools/search.ts no-match suggestion text and the `search_knowledge`
module header now point at `execute` + `client.<ns>.<method>` patterns
instead of the removed `manage_*` MCP names.
- auth.test.ts assertions for public-org tools/list dropped the now-stale
expectations on `manage_entity` being externally listed.
- run-script.test.ts no longer asserts the runner is a stub; the bridge
is real now. Test gracefully skips on systems without `isolated-vm`
installed.
Sweep prompted by "are all docs everywhere consistent?" — turns out no: several docs and a couple of dead type exports were still describing the legacy `manage_*` / `ReactionSDK` world. Surgical fixes only; the owletto-cli runtime callers (separate scope) are added to the PR's follow-up section, not patched here. - packages/owletto-sdk: drop the dead `ReactionSDK` and `ReactionHandler` type exports — only `ReactionContext` (still imported by the new reaction-executor) and `ReactionEntity` remain. The runtime SDK that reactions actually call is `ClientSDK`, defined in owletto-backend. - packages/owletto-backend/src/auth/tool-access.ts: clarify the comment next to the `manage_entity` policy entry. The handlers are no longer registered as `internal: true` MCP tools — they're unregistered, but the policy table still drives `routeAction`'s per-action checks when SDK wrappers reach them from inside `execute`. - docs/plans/mcp-multi-org-and-execute.md: the "Existing reaction scripts in DB" risk bullet claimed `ClientSDK` keeps the same method paths and entry-point shape as `ReactionSDK`. That's wrong (`content. save` → `knowledge.save`, `actions.execute` → `operations.execute`, no `notify`, new `default async (ctx, client, params?)` shape). Replace with the actual mapping plus the audit query that confirms the migration cleared the legacy patterns. - packages/landing/.../owletto-cli.md: the `owletto run` examples and "available tools" list still pointed at `read_knowledge`, `list_ watchers`, `get_watcher`, and the 14 `manage_*` names. Replace with the current external surface: `search_knowledge`, `save_knowledge`, `search`, `execute`, `query_sql`, `list_organizations`, `switch_ organization`, plus an `execute` script example. - packages/owletto-connectors/src/README.md: manual connector install instruction pointed at `manage_connections` — now points at `client.connections.installConnector(...)` from inside `execute` (or the equivalent admin REST endpoint).
Pi's second-pass review caught two real things to address before merge: 1. Sandbox guest proxy reserved-key list missed coercion probes. `JSON.stringify(client.entities)` and `console.log(client.entities)` would call `toJSON` / `toString` / `valueOf` on the proxy, get back a dispatcher closure, and end up firing a host SDK call against `entities.toJSON` (etc) — wasting quota and surfacing a confusing `Unknown SDK method` error. Add those three keys to `__isReservedKey()` alongside the existing `then`/Symbol/proto short-circuits in run-script.ts. 2. `compileReactionScript` and `runScript`'s internal recompile used different esbuild `external` lists. Reactions externalized `@owletto/reactions`, which the runtime recompile (with `external: []`) would then fail to resolve — save-time would accept code the runtime could not run. Align reactions to `external: []` so save-time and runtime accept the same set of imports. None of the migrated reactions in `_reactions_backup_2026_04_25` import `@owletto/reactions`, so this is a no-op for current production data.
Contributor
|
Triage decision: Reasons:
Next: merge manually after final review and CI completion |
…on MCP for CLI
The owletto-cli browser-auth flow drives connection setup and credential
exchange through MCP RPC (`packages/owletto-cli/src/commands/browser-auth.ts`).
After flipping all admin tools to `internal: true` those calls fail with
"Tool not found" because internal tools are filtered out of the MCP path
(see tools/execute.ts:85).
Re-expose just the two tools the CLI actively needs. The other admin
tools used by the CLI (`seed.ts`) hit the REST endpoint
`/api/{slug}/{toolName}` which bypasses the internal-flag filter, so
those don't need this treatment.
Once a CLI migration PR ports browser-auth to execute + client.* the
internal flag can be re-applied.
…folding # Conflicts: # bun.lock # packages/owletto-backend/src/tools/admin/index.ts
This was referenced Apr 26, 2026
buremba
added a commit
that referenced
this pull request
Apr 27, 2026
* fix(execute): run backend under Node so isolated-vm loads Bun uses JavaScriptCore and exposes a partial V8 ABI shim — enough that node-gyp-build picks the abi137 prebuild for isolated-vm at install time, but not enough that the addon dlopens at runtime. Loading throws 'undefined symbol: v8::ValueSerializer::Delegate::HasCustomHostObject'. The previous fix (#427) caught this and returned RuntimeUnavailable instead of crashing — but the execute MCP tool still silently failed in prod. Fix the underlying mismatch by running the backend under Node: the source layout, dependencies, and Hono server entry are already Node-style ('Node.js Server Entry Point' is in the docblock); the only Bun-specific thing was 'exec bun src/server.ts' in start.sh. Switch start.sh to 'exec node --import tsx src/server.ts'. tsx is already installed (devDep, copied into the runtime image via the existing node_modules pipeline) and handles ESM + TypeScript loading without a precompile step. Verified end-to-end inside the live prod image: node --import tsx loads isolated-vm, runs runScript() and returns the expected value. The regression silently shipped because the existing sandbox tests ran under bun test with a 'skipIfRuntimeUnavailable' helper that turned the load failure into a no-op pass. Replace with a vitest integration test that fails loudly when isolated-vm can't load, and wire it into CI invoking vitest under node directly. SKIP_TEST_DB_SETUP=1 keeps it fast — the test uses stub SDKs and doesn't need Postgres. The broader vitest suite (~38 integration files) is not yet wired into CI; many are stale after #348's MCP tool consolidation. Tracked separately. * ci: pin Node 22 so isolated-vm prebuild loads in test job * fix(start): keep cwd=/app, use absolute tsx loader path Codex review on #430 caught that 'cd /app/packages/owletto-backend' breaks bundled config resolution: gateway's ProviderRegistryService and the embedded agent routes both resolve 'config/providers.json' from process.cwd(). Under the previous cwd of /app, this finds /app/config/providers.json. Under /app/packages/owletto-backend, it resolves to a path that doesn't exist and the bundled provider registry is silently empty. Pass tsx as an absolute file URL so the loader resolves regardless of cwd, and run server.ts by absolute path. Verified in the live prod image: runScript() returns success with cwd preserved at /app.
5 tasks
buremba
added a commit
that referenced
this pull request
Apr 28, 2026
…ledge for REST (#434) * fix(owletto-backend): re-register list_watchers/get_watcher/read_knowledge for REST callers The frontend's watchers page broke with `Tool not found: list_watchers` because PR #348 deleted these MCP-tool registrations. The plan was to migrate everything to `execute` + `client.<ns>.<method>`, but the frontend (`apiCall('list_watchers'…)` in `use-watchers.ts`) and owletto-cli still need the named handlers via the REST proxy. The MCP boundary stays small. The REST/CLI boundary regains the names. - Re-register `list_watchers`, `get_watcher`, `read_knowledge` as `internal: true` so they're hidden from MCP `tools/list` but reachable via `POST /api/{slug}/{toolName}` (the REST proxy sets `allowInternalTools=true`). - Rename `LEGACY_ADMIN_TOOLS` → `INTERNAL_REST_TOOLS`. They're not legacy — they're the deliberate REST/CLI surface. Consolidate the 12 near-duplicate `ToolDefinition` literals into a compact `ENTRIES` array, and pass handlers to `defineTool` directly instead of wrapping each in a pass-through `async (args, env, ctx) => handler(args, env, ctx)`. Same simplification applied to the public `TOOLS` array in `registry.ts`. - Add `ToolNotRegisteredError` and throw it from `checkToolAccess` when `getTool()` returns undefined (registry/frontend drift). The REST proxy now captures it to Sentry so the next "Tool not found" outage surfaces as an alert instead of a silent 400. Internal-tool hiding still throws a plain Error to avoid leaking handler existence. - Regression test in `tool-access.test.ts`: scan `packages/owletto-web/src` for every `apiCall(<…>)('toolName', …)` call site and assert each name is registered. `manage_queue` is in a `KNOWN_DEAD_NAMES` allowlist (the only frontend caller is the unused `useDeleteWindow` hook); the test fails the day someone wires it up without first registering a backend handler. Verified: removing `list_watchers` from the registry locally now fails the new test with `Received: ["list_watchers"]`. * test(owletto-backend): cover REST-vs-MCP visibility for the re-registered internal tools Direct assertion that list_watchers, get_watcher, and read_knowledge: - look like 'Tool not found' to external MCP callers, and - are reachable from the REST proxy when allowInternalTools=true. Complements the apiCall-coverage scan with explicit access-control coverage so the visibility split is hard to silently regress. * test(owletto-backend): extend coverage scan to owletto-cli (REST + MCP surfaces) Both the frontend and the CLI use the same dispatch (`POST /api/:orgSlug/:toolName` → `restToolProxy` → `executeTool` → `getTool(name)`), but the CLI's browser-auth flow goes through MCP RPC and needs its tools to ALSO stay visible on `tools/list` (i.e. NOT `internal: true`). - Scan packages/owletto-cli/src for `callTool(ctx, 'X')` patterns and assert each name is registered (matches the frontend's REST surface). - Scan browser-auth.ts for `name: 'X'` literals and assert each is registered AND non-internal — flipping `manage_connections` or `manage_auth_profiles` to internal would silently break the CLI's browser-auth flow. Three first-party tool callers are now covered: owletto-web (frontend REST), owletto-cli/seed (CLI REST), owletto-cli/browser-auth (CLI MCP). * test+chore(owletto-backend): close test blind spots from Pi review - Frontend scanner missed hook-factory's `tool: 'X'` config form (api/entities.ts, api/connections.ts — 30+ sites). Combine apiCall(...) regex with a second tool: 'X' regex so removing any of those tools also fails the test. - CLI MCP test was filtering candidates to registered names before checking missing — meaning an outright deletion silently passed. Replace the regex-derived candidate set with a hardcoded `CLI_PUBLIC_MCP_TOOLS` allowlist, plus a separate drift detector that asserts browser-auth.ts's tools/call literals match the pinned set exactly. Verified the new test fails when manage_connections is flipped to internal:true. - Sentry tag cardinality: tool_name lives in `extra`, not `tags` — the URL segment is attacker-controlled and would blow up tag cardinality.
6 tasks
buremba
added a commit
that referenced
this pull request
Apr 28, 2026
- Split CI into unit / frontend / integration jobs so unit failures surface
in <2 min instead of waiting behind Postgres setup.
- Drop the worker cherry-pick (8 of 18 files); full 187-test suite passes.
- Wire owletto-web's 14 vitest files into CI via the workspace binary.
- Stand up a real pgvector/pg16 service for the owletto-backend integration
job so the vitest suite (previously 0 of 38 in CI) runs end-to-end under
Node 22. Add a fail-fast pgvector extension check.
Compose-able test client primitives (packages/owletto-backend/src/__tests__/setup/test-mcp-client.ts):
- TestMcpClient — full HTTP/JSON-RPC round-trip, exercises auth, session
init, and the isolated-vm sandbox. Adds orgSlug pinning so tools that
need org context land on /mcp/{slug} instead of unscoped /mcp.
- TestApiClient — direct namespace-handler calls. Same fluent surface
(entities, watchers, classifiers, knowledge, …), no HTTP hop, no sandbox.
Replace 33 stale post-#348 manage_* integration tests with 8 focused new
files using the primitives:
entity-schema/entity-types, entities/entity-crud, watchers/watchers-crud,
classifiers/classifiers-crud, cross-org/isolation, mcp/auth-wire,
sandbox/execute-wire. Same conceptual coverage in 1/3 the line count.
Sync 30 columns of QUERYABLE_SCHEMA drift detected by the existing
table-schema test — the test was a real bug catcher; the columns now
flow through query_sql validation.
Other fixes folded in: worktree-relative paths in migrations-format,
owletto-guidance-sync, connector-catalog; stale URL-builder assertions;
search-cross-org's missing workspace-provider init and stale .entities
field reference; mcp-install-targets label drift.
Test status (post-PR, on Linux Node 22):
37 owletto-backend vitest files / 389 tests pass
187 worker tests pass (full suite)
76 owletto-web tests pass
720 core/gateway/cli tests pass
5 skipped: 2 pre-existing bugs (PAT FK, classifier org_id), 3 documented
buremba
added a commit
that referenced
this pull request
Apr 28, 2026
- Split CI into unit / frontend / integration jobs so unit failures surface
in <2 min instead of waiting behind Postgres setup.
- Drop the worker cherry-pick (8 of 18 files); full 187-test suite passes.
- Wire owletto-web's 14 vitest files into CI via the workspace binary.
- Stand up a real pgvector/pg16 service for the owletto-backend integration
job so the vitest suite (previously 0 of 38 in CI) runs end-to-end under
Node 22. Add a fail-fast pgvector extension check.
Compose-able test client primitives (packages/owletto-backend/src/__tests__/setup/test-mcp-client.ts):
- TestMcpClient — full HTTP/JSON-RPC round-trip, exercises auth, session
init, and the isolated-vm sandbox. Adds orgSlug pinning so tools that
need org context land on /mcp/{slug} instead of unscoped /mcp.
- TestApiClient — direct namespace-handler calls. Same fluent surface
(entities, watchers, classifiers, knowledge, …), no HTTP hop, no sandbox.
Replace 33 stale post-#348 manage_* integration tests with 8 focused new
files using the primitives:
entity-schema/entity-types, entities/entity-crud, watchers/watchers-crud,
classifiers/classifiers-crud, cross-org/isolation, mcp/auth-wire,
sandbox/execute-wire. Same conceptual coverage in 1/3 the line count.
Sync 30 columns of QUERYABLE_SCHEMA drift detected by the existing
table-schema test — the test was a real bug catcher; the columns now
flow through query_sql validation.
Other fixes folded in: worktree-relative paths in migrations-format,
owletto-guidance-sync, connector-catalog; stale URL-builder assertions;
search-cross-org's missing workspace-provider init and stale .entities
field reference; mcp-install-targets label drift.
Test status (post-PR, on Linux Node 22):
37 owletto-backend vitest files / 389 tests pass
187 worker tests pass (full suite)
76 owletto-web tests pass
720 core/gateway/cli tests pass
5 skipped: 2 pre-existing bugs (PAT FK, classifier org_id), 3 documented
buremba
added a commit
that referenced
this pull request
Apr 28, 2026
- Split CI into unit / frontend / integration jobs so unit failures surface
in <2 min instead of waiting behind Postgres setup.
- Drop the worker cherry-pick (8 of 18 files); full 187-test suite passes.
- Wire owletto-web's 14 vitest files into CI via the workspace binary.
- Stand up a real pgvector/pg16 service for the owletto-backend integration
job so the vitest suite (previously 0 of 38 in CI) runs end-to-end under
Node 22. Add a fail-fast pgvector extension check.
Compose-able test client primitives (packages/owletto-backend/src/__tests__/setup/test-mcp-client.ts):
- TestMcpClient — full HTTP/JSON-RPC round-trip, exercises auth, session
init, and the isolated-vm sandbox. Adds orgSlug pinning so tools that
need org context land on /mcp/{slug} instead of unscoped /mcp.
- TestApiClient — direct namespace-handler calls. Same fluent surface
(entities, watchers, classifiers, knowledge, …), no HTTP hop, no sandbox.
Replace 33 stale post-#348 manage_* integration tests with 8 focused new
files using the primitives:
entity-schema/entity-types, entities/entity-crud, watchers/watchers-crud,
classifiers/classifiers-crud, cross-org/isolation, mcp/auth-wire,
sandbox/execute-wire. Same conceptual coverage in 1/3 the line count.
Sync 30 columns of QUERYABLE_SCHEMA drift detected by the existing
table-schema test — the test was a real bug catcher; the columns now
flow through query_sql validation.
Other fixes folded in: worktree-relative paths in migrations-format,
owletto-guidance-sync, connector-catalog; stale URL-builder assertions;
search-cross-org's missing workspace-provider init and stale .entities
field reference; mcp-install-targets label drift.
Test status (post-PR, on Linux Node 22):
37 owletto-backend vitest files / 389 tests pass
187 worker tests pass (full suite)
76 owletto-web tests pass
720 core/gateway/cli tests pass
5 skipped: 2 pre-existing bugs (PAT FK, classifier org_id), 3 documented
buremba
added a commit
that referenced
this pull request
Apr 28, 2026
* test(ci): make full test suite reliable across all packages
- Split CI into unit / frontend / integration jobs so unit failures surface
in <2 min instead of waiting behind Postgres setup.
- Drop the worker cherry-pick (8 of 18 files); full 187-test suite passes.
- Wire owletto-web's 14 vitest files into CI via the workspace binary.
- Stand up a real pgvector/pg16 service for the owletto-backend integration
job so the vitest suite (previously 0 of 38 in CI) runs end-to-end under
Node 22. Add a fail-fast pgvector extension check.
Compose-able test client primitives (packages/owletto-backend/src/__tests__/setup/test-mcp-client.ts):
- TestMcpClient — full HTTP/JSON-RPC round-trip, exercises auth, session
init, and the isolated-vm sandbox. Adds orgSlug pinning so tools that
need org context land on /mcp/{slug} instead of unscoped /mcp.
- TestApiClient — direct namespace-handler calls. Same fluent surface
(entities, watchers, classifiers, knowledge, …), no HTTP hop, no sandbox.
Replace 33 stale post-#348 manage_* integration tests with 8 focused new
files using the primitives:
entity-schema/entity-types, entities/entity-crud, watchers/watchers-crud,
classifiers/classifiers-crud, cross-org/isolation, mcp/auth-wire,
sandbox/execute-wire. Same conceptual coverage in 1/3 the line count.
Sync 30 columns of QUERYABLE_SCHEMA drift detected by the existing
table-schema test — the test was a real bug catcher; the columns now
flow through query_sql validation.
Other fixes folded in: worktree-relative paths in migrations-format,
owletto-guidance-sync, connector-catalog; stale URL-builder assertions;
search-cross-org's missing workspace-provider init and stale .entities
field reference; mcp-install-targets label drift.
Test status (post-PR, on Linux Node 22):
37 owletto-backend vitest files / 389 tests pass
187 worker tests pass (full suite)
76 owletto-web tests pass
720 core/gateway/cli tests pass
5 skipped: 2 pre-existing bugs (PAT FK, classifier org_id), 3 documented
* fix(ci): skip whatsapp unit test when connector compiler hasn't run
The whatsapp connector imports `npm:baileys@7.0.0-rc.9` (Deno-style),
which is rewritten by the connector compiler at install time. CI runs
unit tests via `bun test` without the compiler, so the import fails to
resolve. The test was never actually running in CI before — it landed
under my expanded `bun test packages/owletto-backend/src/__tests__/unit`
pattern, which exposed it.
Gate the `describe`s on `RUN_CONNECTOR_TESTS=1` and short-circuit the
beforeAll so the import doesn't run on CI. The integration test
(whatsapp-entity-links.test.ts) covers the broader entityLinks surface.
* fix(tests): address codex review — fix classifier bug, un-skip tests, add packages
Codex review of #437 surfaced concrete issues; addressed each:
- **classifier handler**: fixed two bugs in manage_classifiers.handleCreate
— missing organization_id on event_classifiers INSERT, and stale string
literal 'api' for created_by (FK to user.id). Pass ctx through routeAction
and use ctx.userId. Un-skipped the classifier integration test.
- **TestApiClient ToolContext**: now sets tokenType, scopedToOrg, and
allowCrossOrg so handlers gating on those fields (e.g. cross-org reads
via client.org()) see realistic values.
- **TestMcpClient comment**: dropped the misleading "both layers share
client.entities, etc." line — only TestApiClient exposes the SDK
namespaces. The two layers are deliberately not interchangeable.
- **tools/list skips**: the 5 anonymous/auth'd tools/list cases are now
rewritten to use mcpListTools() (which does the initialize handshake
first per spec) instead of being skipped. Added cookie + orgSlug
forwarding to mcpListTools/mcpRequest so the cookie-auth case works.
- **403-vs-400 skip**: un-skipped the public-org REST mutation test with
the correct 403 expectation. The 403 is right (forbidden, not malformed).
- **JSON-RPC org-context test**: rewrote to assert the actual auth-layer
behavior — anonymous tools/call returns HTTP 401 with a WWW-Authenticate
challenge, not a JSON-RPC -32001 error.
- **CI scope**: added packages/owletto-cli + packages/owletto-worker to
the unit job. owletto-openclaw e2e tests need a live backend (docker
compose) and stay in smoke-example.yml.
- **biome ignore**: ignore .connector-child-*.mjs subprocess temp files
the owletto-worker integration tests leave behind.
Net: 471 vitest tests pass (was 462), 1114 bun tests pass (was 1106).
The PAT FK skip stays — pre-existing schema bug, out of scope.
* test(owletto-backend): restore focused coverage contracts
* test(owletto-backend): restore high-value integration contracts
buremba
added a commit
that referenced
this pull request
May 26, 2026
Watcher reactions could persist knowledge but had no way to push a notification — so digests landed in the feed and never reached Slack. The legacy ReactionSDK had `notify`; PR #348 consolidated reactions onto the typed ClientSDK and carried `notify` over only as an agent-facing MCP tool, dropping it from the reaction surface. Restore it as a `notifications` namespace that reuses the existing `notify` tool handler -> createNotificationForUsers -> deliverToBotConnections, so a reaction can fan a digest out to the org's active bot connections (Slack/ Telegram) and the in-app inbox. `watcher_source` attributes the send to the firing window. - sandbox: new `notifications.send` namespace + method-metadata (write) + ClientSDK wiring; exposed on connector-sdk ReactionClient type. - examples/lobu-crm: funnel-digest + inbound-triage reactions now call client.notifications.send after persisting knowledge. - test: reaction script dispatching client.notifications.send through the isolate (run-script-runtime).
buremba
added a commit
that referenced
this pull request
May 26, 2026
Watcher reactions could persist knowledge but had no way to push a notification — so digests landed in the feed and never reached Slack. The legacy ReactionSDK had `notify`; PR #348 consolidated reactions onto the typed ClientSDK and carried `notify` over only as an agent-facing MCP tool, dropping it from the reaction surface. Restore it as a `notifications` namespace that reuses the existing `notify` tool handler -> createNotificationForUsers -> deliverToBotConnections, so a reaction can fan a digest out to the org's active bot connections (Slack/ Telegram) and the in-app inbox. `watcher_source` attributes the send to the firing window. - sandbox: new `notifications.send` namespace + method-metadata (write) + ClientSDK wiring; exposed on connector-sdk ReactionClient type. - examples/lobu-crm: funnel-digest + inbound-triage reactions now call client.notifications.send after persisting knowledge. - test: reaction script dispatching client.notifications.send through the isolate (run-script-runtime).
buremba
added a commit
that referenced
this pull request
May 26, 2026
…#1064) * feat(reactions): notifications.send on the reaction ClientSDK Watcher reactions could persist knowledge but had no way to push a notification — so digests landed in the feed and never reached Slack. The legacy ReactionSDK had `notify`; PR #348 consolidated reactions onto the typed ClientSDK and carried `notify` over only as an agent-facing MCP tool, dropping it from the reaction surface. Restore it as a `notifications` namespace that reuses the existing `notify` tool handler -> createNotificationForUsers -> deliverToBotConnections, so a reaction can fan a digest out to the org's active bot connections (Slack/ Telegram) and the in-app inbox. `watcher_source` attributes the send to the firing window. - sandbox: new `notifications.send` namespace + method-metadata (write) + ClientSDK wiring; exposed on connector-sdk ReactionClient type. - examples/lobu-crm: funnel-digest + inbound-triage reactions now call client.notifications.send after persisting knowledge. - test: reaction script dispatching client.notifications.send through the isolate (run-script-runtime). * feat(connector-sdk): export NotificationsSendInput (pi review) * fix(notifications): repair bot-connection delivery (dead since #846) deliverToBotConnections fetched GET /api/internal/connections (removed in #846) and POST /api/v1/messaging/send (also gone), so it hit `if (!connRes.ok) return` and silently delivered nothing — notifications never reached Slack/Telegram for any notification type. Rewrite it to resolve active chat connections + their channel bindings from Postgres and post in-process via a new ChatInstanceManager.postNotificationToChannel (a one-shot outbound post, not an inbound message that triggers an agent run). Every app pod loads every active connection at boot, so the locally-held instance can post regardless of which pod fired the notification — correct under N>1 replicas, no cross-pod routing. Best-effort per connection. Together with the reaction `notifications.send` namespace, a watcher reaction digest now reaches the agent's bound Slack channel. - chat-instance-manager: new postNotificationToChannel + 2 unit tests. - notifications/service: DB+manager delivery; drop dead HTTP path + unused getLobuServiceToken import. * test(notifications): integration test for bot-delivery target resolution Extract resolveBotDeliveryTargets (connections JOIN channel bindings) and cover it against a real DB: bound channel resolved, no-binding/inactive omitted, bare channel id prefixed, connectionId filter honored. Addresses the pi review's tests_adequate flag for the bot fan-out path. Also trims the notifications namespace header comment. * fix(notifications): lazily start the connection before posting (multi-replica) pi review caught a real N>1 hole: postNotificationToChannel only read this pod's in-memory instance map. A connection created or restarted on another replica is found in Postgres but has no live instance locally, so the post was skipped. Call ensureConnectionRunning() first — idempotent when already running, won't revive a stopped connection — so any pod that fires the notification can deliver. Adds lazy-start + stopped-connection unit tests. * refactor(notifications): rename to postMessageToChannel + post markdown Rename postNotificationToChannel -> postMessageToChannel (it's a generic outbound-post primitive; 'notification' is the caller's concern) and post { markdown } instead of { raw } so digests render in each platform's native format (Slack markdown_text) rather than being HTML-escaped. * feat(reactions): portable rich cards via chat CardElement Let notifications carry a rich card, not just markdown: notifications.send / the notify tool accept an optional `card` typed as the `chat` CardElement (reused, not a bespoke DSL), and postMessageToChannel takes chat's AdapterPostableMessage and posts it directly — no converter. The Chat SDK renders one card to each platform's native format (Slack Block Kit, Teams Adaptive Cards, Google Chat Cards), so it's portable everywhere. connector-sdk re-exports the CardElement type (chat as optional peer/dev dep, type-only — no runtime bloat). Adds a manager test that builds a card with the real chat primitives and asserts it posts as { card }. * fix(connector-sdk): don't leak `chat` into published types (pi blocker) Importing chat's CardElement into connector-sdk's public .d.ts made an external typecheck fail (TS2307) unless the consumer also installed chat — and making chat a hard dep would bloat every connector install. Keep the real chat types server-side (where the card is validated + rendered) and type the author-facing `card` as a plain serializable object. Drop the chat dep from connector-sdk; remove the unused Section import in the card test. * docs(notifications): note the ORDER BY binding-creation rationale (review nit)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR-1 + PR-2 of
docs/plans/mcp-multi-org-and-execute.md(folded into one PR — seedocs/plans/mcp-multi-org-and-execute.mdon this branch).What changes for MCP clients
execute(script)ClientSDKsearch(query)search_knowledge,save_knowledge,query_sql,resolve_pathlist_organizations,switch_organizationmanage_*+list_watchers,get_watcher,read_knowledgetools/list— call viaclient.<ns>.<method>()from insideexecutejoin_organizationswitch_organization; the underlying public-org join lives in RESTInternals
src/sandbox/client-sdk.ts—buildClientSDK+client.org(slug). Slug-first, id-fallback.AccessDenied/OrgNotFound/SdkErrorhierarchy. Membership goes throughMultiTenantProvider.memberRoleCache(explicit invalidation on member writes flows through).client.query()is owner/admin-only for user sessions (with anisSystemContextbypass for reactions) and double-gated by anmcp:adminscope check.src/sandbox/run-script.ts—isolated-vmrunner with hard limits (64 MB / 60 s / 200 SDK calls / 256 KB output). Compiles TS via the existing esbuild pipeline. Maps errors to TimeoutError / QuotaExceeded / OutOfMemory / ScriptError / CompileError. Guest-side proxy short-circuitsthen,catch,finally, Symbol keys,constructor,__proto__,inspect,toJSON,toString,valueOfso accidentalawait/JSON.stringify/console.logon a namespace proxy never fires a host dispatch. Host-side dispatcher requires own-property namespace+method.src/sandbox/namespaces/*.ts— 11 namespaces (entities, entitySchema, connections, feeds, authProfiles, operations, watchers, classifiers, viewTemplates, knowledge, organizations) audited against handler TypeBox schemas (Pi caught & we fixed three rounds of payload drift).src/sandbox/method-metadata.ts— data source forsearch. CI-asserted coverage.src/watchers/reaction-executor.ts— dropped QuickJS + flat env bridge. Reactions now run on the samerunScriptpath. esbuildexternallist aligned withrunScript(external: []) so save-time and runtime accept the same set of imports. Stored scripts mustexport default async (ctx, client, params?)and use theClientSDKsurface — see migration status below.src/types/isolated-vm.d.ts— minimal type shim for the optional native module so TS compiles on Node versions without a prebuild. Production Docker installs the real module viapython3 + build-essentialadded todocker/app/Dockerfile.isolated-vmisoptionalDependencies; QuickJS deps removed.manage_*handlers stay registered asinternal: truein PR-1; commit9dfd3727deleted the internal registrations entirely, so the only callers now are the SDK delegations fromexecuteand the in-process integration test harness. A subsequent commit (e513aeb9) propagatesscopesthroughToolContextand re-checks them at the SDK action level so write/admin gates still fire when handlers are reached from insideexecute.✅ Deploy prerequisites — already executed
The cold-drop in
9dfd3727matches the user direction "no back-compat; data will be migrated manually." Both prerequisites have now been run against thesummaries-dbPostgres:watchers.reaction_scripthave been rewritten from the legacyreact(ctx, sdk)+ReactionSDKsurface todefault async (ctx, client)+ClientSDK. Mapping:sdk.notify(title, body)→client.knowledge.save({entity_ids, content, semantic_type: 'notification', title})(lossy: data preserved as a knowledge event, no real delivery primitive — downstream can subscribe tosemantic_type='notification'if push delivery is wanted).sdk.content.save→client.knowledge.save.sdk.actions.execute→client.operations.execute({connection_id, operation_key, input, watcher_source})withwatcher_sourceprovenance restored.sdk.log→client.log. Bothreaction_script(TS) andreaction_script_compiled(esbuild CJS, same options as the newcompileReactionScript) updated atomically. Originals backed up to_reactions_backup_2026_04_25for one-statement rollback.agents.{soul_md,user_md,identity_md}againstmanage_*,list_watchers,read_knowledge,get_watcher,join_organization. Zero hits. Nothing to migrate.Pi pushed back on deferring the
packages/owletto-climigration: published CLI paths (owletto seedandowletto browser-auth) call removed legacymanage_*tool names and will fail with HTTP 400 /Tool not foundonce this PR deploys. Specifically:src/commands/seed.ts— 6 callsites via/api/{org}/{tool}REST proxy hittingmanage_entity_schema,manage_entity,manage_watchers.src/commands/browser-auth.ts— 4 callsites via MCPtools/callhittingmanage_connections,manage_auth_profiles.Pi recommends either rewriting these in this PR (or a stacked prerequisite PR), or keeping temporary internal compatibility registrations until the CLI PR lands. The latter conflicts with the explicit "no back-compat" direction; the former adds non-trivial scope. Three concrete paths:
feat/owletto-cli-migratePR off this branch that rewrites the CLI callsites to useapiCall('execute', { script: ... }). Land it before this PR, or together. Cleanest, no scope creep on this PR.npx owletto seedbetween merges will see a 400.Default if no objection: (a) — open the stacked PR before merging this one.
Other follow-up work (tracked, not blocking this PR)
packages/owletto-websubmodule. ~50apiCall('manage_*', …)REST calls insrc/lib/api/{entities,connections,content}.tsandsrc/hooks/use-watchers.tsneed to be rewritten to either direct REST endpoints orapiCall('execute', { script: … })form. Per the AGENTS.md two-PR rule, ship a submodule PR first, then a parent bump PR.packages/owletto-backend/src/__tests__/integration/**usemcpToolsCall('manage_*', …). They keep working today because the test harness reaches handlers in-process even though the tools are no longer intools/list. Migrate to either direct handler imports orexecuteScript({ script: … })form for clarity. Mechanical, not blocking.ReactionSDKauto-tracked{ watcher_id, window_id }on entity / content / operation writes viatrackWatcherReaction. The newClientSDK-backed reaction path drops the auto-attribution; the migration restored it for the soleclient.operations.executecallsite (watcher 204) by passingwatcher_sourceexplicitly, but other entity/knowledge writes from reactions no longer carry watcher provenance. Re-thread it through the SDK constructor (or via a thinWatcherClientSDKwrapper that proxies the namespace methods and tags writes) before the next watcher-attribution feature lands.packages/worker/src/__tests__/embedded-tools.test.ts— still usesmanage_connectionsas a fixture name in a tool-list mock. Test-only, doesn't ship; update when convenient..github/workflows/ci.ymldid not runpackages/owletto-backendtests at all until commita81ecd6b/004582aa. The integration suite for this PR (auth.test.ts, sandbox/*) now runs in CI; broader owletto-backend coverage is still a separate task.Note on @lobu/owletto-sdk type-export removal
This PR removes
ReactionSDKandReactionHandlerfrom@lobu/owletto-sdk's public type exports (the runtime SDK reactions actually call isClientSDK, defined in owletto-backend; the type aliases pointed at a runtime that no longer exists).@lobu/owletto-sdkis at v4.2.0, so this is a TypeScript-level breaking change for downstream importers. Per repo policy ("no@deprecatedtags — just delete the old thing"), this is intentional. Release-please should cut the next@lobu/owletto-sdkrelease as a major version bump; flag in the release commit / changelog.Validation
dc089f73(after pi-2 fixes).pgOnlyIt. Now wired into CI viabun test packages/owletto-backend/....SELECT COUNT(*) FROM watchers WHERE reaction_script ~ 'export\s+async\s+function\s+react\s*\(' OR reaction_script ~ 'sdk\.(notify|content|actions)';returns 0. Backup table_reactions_backup_2026_04_25holds 6 rows.client.queryadmin bypass + scope-check skip on org-agnostic tools + sandbox dispatcher hardening + stale assertions — addressed ine5614f45.e5614f45+5c1a40fd) found proxy-coercion-key gap (toJSON/toString/valueOf) + reaction/runScriptexternallist drift — addressed indc089f73. Pi also pushed back on the owletto-cli follow-up call (see "Open question" above) and flagged the type-export removal as a semver-major change for@lobu/owletto-sdk.Stacked at
feat/mcp-sandbox-scaffolding. Plan:docs/plans/mcp-multi-org-and-execute.md. Pre-merge skip-size-check label applied (this is one concern: cross-org execute/search via SDK; splitting would have left PR-1 as dead scaffolding).