docs(plans): multi-org execute + search MCP tool plan by buremba · Pull Request #346 · lobu-ai/lobu

buremba · 2026-04-24T16:00:52Z

Summary

Addendum to docs/mcp-search-execute-design-doc.md covering cross-org addressing inside execute, frontend surface, edge cases, and a 5-PR land order.
Locks in: TypeScript + isolated-vm (not bash), client.org() accessor, execute as write-tier (not admin-only), scoped endpoints expose list_organizations + switch_organization, new /[owner]/tools/execute console + org settings page, Monaco on the reaction editor, new execute_invocation audit table.
Language decision was vetted by two independent second opinions (codex + pi), both picked TypeScript with the same two pivots: reactions are the real workload; agent fluency is an affordance problem.

Test plan

Read the plan end-to-end for scope/sequencing issues
Flag any edge case that should be BLOCKER instead of GUARD (or vice versa)
Confirm the 5-PR split matches your preferred land order
Sign off on the execute = write access-level change before PR-2 lands

No code changes; doc-only.

Addendum to docs/mcp-search-execute-design-doc.md covering (1) the client.org(slugOrId) cross-org accessor, (2) execute access-level flip from admin to write with per-call checks as the real boundary, (3) scoped-endpoint exposure of list/switch_organization (drop join_organization), (4) authoring affordances for LLM fluency, (5) a new /[owner]/tools/execute console + /[owner]/settings/organizations page + Monaco upgrade for the reaction editor, and (6) an execute_invocation audit table. Five-PR land order with validation.

chatgpt-codex-connector · 2026-04-24T16:00:58Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

…folding Plan content was originally on docs/mcp-multi-org-plan (PR #346) but that PR was overhead for a solo land — folding the doc into the same PR as its first implementation step. PRs 2–5 will reference this file.

buremba · 2026-04-24T21:56:25Z

Folded into PR #348 — separate plan PR was overhead for a solo land. Plan doc now lives at docs/plans/mcp-multi-org-and-execute.md on the implementation branch.

* feat(owletto-backend): ClientSDK + sandbox scaffolding for execute/search Lands PR-1 of docs/plans/mcp-multi-org-and-execute.md: the in-process SDK surface, per-namespace delegations, membership cache, and isolated-vm runner skeleton. No new MCP tools yet — execute and search land in PR-2. - `src/sandbox/client-sdk.ts` — buildClientSDK + .org() accessor with per-call membership re-validation. AccessDenied/OrgNotFound errors surface the exact org and reason. - `src/sandbox/membership-cache.ts` — small LRU (cap 128, TTL 30s) keyed on (userId, slug-or-id). Catches revocations within the TTL while keeping cross-org walks cheap. - `src/sandbox/namespaces/*.ts` — 11 namespaces (entities, entitySchema, connections, feeds, authProfiles, operations, watchers, classifiers, viewTemplates, knowledge, organizations). Thin delegations over existing admin handlers; per-call authz runs inside the handlers. - `src/sandbox/method-metadata.ts` — summary/access/example keyed by SDK path. Seeded for PR-1; PR-2 adds CI coverage check. - `src/sandbox/typebox-to-signature.ts` — TypeBox → inline TS signature formatter used by the `search` tool. - `src/sandbox/run-script.ts` — isolated-vm loader + limits shape. Native module is an optionalDependency so bun install succeeds on platforms without a prebuild; Dockerfile adds python3 + build-essential so production builds work. Tests: 25 unit tests (bun:test) cover membership cache semantics, metadata shape, typebox formatter, and run-script loader fallback. Integration test (vitest) for .org() exercises real org/member tables with AccessDenied/OrgNotFound/public-org/LRU-shortcircuit/revocation paths. * fix(owletto-backend): audit ClientSDK against handler schemas + consolidate cache Addresses PR-1 review from Pi and my own gap analysis. - Namespace payload alignment against real handler TypeBox schemas: * watchers.delete → watcher_ids[]; setReactionScript → reaction_script; completeWindow → window_token. watcher_id now string throughout. * entitySchema.* uses plain `slug`; addRule takes source/target_entity_type_slug + relationship_type_slug. * operations.execute.watcher_source.window_id is a number. * authProfiles keyed by slug, not numeric id; connections.connect takes auth_profile_slug; feeds.create requires feed_key. * viewTemplates.resource_id accepts string | number; classifiers uses classifier_id for CRUD, classifier_slug for classify. - Slug-vs-id resolution now tries slug first, falls back to id on miss. Drops the ID_RE regex that false-positived 20+ char slugs. - Consolidated on MultiTenantProvider's memberRoleCache (explicit invalidation on member-table writes). Separate sandbox LRU removed. Exposes getCachedMembershipRole / getCachedOrgBySlug / getOrgById. - client.query() drops the dead `params` arg — validateAndScopeQuery rejects user positional parameters and the old wrapper's merge was unsafe. Metadata example updated. - AccessDeniedError + OrgNotFoundError share a SdkError base. - method-metadata.ts covers every namespace method; the unit test asserts coverage so drift shows up in CI. - New integration test that dispatches real read-path methods through every namespace (catches the class of bug that static casts hid). - vitest.config.ts + migrations-dir walk-up so `OWLETTO_TEST_BACKEND=pglite` actually runs the integration suite. PGlite-incompatible handler queries (sql template-fragment composition, prepared-statement mismatches) run under real Postgres via a small pgOnlyIt helper. * fix(owletto-backend): remaining namespace contract drift + serial integration runs Second Pi review caught four more wrapper lies: - authProfiles.create/update now mirror the handler: `profile_kind` (not `auth_type`), `display_name` required on create, `credentials` + `auth_data` instead of a generic `config`. - viewTemplates.set exposes a single `json_template` (which may nest `data_sources`) instead of two separate fields; rollback takes `version` (a number) not `version_id`. - connections.updateConnectorAuth uses `auth_values` (the handler contract), not `auth_config`. - connections.installConnector drops the required `connector_key` — handler picks from `source_url` / `source_uri` / `source_code` / `mcp_url` and accepts optional `auth_values`. - Added app_auth_profile_slug + config to connections.{create, connect, update} for parity with the handler. vitest.config.ts: force singleFork pool so the two integration files can't stomp each other's fixtures via cleanupTestDatabase when run in parallel. * fix(owletto-backend): authProfiles.create uses `slug` (not `auth_profile_slug`) Third-pass Pi review caught the asymmetry: the handler's create_auth_profile action reads `args.slug` to set the new profile's stable identifier, while get/test/update/delete read `auth_profile_slug` to identify an existing profile. Wrapper now mirrors the handler verbatim: - create: takes optional `slug` + `display_name` + `profile_kind` + `credentials`/`auth_data`/`requested_scopes`. - update: takes `auth_profile_slug` (required, identifies the row) + all the mutable fields, including optional new `slug` for renames. - get/test/delete: positional `auth_profile_slug`. - list: adds connector_key/provider/profile_kind filters for parity. Without this fix, `client.authProfiles.create({ slug: 'primary' })` would silently auto-derive the slug from display_name, and follow-up get('primary') would miss. Typed contract now matches runtime. * chore(owletto-backend): drop unused env arg from organizations namespace Pi final-pass nit. Per the repo's convention (AGENTS.md §"When fixing unused-parameter errors, delete the parameter rather than prefixing with `_`"), drop `_env` from buildOrganizationsNamespace. Call site in client-sdk.ts adjusted. * docs(plans): land multi-org execute + search plan alongside PR-1 scaffolding Plan content was originally on docs/mcp-multi-org-plan (PR #346) but that PR was overhead for a solo land — folding the doc into the same PR as its first implementation step. PRs 2–5 will reference this file. * chore(owletto-backend): remove dead getCurrentOrgInfo helper No callers anywhere — was added speculatively for the search tool's preamble and the web console's header chip (PR-2/PR-5 territory). Drop it now; it'll come back when there's an actual call site to motivate its return shape. Also: tried collapsing the per-namespace XxxNamespace interfaces into ReturnType<typeof buildXxx> in client-sdk.ts to save ~250 lines, but TypeScript strict mode requires those interfaces to type the inline arrow params inside each builder. The interfaces are load-bearing, not ceremonial. Keeping them. * fix(deps): regenerate bun.lock with packages/owletto-web submodule initialized CI's `bun install --frozen-lockfile` step failed because the lockfile was committed without the submodule deps. Re-runs with the submodule checked out produce 543 added lock entries (mostly owletto-web's React/Tailwind tree). No source changes. * fix(owletto-backend): escape backslashes in typebox-to-signature literal output CodeQL high-severity flag on src/sandbox/typebox-to-signature.ts:104: formatLiteral escaped single quotes but not backslashes. A string like \`path\\to\\file\` would emit \`'path\\to\\file'\`, which is an invalid TS string literal (\\t is a tab, \\f is a form feed). A trailing backslash produced an unterminated literal \`'\\'\`. Fix order: escape backslashes first (so the \\\\ they become don't collide with the \\' produced by the next pass), then escape single quotes. Two regression tests cover backslash + quote paths. * feat(owletto-backend): wire execute + search MCP tools, swap reactions to ClientSDK PR-2 of docs/plans/mcp-multi-org-and-execute.md. Folded into the same PR as PR-1 (the SDK scaffolding) since neither delivers user-visible functionality alone. External MCP surface — what changes for clients ----------------------------------------------- - `execute(script)`: run TypeScript over the typed `ClientSDK`. The `client` global exposes namespaces (entities, watchers, knowledge, organizations, …) plus `client.org(slug)` for cross-org calls, `client.query(sql)` for read-only SQL, `client.log(...)`. Per-method authz fires inside the SDK; entry-point gate is write-tier. - `search(query)`: discover SDK methods. Namespace listing for "watchers", drill-down with signature/throws/example for "watchers.create". - `list_organizations` + `switch_organization`: now exposed on both unscoped /mcp and scoped /mcp/{slug} endpoints. - Removed from external surface: the 14 `manage_*` tools, `list_watchers`, `get_watcher`, `read_knowledge`, `join_organization`. Their handlers are kept and re-registered as `internal: true` so the in-process test harness can still dispatch them by name without forcing every test to migrate to script-based calls. Production agents go through the SDK. Implementation -------------- - `src/sandbox/run-script.ts`: replaces the PR-1 stub with the real `isolated-vm` bridge. Compiles TS via the existing esbuild pipeline, builds a single async dispatch reference, reconstructs `client` guest- side via Proxy, captures structured logs + return value, enforces memory/CPU/quota/output limits, maps errors to TimeoutError / QuotaExceeded / OutOfMemory / ScriptError / CompileError. - `src/sandbox/method-metadata.ts` is the data source for `search`. - `src/types/isolated-vm.d.ts`: minimal type shim so TS compiles when the optional native module isn't installed (Node 25 etc.). Production Docker installs the real module. - `src/watchers/reaction-executor.ts`: dropped QuickJS + the flat ReactionSDK env-bridge. Reactions now go through `buildClientSDK` + `runScript`. Stored reaction scripts that export `react(ctx, sdk)` keep working — the runner accepts both `default` and `react` exports. - Dropped deps: `@sebastianwessel/quickjs`, `@jitl/quickjs-ng-wasmfile-release-asyncify`. - `src/tools/execute.ts`: dropped scopedToOrg gating on org tools, dropped the `join_organization` carve-out. - `src/auth/tool-access.ts`: kept the per-action policy table — it applies to internal-tool dispatches the same way it did to external. Added `execute` (write), `search` (public-readable). Test status ----------- - Unit (bun:test): 27/27 sandbox + 32/32 tool-access pass. - Integration: tests that go through the test harness against legacy `manage_*` names keep working via the `internal: true` registration. Other long-standing integration failures (~190) are pre-existing pglite/worktree environment issues, not from this PR. CI will run against real Postgres. - Removed: `__tests__/integration/mcp/public-org-join.test.ts` — exclusively tested the dropped `join_organization` MCP tool. * fix(owletto-backend): drop legacy MCP registrations + reaction `react` fallback User direction: the new MCP surface should be consistent everywhere; no back-compat needed since stored data (reaction scripts, agent prompts) will be migrated manually in production. - Removed `internal: true` re-registrations of the 14 manage_* tools + list_watchers / get_watcher / read_knowledge. Deleted `src/tools/admin/index.ts` (the ADMIN_TOOLS aggregator) entirely. - Removed the `module.exports.react` fallback in run-script.ts. Stored reaction scripts must export `default async (ctx, client, params?)`; legacy ones need a one-time DB migration. - Updated workspace-instructions to point agents at `execute` / `client.<ns>.<method>` patterns instead of legacy tool names. - Updated openclaw plugin's "already authenticated" hint text. - auth.test.ts: removed the "hides org switching on scoped routes" assertion (org tools are now exposed everywhere), updated tool-list assertion to expect the new clean surface. - tool-access.test.ts: replaced one assertion that referenced manage_connections (now non-existent) with query_sql which is the canonical admin-only tool on the new surface. Frontend (`packages/owletto-web`) ships ~50+ `apiCall('manage_*', ...)` calls through REST — those break with this change. Tracked as a separate submodule PR to migrate them to `apiCall('execute', { script: ... })`. Integration tests that still use `mcpToolsCall('manage_*', ...)` will fail with `Tool not found` — those require per-test migration to either direct handler imports or `execute`-script form, which is out of scope for this PR. The single-test-file migration pattern is intentional so each test owner can review the new SDK call shape. * fix(owletto-backend): address self + Pi review findings on PR #348 Self-review and Pi senior-engineer pass surfaced a privilege-escalation gap and a handful of stale-after-cold-drop issues. Fixes here only — the deploy-time prerequisites (reaction-script DB migration, owletto-web caller migration) remain tracked as follow-ups in the PR description. Security - `client.query()` now enforces owner/admin role to match the `query_sql` MCP tool. Without this, write-tier users could bypass the admin gate by calling `execute({script: "client.query(...)"})` and read audit/event tables exposed by the validated allowlist. - `list_organizations` / `switch_organization` now require the `mcp:read` scope; previously the org-agnostic branch returned before the scope check, letting profile-only OAuth tokens call them. - `execute` MCP tool annotated `destructiveHint: true` so host clients prompt for approval — scripts can delete entities, drop watchers, and trigger external operations. Sandbox hardening (run-script.ts) - Guest-side proxy short-circuits `then`, `catch`, `finally`, Symbol keys, `constructor`, `__proto__`, `inspect` — accidental `await client.foo` no longer turns into a host SDK call against `foo.then`. - Host dispatcher now requires the namespace and method to be own enumerable keys of the resolved SDK target. Belt-and-braces with the guest-side filter; closes the avenue where `client.constructor.assign` would resolve to `Object.assign` on the host. - Script return value (`returnJson`) is now charged to `outputBytes`, so scripts can't bypass the 256KB output cap by returning a large object. Stale references after the cold-drop - reaction-executor.ts header no longer claims it accepts the legacy `react` export — that fallback was removed in 9dfd372. - tools/search.ts no-match suggestion text and the `search_knowledge` module header now point at `execute` + `client.<ns>.<method>` patterns instead of the removed `manage_*` MCP names. - auth.test.ts assertions for public-org tools/list dropped the now-stale expectations on `manage_entity` being externally listed. - run-script.test.ts no longer asserts the runner is a stub; the bridge is real now. Test gracefully skips on systems without `isolated-vm` installed. * chore(docs): align stragglers with the post-cold-drop MCP surface Sweep prompted by "are all docs everywhere consistent?" — turns out no: several docs and a couple of dead type exports were still describing the legacy `manage_*` / `ReactionSDK` world. Surgical fixes only; the owletto-cli runtime callers (separate scope) are added to the PR's follow-up section, not patched here. - packages/owletto-sdk: drop the dead `ReactionSDK` and `ReactionHandler` type exports — only `ReactionContext` (still imported by the new reaction-executor) and `ReactionEntity` remain. The runtime SDK that reactions actually call is `ClientSDK`, defined in owletto-backend. - packages/owletto-backend/src/auth/tool-access.ts: clarify the comment next to the `manage_entity` policy entry. The handlers are no longer registered as `internal: true` MCP tools — they're unregistered, but the policy table still drives `routeAction`'s per-action checks when SDK wrappers reach them from inside `execute`. - docs/plans/mcp-multi-org-and-execute.md: the "Existing reaction scripts in DB" risk bullet claimed `ClientSDK` keeps the same method paths and entry-point shape as `ReactionSDK`. That's wrong (`content. save` → `knowledge.save`, `actions.execute` → `operations.execute`, no `notify`, new `default async (ctx, client, params?)` shape). Replace with the actual mapping plus the audit query that confirms the migration cleared the legacy patterns. - packages/landing/.../owletto-cli.md: the `owletto run` examples and "available tools" list still pointed at `read_knowledge`, `list_ watchers`, `get_watcher`, and the 14 `manage_*` names. Replace with the current external surface: `search_knowledge`, `save_knowledge`, `search`, `execute`, `query_sql`, `list_organizations`, `switch_ organization`, plus an `execute` script example. - packages/owletto-connectors/src/README.md: manual connector install instruction pointed at `manage_connections` — now points at `client.connections.installConnector(...)` from inside `execute` (or the equivalent admin REST endpoint). * fix(owletto-backend): enforce SDK action access * fix(ci): run owletto auth test with bun * fix(ci): build owletto sdk before backend tests * fix(owletto-backend): apply pi second-pass review findings Pi's second-pass review caught two real things to address before merge: 1. Sandbox guest proxy reserved-key list missed coercion probes. `JSON.stringify(client.entities)` and `console.log(client.entities)` would call `toJSON` / `toString` / `valueOf` on the proxy, get back a dispatcher closure, and end up firing a host SDK call against `entities.toJSON` (etc) — wasting quota and surfacing a confusing `Unknown SDK method` error. Add those three keys to `__isReservedKey()` alongside the existing `then`/Symbol/proto short-circuits in run-script.ts. 2. `compileReactionScript` and `runScript`'s internal recompile used different esbuild `external` lists. Reactions externalized `@owletto/reactions`, which the runtime recompile (with `external: []`) would then fail to resolve — save-time would accept code the runtime could not run. Align reactions to `external: []` so save-time and runtime accept the same set of imports. None of the migrated reactions in `_reactions_backup_2026_04_25` import `@owletto/reactions`, so this is a no-op for current production data. * fix(owletto-backend): preserve internal admin REST tools * fix(owletto-backend): keep manage_connections + manage_auth_profiles on MCP for CLI The owletto-cli browser-auth flow drives connection setup and credential exchange through MCP RPC (`packages/owletto-cli/src/commands/browser-auth.ts`). After flipping all admin tools to `internal: true` those calls fail with "Tool not found" because internal tools are filtered out of the MCP path (see tools/execute.ts:85). Re-expose just the two tools the CLI actively needs. The other admin tools used by the CLI (`seed.ts`) hit the REST endpoint `/api/{slug}/{toolName}` which bypasses the internal-flag filter, so those don't need this treatment. Once a CLI migration PR ports browser-auth to execute + client.* the internal flag can be re-applied.

buremba mentioned this pull request Apr 24, 2026

feat(owletto-backend): multi-org execute + search MCP tools #348

Merged

buremba closed this Apr 24, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs(plans): multi-org execute + search MCP tool plan#346

docs(plans): multi-org execute + search MCP tool plan#346
buremba wants to merge 1 commit into
mainfrom
docs/mcp-multi-org-plan

buremba commented Apr 24, 2026

Uh oh!

chatgpt-codex-connector Bot commented Apr 24, 2026

Uh oh!

buremba commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

buremba commented Apr 24, 2026

Summary

Test plan

Uh oh!

chatgpt-codex-connector Bot commented Apr 24, 2026

Uh oh!

buremba commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant