feat(world-model): cross-org relationship_types + catalog discovery#377
Conversation
…n search Closes the two BLOCKER gaps pi flagged after #374: 1. **Schema search path for entity_relationship_types** (`tools/admin/manage_entity.ts::handleLink`). Mirrors what #374 did for entity_types: tenant first, then any `visibility='public'` org. Tenant-local relationship types still win. Without this, even though entities can use public-catalog vocabulary, relationships couldn't — e.g. a tenant relating their `\$member` to a canonical Apple Inc would have to register a local copy of `works_at`. 2. **Public-catalog discovery in `tools/search.ts`**. Adds an `include_public_catalogs` arg (defaults to true) so tenant agents can find canonical entities (HMRC, banks, currencies, …) by name/type without knowing entity ids upfront. Result rows already carry `organization_id`, so the agent can tell tenant-local from canonical hits. `fetchEntityById` widens the same way so an entity_id lookup following a search hit resolves cleanly. No DB migration. Tests: - `tools/__tests__/search-cross-org.test.ts` (3): public+tenant in one call; flag=false hides public; private orgs not snooped - `entity-relationships.test.ts`: tenant uses a `works-at-public` relationship_type defined in a public catalog org
|
Triage decision: Reasons:
Next: Address the P1 security issue raised in the review comment before proceeding with merge |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0d7407e843
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (includePublic) { | ||
| const orgParamIdx = addParam(organizationId); | ||
| conditions.push( | ||
| `(e.organization_id = $${orgParamIdx} OR EXISTS (SELECT 1 FROM organization o WHERE o.id = e.organization_id AND o.visibility = 'public'))` |
There was a problem hiding this comment.
Prevent cross-tenant config leak from public entity search
Expanding the search scope here to all visibility='public' entities means a public entity can become the primary hit, after which formatEntityResult calls fetchConnectionsForEntity (default include_connections=true) and returns c.config without scoping by caller org. In any environment where multiple tenants attach feeds to the same canonical public entity, this allows one tenant’s search call to read other tenants’ connection metadata/config for that entity. Please restrict connection hydration to ctx.organizationId (or skip it for non-caller-org entities) when public results are enabled.
Useful? React with 👍 / 👎.
Pi flagged two BLOCKERS in the previous round: 1. **Connection metadata leak.** `formatEntityResult` calls `fetchConnectionsForEntity(primaryEntity.id)` with no caller-org scope. For a public-catalog entity referenced by multiple tenants, any tenant searching that entity would receive other tenants' connection display names, configs, and feed entity names. Now skipped when the primary entity is in a different org from the caller — connections are tenant operational data, never canonical. 2. **Cross-tenant stat side channel.** Count subqueries in the SELECT (content_count, connection_count, watcher_count, children_count) computed globally for the entity id; for public-catalog entities referenced from many tenants, this leaks aggregate activity volumes. Now gated `CASE WHEN e.organization_id = $callerOrg THEN ... ELSE 0 END` for each count, so cross-org rows return zeros for operational stats. Children query also scoped to primary's own org. Also addressing IMPORTANT #3: tenant-local results were getting pushed out by high-scoring public matches. ORDER BY now `(e.organization_id = $caller) DESC, match_score DESC` so caller-org wins ties.
…rimaries Pi follow-up: maintains the 'operational counts are zero for cross-org' invariant consistently — children of a public-catalog primary now show content_count=0 to match the primary's own zeroed stats.
Captures the remaining gaps as a handoff for the next session. Each item states the simplest path that reuses existing surfaces — no new pages, no new MCP tools, no new components. Type pickers grow a badge column; read sites widen with the same pattern as #377; visibility flips guarded by a single existence check rather than a retroactive validation pass. Ordered by what unblocks usage: 1. Schema CRUD cross-org awareness (backend prerequisite) 2. Frontend type pickers show cross-org with read-only badge 3. Discovery via existing org dropdown (split into two groups) 4. Read-side cross-org tolerance (manage_entity.get, resolve_path) 5. Visibility-flip safety (refuse if referenced) 6. Catalog curation (data prune, no PR needed) 7. is_catalog flag (deferred — no user path mutates visibility today) 8. entity_relationship_type_rules slug → id (deferred — no conflict yet) 9. Contribution flow (premature — no content to contribute yet) 10. TOCTOU (deferred — semantic-not-corrupting microsecond window)
* feat(world-model): cross-org schema CRUD + read-side tolerance Closes the tenant-facing surface that consumes the agent-side cross-org plumbing landed in #374/#377. Items #1, #4 from docs/plans/world-model.md "Outstanding work"; #3, #5 collapse to doc-only. - manage_entity_schema list/get widen to (caller_org OR visibility=public) with tenant-first ORDER BY; rows now carry organization_slug. Same pattern used in entity-management.ts:249-260 resolver. - resolve_path widens both intermediate and leaf entity lookups so a tenant path can traverse into a public-catalog entity referenced via a cross-org relationship. - getEntity widens the read; comment already promised "own org or public". - Re-key entity_count helpers from slug to entity_type_id so cross-org slug collisions don't merge counts across rows. - Item #3 noted as already shipped (organization-dropdown.tsx already splits Your Organizations / Public Organizations with a separator). - Item #5 deferred — no exposed updateOrganization mutation today; the guard SQL is preserved inline for the future implementer. * docs(world-model): item #6 first-pass changelog Pruned classification-test-brand (id=45) from market-intelligence. Held back the $member rows (real membership, not cruft) and the template-seed verticals (need user call before pruning whole orgs). * fix(world-model): gate operational counts + $member ACL after cross-org widening Pi review of #386 flagged three real regressions introduced by the cross-org read widening. Fixes: - getEntity: scope total_content / active_connections / watchers_count / children_count by caller org. When `e` is a public-catalog row, totals now reflect the caller's references to it, never aggregate cross-tenant activity. - resolve_path leaf: same scoping for total_content (events) and watchers_count. - Exclude $member from public-catalog fallback in getEntity, resolve_path intermediate, and resolve_path leaf. Member-redaction uses ctx.memberRole (caller's workspace role), so a tenant admin/owner could otherwise read a public catalog's $member email by virtue of being admin of their own org. $member rows are per-tenant by design. - rtHandleList relationship_count: scope by caller's organization so public relationship-type rows don't expose global usage volume. Pre-existing concerns flagged in review but out of scope for this PR (documented for follow-up): resolve_path bootstrap entity-type counts (unscoped + missing deleted_at), schema get's slug ambiguity across multiple public catalogs, requireRelationshipType denying list_rules on public RTs.
Closes the two BLOCKER gaps pi flagged in the post-#374 review.
Why
After #374, entities could resolve types from public catalogs but two paths were still org-local:
This PR fixes both with the same pattern #374 introduced for entity_types.
What changes
New test coverage:
Out of scope (still deferred)
Test plan