Skip to content

feat(world-model): cross-org relationship_types + catalog discovery#377

Merged
buremba merged 3 commits into
mainfrom
feat/cross-org-followup
Apr 26, 2026
Merged

feat(world-model): cross-org relationship_types + catalog discovery#377
buremba merged 3 commits into
mainfrom
feat/cross-org-followup

Conversation

@buremba
Copy link
Copy Markdown
Member

@buremba buremba commented Apr 26, 2026

Closes the two BLOCKER gaps pi flagged in the post-#374 review.

Why

After #374, entities could resolve types from public catalogs but two paths were still org-local:

  1. Relationship types — when a tenant agent calls `manage_entity action='link'` to link to a canonical entity, the relationship_type slug was looked up in the caller's org only. So linking tenant → HMRC required `works_at` to exist locally. Defeats the public-catalog story for relationships.
  2. Discovery — `search` filtered to caller's org, so an agent had no way to find HMRC by name. They needed the entity_id upfront.

This PR fixes both with the same pattern #374 introduced for entity_types.

What changes

`manage_entity.ts::handleLink` relationship_type lookup Widens to tenant + visibility='public' orgs, ORDER BY tenant-first
`search.ts::queryEntities` org filter When `include_public_catalogs` is true (default), include public-catalog entities. Result rows already carry `organization_id`
`search.ts::fetchEntityById` Mirrors the relax — id lookup resolves canonical entities the agent discovered via search

New test coverage:

  • `tools/tests/search-cross-org.test.ts` — 3 cases (public+tenant in one call, flag=false hides public, private orgs not snooped)
  • `entity-relationships.test.ts` — tenant uses a `works-at-public` relationship_type defined in a public-catalog org

Out of scope (still deferred)

Test plan

  • `bunx tsc --noEmit` clean
  • `make build-packages` clean
  • `bun run check` clean
  • CI tests

…n search

Closes the two BLOCKER gaps pi flagged after #374:

1. **Schema search path for entity_relationship_types** (`tools/admin/manage_entity.ts::handleLink`).
   Mirrors what #374 did for entity_types: tenant first, then any
   `visibility='public'` org. Tenant-local relationship types still win.
   Without this, even though entities can use public-catalog vocabulary,
   relationships couldn't — e.g. a tenant relating their `\$member` to a
   canonical Apple Inc would have to register a local copy of `works_at`.

2. **Public-catalog discovery in `tools/search.ts`**. Adds an
   `include_public_catalogs` arg (defaults to true) so tenant agents can
   find canonical entities (HMRC, banks, currencies, …) by name/type
   without knowing entity ids upfront. Result rows already carry
   `organization_id`, so the agent can tell tenant-local from canonical
   hits. `fetchEntityById` widens the same way so an entity_id lookup
   following a search hit resolves cleanly.

No DB migration. Tests:
- `tools/__tests__/search-cross-org.test.ts` (3): public+tenant in one
  call; flag=false hides public; private orgs not snooped
- `entity-relationships.test.ts`: tenant uses a `works-at-public`
  relationship_type defined in a public catalog org
@github-actions github-actions Bot added the triage:needs-human Triage agent escalated for human review label Apr 26, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 26, 2026

Triage decision: needs-human

Reasons:

  • P1 security issue flagged in review comment about cross-tenant config leak
  • Review comment contains escalation keyword "P1"
  • Security concern about potential data leak between tenants requires human attention

Next: Address the P1 security issue raised in the review comment before proceeding with merge

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0d7407e843

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

if (includePublic) {
const orgParamIdx = addParam(organizationId);
conditions.push(
`(e.organization_id = $${orgParamIdx} OR EXISTS (SELECT 1 FROM organization o WHERE o.id = e.organization_id AND o.visibility = 'public'))`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Prevent cross-tenant config leak from public entity search

Expanding the search scope here to all visibility='public' entities means a public entity can become the primary hit, after which formatEntityResult calls fetchConnectionsForEntity (default include_connections=true) and returns c.config without scoping by caller org. In any environment where multiple tenants attach feeds to the same canonical public entity, this allows one tenant’s search call to read other tenants’ connection metadata/config for that entity. Please restrict connection hydration to ctx.organizationId (or skip it for non-caller-org entities) when public results are enabled.

Useful? React with 👍 / 👎.

buremba added 2 commits April 26, 2026 23:50
Pi flagged two BLOCKERS in the previous round:

1. **Connection metadata leak.** `formatEntityResult` calls
   `fetchConnectionsForEntity(primaryEntity.id)` with no caller-org
   scope. For a public-catalog entity referenced by multiple tenants,
   any tenant searching that entity would receive other tenants'
   connection display names, configs, and feed entity names. Now
   skipped when the primary entity is in a different org from the
   caller — connections are tenant operational data, never canonical.

2. **Cross-tenant stat side channel.** Count subqueries in the SELECT
   (content_count, connection_count, watcher_count, children_count)
   computed globally for the entity id; for public-catalog entities
   referenced from many tenants, this leaks aggregate activity volumes.
   Now gated `CASE WHEN e.organization_id = $callerOrg THEN ... ELSE 0
   END` for each count, so cross-org rows return zeros for operational
   stats. Children query also scoped to primary's own org.

Also addressing IMPORTANT #3: tenant-local results were getting pushed
out by high-scoring public matches. ORDER BY now `(e.organization_id =
$caller) DESC, match_score DESC` so caller-org wins ties.
…rimaries

Pi follow-up: maintains the 'operational counts are zero for cross-org'
invariant consistently — children of a public-catalog primary now show
content_count=0 to match the primary's own zeroed stats.
@buremba buremba added the skip-size-check Bypass PR size gate for intentionally large single-concern changes label Apr 26, 2026
@buremba buremba merged commit bfb7dfb into main Apr 26, 2026
13 checks passed
@buremba buremba deleted the feat/cross-org-followup branch April 26, 2026 22:53
buremba added a commit that referenced this pull request Apr 26, 2026
Captures the remaining gaps as a handoff for the next session. Each item
states the simplest path that reuses existing surfaces — no new pages,
no new MCP tools, no new components. Type pickers grow a badge column;
read sites widen with the same pattern as #377; visibility flips guarded
by a single existence check rather than a retroactive validation pass.

Ordered by what unblocks usage:
1. Schema CRUD cross-org awareness (backend prerequisite)
2. Frontend type pickers show cross-org with read-only badge
3. Discovery via existing org dropdown (split into two groups)
4. Read-side cross-org tolerance (manage_entity.get, resolve_path)
5. Visibility-flip safety (refuse if referenced)
6. Catalog curation (data prune, no PR needed)
7. is_catalog flag (deferred — no user path mutates visibility today)
8. entity_relationship_type_rules slug → id (deferred — no conflict yet)
9. Contribution flow (premature — no content to contribute yet)
10. TOCTOU (deferred — semantic-not-corrupting microsecond window)
buremba added a commit that referenced this pull request Apr 27, 2026
* feat(world-model): cross-org schema CRUD + read-side tolerance

Closes the tenant-facing surface that consumes the agent-side cross-org
plumbing landed in #374/#377. Items #1, #4 from
docs/plans/world-model.md "Outstanding work"; #3, #5 collapse to doc-only.

- manage_entity_schema list/get widen to (caller_org OR visibility=public)
  with tenant-first ORDER BY; rows now carry organization_slug. Same
  pattern used in entity-management.ts:249-260 resolver.
- resolve_path widens both intermediate and leaf entity lookups so a
  tenant path can traverse into a public-catalog entity referenced via
  a cross-org relationship.
- getEntity widens the read; comment already promised "own org or public".
- Re-key entity_count helpers from slug to entity_type_id so cross-org
  slug collisions don't merge counts across rows.
- Item #3 noted as already shipped (organization-dropdown.tsx already
  splits Your Organizations / Public Organizations with a separator).
- Item #5 deferred — no exposed updateOrganization mutation today; the
  guard SQL is preserved inline for the future implementer.

* docs(world-model): item #6 first-pass changelog

Pruned classification-test-brand (id=45) from market-intelligence.
Held back the $member rows (real membership, not cruft) and the
template-seed verticals (need user call before pruning whole orgs).

* fix(world-model): gate operational counts + $member ACL after cross-org widening

Pi review of #386 flagged three real regressions introduced by the
cross-org read widening. Fixes:

- getEntity: scope total_content / active_connections / watchers_count /
  children_count by caller org. When `e` is a public-catalog row, totals
  now reflect the caller's references to it, never aggregate cross-tenant
  activity.
- resolve_path leaf: same scoping for total_content (events) and
  watchers_count.
- Exclude $member from public-catalog fallback in getEntity, resolve_path
  intermediate, and resolve_path leaf. Member-redaction uses
  ctx.memberRole (caller's workspace role), so a tenant admin/owner could
  otherwise read a public catalog's $member email by virtue of being
  admin of their own org. $member rows are per-tenant by design.
- rtHandleList relationship_count: scope by caller's organization so
  public relationship-type rows don't expose global usage volume.

Pre-existing concerns flagged in review but out of scope for this PR
(documented for follow-up): resolve_path bootstrap entity-type counts
(unscoped + missing deleted_at), schema get's slug ambiguity across
multiple public catalogs, requireRelationshipType denying list_rules on
public RTs.
@buremba buremba restored the feat/cross-org-followup branch May 12, 2026 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip-size-check Bypass PR size gate for intentionally large single-concern changes triage:needs-human Triage agent escalated for human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant