feat(db)!: convert entities.entity_type to entity_type_id FK#370
Conversation
Replaces the text-slug column on entities with an integer FK on entity_types(id). Two motivations folded into one change: 1. Integrity. The previous text-slug pattern silently orphaned entities when their entity_type was renamed or hard-deleted; only the create- path validator enforced same-org existence, and updates/deletes bypassed it. The FK now does that work at the DB layer: renames are safe (slug becomes display-only via JOIN), Postgres refuses to drop a referenced type. 2. Cross-org vocabulary. entity_types.id is globally unique, so an entity in tenant org A can carry a type defined in public-catalog org B by FK alone — without an extra org_id column on entities. Sets up the world-model plan in docs/plans/world-model.md for slim install and public-catalog references in subsequent PRs. Migration: db/migrations/20260426120000_entities_entity_type_fk.sql. Single-prod-DB: adds nullable column, backfills from existing slug lookup, fails loudly on orphaned rows, sets NOT NULL, drops the text column. RUN MANUALLY before deploying. All read sites switched to "JOIN entity_types et ON et.id = e.entity_type_id" and select "et.slug AS entity_type". INSERT sites resolve slug → id first. Test fixtures updated to the same pattern. The cross-org write guard for entity_relationships is deferred to the PR that introduces cross-org INSERTs (slim install + agent_template entity); adding it now would be dead code.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f488ff547b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| FROM public.entity_types et | ||
| WHERE et.slug = e.entity_type | ||
| AND et.organization_id = e.organization_id |
There was a problem hiding this comment.
Disambiguate entity-type backfill before setting FK IDs
This backfill join is non-deterministic when an org has multiple entity_types rows with the same slug (which is possible because uniqueness is only enforced for deleted_at IS NULL). In that case, one entities row can match multiple et rows and PostgreSQL may assign an arbitrary et.id, so entities can be rebound to a stale/deleted type after migration. Make the mapping deterministic (for example, prefer the active row, or otherwise fail when multiple matches exist) before writing entity_type_id.
Useful? React with 👍 / 👎.
| LEFT JOIN entities e | ||
| ON e.organization_id = et.organization_id | ||
| AND e.entity_type = et.slug | ||
| ON e.entity_type_id = et.id | ||
| WHERE et.deleted_at IS NULL |
There was a problem hiding this comment.
Scope entity-type counts to the current organization
This join now counts every entity that references the type ID, regardless of which organization owns the entity. Once cross-org type references exist, entity_count in resolve_path bootstrap will include other orgs’ entities and leak/overstate counts for the current workspace. The previous logic constrained counts to same-org entities; add an e.organization_id = ${organizationId} condition here to preserve org-local semantics.
Useful? React with 👍 / 👎.
…c backfill Pi review caught three SELECTs that referenced the (now-dropped) entity_type text column and would crash post-migration: - tools/admin/manage_entity.ts:545 — handleUpdate fetched before-state - tools/search.ts:405 — fetchTopEntitiesByType - utils/workspace-instructions.ts:25 — entity-type counts Each rewritten to JOIN entity_types and select et.slug AS entity_type. Backfill in the migration was non-deterministic: the (slug, organization_id) pair has multiple matches when an active entity_types row coexists with a soft-deleted predecessor, since the UNIQUE index only covers `deleted_at IS NULL`. Switched to a correlated subquery ordered by `(deleted_at IS NULL) DESC, id DESC` so live rows always win, soft-deleted rows are the fallback for history preservation.
|
Triage decision: Reasons:
Next: Human review required for P1 database migration issues and manual deployment coordination. Review inline comments for specific migration concerns. |
DROP COLUMN entity_type would silently cascade-drop entities_organization_id_entity_type_slug_parent_id_key. Make the drop explicit, and document why we don't recreate it: the stronger entities_slug_parent_unique (UNIQUE on org_id, COALESCE(parent_id, 0), slug) already enforces slug uniqueness within (org, parent) regardless of entity type, with NULL-parent collapsing — so the entity-type-keyed constraint never caught anything the index didn't already catch. Down migration recreates the constraint and the column comment for rollback fidelity. Caught by pi review v2.
Tests routinely create entities in fresh orgs (e.g. createTestOrganization followed by createTestEntity) without first calling seedSystemEntityTypes. Pre-FK that worked because the slug-text column accepted any string; the new FK requires an entity_types row to exist. Either seed in every test setup or upsert here — the latter is one-time-cheap and keeps existing test code working. Caught by pi review v3.
dbmate auto-regenerated after applying 20260426120000_entities_entity_type_fk.sql to prod. Reflects: - entities.entity_type column dropped - entities.entity_type_id integer NOT NULL FK on entity_types(id) - idx_entities_entity_type_id index - entities_organization_id_entity_type_slug_parent_id_key constraint dropped (redundant with entities_slug_parent_unique)
|
Triage decision: Reasons:
Next: Human review required for P1 database migration issues and manual deployment coordination. Review inline comments for specific migration concerns including non-deterministic backfill handling and cross-org entity count scoping. |
…guard (#374) * feat(world-model): cross-org references — schema search path + write guard Two small changes that exercise the FK foundation #370 set up: 1. **Schema search path in createEntity / entity-link-upsert.** When an entity_type slug isn't registered in the entity's own org, fall back to any org with `organization.visibility = 'public'`. First match wins, tenant-local types preferred when both exist. The resolved `entity_type_id` is materialized on the entity row, so reads never need to repeat the search. Lets a tenant agent write a `tax_filing` entity whose type lives in `public-uk-tax`, no per-tenant cloning. 2. **Cross-org write guard in validateScopeRule.** Relationship targets may now be in a different org *if* that org is a public catalog (`visibility = 'public'`). Sources still must be in the caller's org. Public → tenant references stay forbidden. The relationship row's organization_id remains the source's, keeping the assertion under the caller's control. Lets a tenant relationship point at a canonical entity like HMRC or Barclays without copying it locally. No schema migration; both changes piggyback on the global FKs already in place (#370 for `entity_types.id`, baseline for `entities.id`). Tests: - `entity-management-schema-search.test.ts` — 4 unit tests (fall-through, tenant-local-wins, unknown-type-rejected, private-org-not-snooped) - `entity-relationships.test.ts` — adds positive case for cross-org link to a public-catalog entity * fix(cross-org-refs): address pi review 1. Document the slug-poisoning caveat — visibility='public' alone is trusted today; long-term we'll narrow with `is_catalog` or per-agent `uses_catalog`. Operationally we restrict visibility flips to admins. 2. Add unit test for the entity-link-upsert resolver path so future drift between createEntity and entity-link-upsert is caught. 3. Add the missing negative case in entity-relationships.test.ts: source entity in a different org from the caller is rejected (sources must always be in the caller's org). 4. Comment typo (et alias). TOCTOU between type lookup and INSERT (pi finding 2) noted but deferred — the window is microseconds, the failure mode is semantic-not-corruption, and the cleanest fix is a transactional rewrite of createEntity that's out of scope for this PR. * fix(cross-org-refs): preserve caller-from for cross-org symmetric edges Pi flagged: validateScopeRule checks `from_entity_id` before canonicalization, but `canonicalizeSymmetricEdge` may swap a public-catalog target into the stored `from_entity_id` slot when its numeric id is lower. The row stays under the caller's org (so it's tenant-owned) but the stored source ends up cosmetically inverted vs the documented invariant ("source must always be the caller's org"). Fix: canonicalize symmetric edges only when both endpoints are in the caller's org. Cross-org symmetric edges keep caller-from / public-to as provided. Same-org symmetric still canonicalizes by id so dedup catches a→b and b→a as the same edge. Cross-org dedup is unaffected because validateScopeRule already forbids a `public → tenant` create direction.
Why
Two problems collapse to the same fix.
entities.entity_typewas a free-floating text slug. Renaming or hard-deleting anentity_typesrow silently orphaned every referencing entity — joins went empty, the validator only protected new inserts. With a real FK onentity_types(id), Postgres refuses to drop a referenced type and renames update for free (slug is display-only via JOIN).entity_types.idis globally unique (one sequence across all orgs). An entity in tenant org A can carry a type defined in public-catalog org B by FK alone — noentity_type_org_idcolumn needed. This is the foundation for the world-model plan indocs/plans/world-model.md.What changes
db/migrations/20260426120000_entities_entity_type_fk.sql: add nullableentity_type_idFK, backfill (liveentity_typespreferred, soft-deleted as fallback), fail loudly on orphans, set NOT NULL, drop the text column.JOIN entity_types et ON et.id = e.entity_type_idand selectet.slug AS entity_type. Parent-entity joins usepet; relationship from/to usefet/tet.entity_type_id.execute-data-sources.tspreserves the publicentities.entity_typecontract for user-suppliedquery_sqlby aliasinget.slug AS entity_typein the entities CTE.Out of scope (intentional)
This PR is same-org-only. INSERT-side validators still resolve
entity_typeslugs against the entity's own org (matching today's behaviour). The cross-org capability is enabled by the FK —entities.entity_type_idcan already point at any org'sentity_typesrow — but the resolution path (schema search path through an agent'suses_catalogdeclarations, plus the cross-org write guard forentity_relationships) is left to the next PR (slim install +agent_templateentity). Adding it here without an exercising path would be dead code.entity_relationships.relationship_type_idis already an integer FK — no change needed.Manual deploy procedure
The migration adds and drops in one transaction; new code requires the new column. To avoid serving traffic against a half-migrated DB:
If the migration aborts on orphans, surface the offenders before retrying:
Review (pi pass)
Pi reviewed the diff and flagged three missed SQL sites (
manage_entity.ts:545,search.ts:405,workspace-instructions.ts:25) — fixed in01c23330. Same commit also tightens the migration backfill against soft-deleted/active slug collisions ((deleted_at IS NULL) DESC, id DESCordering preserves history while resolving deterministically).Test plan
bunx tsc -p packages/owletto-backend --noEmitclean (only pre-existingagents/install.ts:799errors).make build-packagesclean.bun run check(Biome) clean.