Add multi-tenant schema: owner_id + users table#89
Conversation
Alembic migration adds owner_id (NOT NULL with backfill) to entries, reads, actions, and embeddings tables. Creates users table with email (+ canonical normalization), E.164 phone, argon2id password_hash, timezone, and preferences JSONB. Existing data is backfilled from AWARENESS_DEFAULT_OWNER env var (falls back to system username). Fresh installs use the same default via column DEFAULT clauses. ON CONFLICT clause updated to match new unique index (owner_id, source, logical_key). No query logic changes — owner_id exists in schema but is not yet used in WHERE clauses. That comes in the next PR. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
cmeans
left a comment
There was a problem hiding this comment.
QA Review — PR #89: Add multi-tenant schema
Summary
Code review complete, automated tests pass (383/383 pytest, ruff clean, mypy clean, CI green). Manual testing blocked by blocker #1 — Docker fresh install crashes on seed data due to missing column DEFAULT.
Findings
1. BLOCKER: Alembic migration missing SET DEFAULT on owner_id — Docker fresh install broken
The migration adds owner_id NOT NULL to all 4 tables but never calls ALTER COLUMN owner_id SET DEFAULT. Meanwhile, _create_tables() DDL correctly has DEFAULT '{_escaped_default_owner}' on all 4 tables.
The Docker entrypoint runs: Alembic → seed_demo.py → server. On a fresh database, Alembic creates the schema (no DEFAULT), then seed-demo.sql tries to INSERT without owner_id → NotNullViolation.
Reproduced:
psycopg.errors.NotNullViolation: null value in column "owner_id" of relation "entries"
violates not-null constraint
The PR summary says "column DEFAULTs for backward compatibility" but this only holds for the _create_tables() code path, not the Alembic path.
Fix: Add ALTER TABLE <table> ALTER COLUMN owner_id SET DEFAULT '{DEFAULT_OWNER}' for all 4 tables in the migration, after the SET NOT NULL step. This also makes the Alembic path match _create_tables().
2. SUBSTANTIVE: Unescaped f-string interpolation in migration SQL
The migration interpolates DEFAULT_OWNER directly into 5 SQL statements via f-strings without escaping single quotes:
op.execute(f"UPDATE entries SET owner_id = '{DEFAULT_OWNER}' WHERE owner_id IS NULL")postgres_store.py correctly escapes: _escaped_default_owner = _DEFAULT_OWNER.replace("'", "''"). The migration does not. A system username like O'Brien (or a crafted AWARENESS_DEFAULT_OWNER value) would break or inject SQL.
Fix: Apply the same replace("'", "''") escaping used in postgres_store.py.
3. OBSERVATION: seed-demo.sql needs owner_id or must rely on DEFAULT
Even after fixing the migration DEFAULT, seed-demo.sql omits owner_id from all INSERT column lists. This works IF the DEFAULT is set (fix #1), but it's fragile — any future removal of the DEFAULT would silently break Docker fresh installs. Consider adding owner_id explicitly to seed INSERTs for robustness.
4. OBSERVATION: argon2-cffi and phonenumbers dependencies added but unused
These are declared in pyproject.toml but no code in this PR imports them. The users table has password_hash and phone columns, but hashing/validation logic is deferred to PR 2/3. Shipping unused dependencies increases attack surface and install size. Consider deferring to the PR that actually uses them.
5. OBSERVATION: No FK constraint from owner_id to users(id)
The owner_id columns on entries/reads/actions/embeddings have no FOREIGN KEY to users(id). Likely intentional for flexibility (entries can exist before user registration), but worth confirming this is a design choice and not an oversight.
Test Results
| Check | Result |
|---|---|
| pytest (383 tests) | ✅ Pass |
| ruff (src/, alembic/) | ✅ Clean |
| mypy | ✅ Clean |
| CI (lint, typecheck, test 3.10/3.11/3.12, codecov) | ✅ All green |
| Manual #1: Fresh install via Docker | ❌ Blocked — NotNullViolation on seed |
| Manual #2–#4: MCP tool tests | ⏸️ Blocked by #1 |
| Manual #5: pytest count matches | ✅ 383/383 matches PR body |
Verdict
QA Failed — 1 blocker (missing DEFAULT in migration breaks Docker fresh install), 1 substantive (unescaped SQL interpolation). Manual tests cannot proceed until the blocker is fixed.
- Add SET DEFAULT on owner_id after SET NOT NULL in migration so the Alembic code path matches _create_tables() — fixes Docker fresh install crash on seed data - Escape single quotes in DEFAULT_OWNER for safe SQL interpolation - Remove argon2-cffi and phonenumbers from dependencies (unused in this PR, defer to PR 3 which imports them) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans
left a comment
There was a problem hiding this comment.
QA Re-Review — PR #89 (Round 2)
Fix commit a1e71cc addresses all findings from Round 1. Re-verified from scratch.
Fix verification
| Finding | Status |
|---|---|
| BLOCKER: Missing SET DEFAULT in migration | ✅ Fixed — SET DEFAULT added for all 4 tables after SET NOT NULL |
| SUBSTANTIVE: Unescaped SQL interpolation | ✅ Fixed — _escaped = DEFAULT_OWNER.replace("'", "''") added, used in all 6 SQL statements |
| OBS: Unused argon2-cffi/phonenumbers deps | ✅ Removed from pyproject.toml and CHANGELOG |
| OBS: seed-demo.sql missing owner_id | Acceptable — relies on DEFAULT (now correctly set) |
| OBS: No FK on owner_id → users.id | Noted — design choice confirmed by the flexible insert pattern |
Full re-test
| Check | Result |
|---|---|
| pytest (383 tests) | ✅ 383/383 pass |
| ruff (src/, alembic/) | ✅ Clean |
| mypy | ✅ Clean |
| CI (lint, typecheck, test 3.10/3.11/3.12, codecov) | ✅ All green |
| Manual #1: Fresh Docker install + migration + seed | ✅ All 6 migrations run, 17 seed entries loaded with correct owner_id |
| Manual #2: Default owner_id on new entries | ✅ AWARENESS_DEFAULT_OWNER=qa-test-user applied via column DEFAULT |
| Manual #3: Users table with default user | ✅ Single row: qa-test-user, timezone UTC, preferences {} |
| Manual #4: Logical key upsert | ✅ Upsert deduplicates correctly; cross-owner isolation confirmed (same key, different owners = separate entries) |
| Manual #5: Full pytest suite | ✅ 383/383 pass |
Audit
- Round 1 review posted, QA Failed label applied — 1 blocker, 1 substantive, 3 observations
- Developer fix commit
a1e71ccaddresses blocker + substantive + 1 observation - Round 2: Full re-test clean. All 5 manual checkboxes checked.
- Zero open findings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1304d26 to
e6f508c
Compare
cmeans
left a comment
There was a problem hiding this comment.
Dev: You didn't follow the directive for footnotes. Fix.
cmeans
left a comment
There was a problem hiding this comment.
QA Quick Review — Round 3
Commit e6f508c: footer text change in docs/data-dictionary.md — standardizes to ecosystem branding. Text-only, no code impact. ✅
All prior test results still valid. Ready for QA Signoff stands.
Summary
owner_id TEXT NOT NULLto all 4 data tables (entries, reads, actions, embeddings) with backfill fromAWARENESS_DEFAULT_OWNERenv varuserstable with full schema: email + canonical normalization for anti-abuse, E.164 phone (not unique — shared lines), argon2id password_hash, timezone, preferences JSONBowner_id_create_tables()DDL updated for fresh installs with column DEFAULTs for backward compatibilityPR 1 of 3 for multi-tenant foundation (Phase 1). See plan:
.claude/plans/glowing-beaming-fairy.mdQA
Prerequisites
pip install -e ".[dev]"AWARENESS_PORT=8421)Manual tests (via MCP tools)
Run
alembic upgrade headon an instance with existing data.Expected: existing entries returned with no errors
Then verify via direct SQL:
SELECT owner_id FROM entries WHERE source = 'test' ORDER BY created DESC LIMIT 1Expected: owner_id matches
AWARENESS_DEFAULT_OWNERor system usernameDirect SQL:
SELECT * FROM usersExpected: at least one row with the default owner ID
Expected: single entry with description "updated", not duplicated
pytest tests/ -v— 383 tests should pass🤖 Generated with Claude Code