fix(core): use composite key (name, default_cwd) for codebase identity#1236
fix(core): use composite key (name, default_cwd) for codebase identity#1236halindrome wants to merge 5 commits intocoleam00:devfrom
Conversation
…eam00#1192) Codebase identity was derived solely from the remote URL name (owner/repo), causing multiple local clones of the same remote to share a single codebase_id. This leaked conversations, sessions, env vars, and isolation environments across clones. Changes: - Add findCodebaseByNameAndPath() for composite (name, path) lookups - Update registerRepoAtPath to use composite dedup: same name + different local path now creates a distinct codebase row - Preserve managed-to-local path upgrade (archon workspace → local checkout) - Add UNIQUE INDEX on (name, default_cwd) for both SQLite and PostgreSQL - Backward compatible: existing single-clone installs are found by findCodebaseByDefaultCwd before registerRepoAtPath is reached Closes coleam00#1192 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a composite codebase identity (name + default_cwd) enforced by a DB unique index and applied in combined schema; runs the migration during SQLite init (with logging on failure), exposes a new DB lookup by (name, path), and updates registration logic and tests to deduplicate and upgrade using that composite identity. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler as RegisterHandler
participant DB as CodebaseDB
participant Adapter as SQLiteAdapter
Client->>Handler: registerRepoAtPath(name, targetPath, repositoryUrl...)
Handler->>DB: findCodebaseByNameAndPath(name, targetPath)
alt composite match
DB-->>Handler: existing codebase
Handler->>Adapter: load commands using existing.default_cwd
Handler-->>Client: return existing registration (alreadyExisted)
else composite miss
Handler->>DB: findCodebaseByName(name)
alt name-only matched AND existing.default_cwd is archon-managed AND targetPath is local
DB-->>Handler: managed record
Handler->>DB: updateCodebase(id, { default_cwd: targetPath, repository_url? })
Handler->>Adapter: load commands using updated/default_cwd
Handler-->>Client: return updated registration (alreadyExisted)
else no suitable match
Handler->>DB: createCodebase(...)
DB-->>Handler: new codebase
Handler-->>Client: return new registration
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/core/src/handlers/clone.ts (1)
77-115:⚠️ Potential issue | 🟡 MinorReturn the backfilled
repositoryUrlfrom the exact-match branch.This branch persists
repository_urlwhen the existing row is missing it, but the response still returnsexisting.repository_url, so callers getnulleven after a successful backfill.Suggested fix
return { codebaseId: existing.id, name: existing.name, - repositoryUrl: existing.repository_url, + repositoryUrl: existing.repository_url ?? repositoryUrl, defaultCwd: existing.default_cwd, commandCount: commandsLoaded, alreadyExisted: true, };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/handlers/clone.ts` around lines 77 - 115, The return value still uses existing.repository_url even when you backfill it into updates; change the returned repositoryUrl to reflect the possibly-updated value by returning repositoryUrl if you wrote it (i.e., when updates.repository_url or repositoryUrl is set) otherwise fall back to existing.repository_url; locate the backfill logic around the variables existing, updates, repositoryUrl and the call to codebaseDb.updateCodebase and adjust the returned object so repositoryUrl returns the new/backfilled value rather than always existing.repository_url.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/src/db/adapters/sqlite.ts`:
- Around line 227-234: The migration function migrateCodebaseIdentity currently
swallows errors from this.db.run(...) and only logs via
getLog().warn('db.sqlite_migration_codebase_identity_failed'), which allows
startup to continue in an unsafe state; change the error handling so that after
catching the error you log the failure (including the error) and then rethrow or
terminate startup (e.g., throw a new Error) so the process fails fast instead of
continuing with a broken composite identity; update the catch block referencing
migrateCodebaseIdentity, this.db.run, and the existing log key so the intent is
clear and startup halts on migration failure.
In `@packages/core/src/handlers/clone.ts`:
- Around line 120-139: When upgrading a managed checkout to a local path (the
branch that checks nameMatch and updates default_cwd via
codebaseDb.updateCodebase and returns commandCount: 0), also invoke the same
command discovery/registration routine used by the exact-match and create-new
paths so the .archon/commands for the new local checkout are scanned and
commandCount reflects discovered commands; call that routine using the updated
codebase id (nameMatch.id) and targetPath, and return its commandCount and any
updated repositoryUrl/defaultCwd instead of always returning 0.
---
Outside diff comments:
In `@packages/core/src/handlers/clone.ts`:
- Around line 77-115: The return value still uses existing.repository_url even
when you backfill it into updates; change the returned repositoryUrl to reflect
the possibly-updated value by returning repositoryUrl if you wrote it (i.e.,
when updates.repository_url or repositoryUrl is set) otherwise fall back to
existing.repository_url; locate the backfill logic around the variables
existing, updates, repositoryUrl and the call to codebaseDb.updateCodebase and
adjust the returned object so repositoryUrl returns the new/backfilled value
rather than always existing.repository_url.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6f2f0dad-0f1c-496e-abac-fbacf9d1b71e
📒 Files selected for processing (5)
migrations/022_codebase_composite_identity.sqlpackages/core/src/db/adapters/sqlite.tspackages/core/src/db/codebases.tspackages/core/src/handlers/clone.test.tspackages/core/src/handlers/clone.ts
…idance - Update 000_combined.sql version comment (001-020 → 001-022) and add the new idx_codebases_name_cwd unique index for fresh PostgreSQL installs - Add pre-check query and dedup guidance to migration 022 for databases that may have duplicate (name, default_cwd) rows Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QA Report — Round 1 — PR #1236 (composite codebase identity key)Model used: claude-opus-4-6 (1M context) What Was Tested
FindingsFinding 1:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
migrations/000_combined.sql (1)
316-318: Mirror duplicate pre-check guidance next to the new unique index.The index is correct, but adding the same pre-check/cleanup note from
migrations/022_codebase_composite_identity.sqlhere would reduce operational surprises when this combined script is applied to legacy DBs.Suggested doc-only patch
-- From migration 022: composite codebase identity (name + path) +-- Pre-check for duplicates before applying on legacy DBs: +-- SELECT name, default_cwd, COUNT(*) +-- FROM remote_agent_codebases +-- GROUP BY name, default_cwd +-- HAVING COUNT(*) > 1; +-- If duplicates exist, merge/delete extras before creating this index. CREATE UNIQUE INDEX IF NOT EXISTS idx_codebases_name_cwd ON remote_agent_codebases (name, default_cwd);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@migrations/000_combined.sql` around lines 316 - 318, Add the same pre-check/cleanup guidance used when introducing the composite codebase identity to the CREATE UNIQUE INDEX statement for idx_codebases_name_cwd on remote_agent_codebases: insert a comment immediately above the index creation that documents running a dedupe/cleanup step (or dropping conflicting duplicate rows) before applying the unique index to avoid constraint errors on legacy DBs, matching the text/steps from the earlier migration that introduced the composite identity (migration 022).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@migrations/000_combined.sql`:
- Around line 316-318: Add the same pre-check/cleanup guidance used when
introducing the composite codebase identity to the CREATE UNIQUE INDEX statement
for idx_codebases_name_cwd on remote_agent_codebases: insert a comment
immediately above the index creation that documents running a dedupe/cleanup
step (or dropping conflicting duplicate rows) before applying the unique index
to avoid constraint errors on legacy DBs, matching the text/steps from the
earlier migration that introduced the composite identity (migration 022).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fcc29160-c1c2-4189-befd-6c39ee212f4d
📒 Files selected for processing (2)
migrations/000_combined.sqlmigrations/022_codebase_composite_identity.sql
✅ Files skipped from review due to trivial changes (1)
- migrations/022_codebase_composite_identity.sql
- Add command loading to managed-to-local upgrade path (was returning commandCount: 0 without scanning .archon/commands/ at the new path) - Batch default_cwd + repository_url into a single updateCodebase call - Update upgrade test to verify command loading and batched update Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QA Report — Round 2 — PR #1236 (composite codebase identity key)Model used: Claude Opus 4.6 (1M context) What Was Tested
FindingsFinding 1: Managed-to-local upgrade skips command loading — Moderate (confirmed)The upgrade path returned Fixed in: Finding 2: Two separate updateCodebase calls instead of one — Low (confirmed)The upgrade path made two DB round-trips for Fixed in: Overall RecommendationPASS with minor fix needed → both findings fixed. |
Add idx_codebases_name_cwd to SQLite createSchema() so fresh installs get the unique index from the table creation block, consistent with the PostgreSQL combined migration. The migrateCodebaseIdentity() call remains as a no-op safety net for existing databases. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QA Report — Round 3 — PR #1236 (composite codebase identity key)Model used: Claude Opus 4.6 (1M context) What Was Tested
FindingsFinding 1: SQLite
|
…rn backfilled URL - SQLite migrateCodebaseIdentity now throws on failure instead of warn-and-continue, per fail-fast principle (duplicate rows must be resolved before startup) - Exact-match return now uses `existing.repository_url ?? repositoryUrl` so callers see the backfilled URL immediately Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Addressing the outside-diff CodeRabbit comment on Fixed in All three CodeRabbit comments are now resolved:
|
|
Thanks for this, @halindrome — the fix is architecturally right and the migration is safe. I'd like to get your thoughts on a follow-up class before we merge, since this PR enables the ambiguous state the concerns below live in. Concern: name-only and URL-only lookups are now silently ambiguous when N clones existFour call sites in the codebase still resolve codebases by name-or-URL only (not by `(name, path)`):
And one application-layer fallback this PR introduces:
These aren't regressions from your PR — the lookups behaved this way before. The difference is that pre-PR the ambiguous state was unreachable (name-only dedup meant at most one row per name). Post-PR the state is reachable by design, so the ambiguity becomes observable the first time someone registers two clones and then gets a webhook. Concrete example:
The natural follow-up: multi-project base foldersThere's a related UX pattern that today has no first-class support and shares the same ambiguity class — service-architecture / microservices layouts: ``` Your PR handles this correctly at the identity layer — each sub-dir registers as its own codebase with its own `default_cwd` and distinct id. But the adapter-side routing is still "by remote URL" — if two services share a naming pattern or someone has a mirror clone, same ambiguity. And Archon has no concept of a "project group" or "parent folder context" — each workflow run is one-codebase-one-cwd. I suspect the right resolution is user-configured routing:
None of this is for this PR. What I want to get your view on: Questions for you
Happy to file the routing-disambiguation + multi-project-folder follow-ups as separate issues after we align here, and merge this PR once your answers are in. Really solid work — the fail-fast migration and the QA-round commits in particular. |
|
My comments inline:
I agree the separation is clean.
This is an interesting question. In general I don't like quiet failure. Interim is fine; don't fail loudly yet. Two reasons:
What I'd suggest as a cheap intermediate: a log.warn (not error) inside findCodebaseByRepoUrl / the in-memory findCodebaseByName helper when rows.length > 1,
I literally have this situation on my machine right now. I would characterize this as a monorepo where the top level folder is not under git at all. It is just a collection of related modules that need to be considered in toto to understand the architecture. A related subtle issue is when the top level 'project' folder IS under git, but has no remote (local-only). However, a few things need design first:
I'd land routing disambiguation first (it's the immediate user-facing issue) and treat multi-project as a separate design discussion after. Happy to file the issue and sketch a strawman, but I don't want to scope-creep this PR or the immediate follow-up. Or you can do it. Let me know!
Yes — your suggested guard is the right one and I'd prefer to apply it in this PR rather than ship a known-fragile path. Specifically I'd tighten the upgrade branch
Without (2), this sequence is broken:
Implementation is a single extra query (findCodebasesByName returning all matches, count the non-managed ones) and one added condition. Happy to push that as a fixup commit on this PR if you'd like, or leave the upgrade branch alone and let it become a no-op once we add a getArchonWorkspacesPath()-anchored check in the routing-disambiguation PR — your call
|
Summary
Closes #1192 — the architectural root cause behind the cascade of cross-clone issues (#1183, #1186, #1188, #1198, #1206).
Codebase identity was derived solely from the remote URL name (
owner/repo), so multiple local clones of the same remote shared a singlecodebase_id. This caused conversations, sessions, env vars, isolation environments, and commands to leak across clones.Changes
findCodebaseByNameAndPath(name, defaultCwd)— new composite lookup inpackages/core/src/db/codebases.tsregisterRepoAtPath— uses composite dedup: same name + different local path now creates a distinct codebase row; preserves managed-to-local path upgrade (archon workspace → local checkout)UNIQUE INDEX (name, default_cwd)— added to both SQLite (migrateCodebaseIdentity()) and PostgreSQL (migrations/022_codebase_composite_identity.sql)registerRepositorycallsfindCodebaseByDefaultCwdfirst, so existing registrations where the directory name does not match the remote-derived name are found by path beforeregisterRepoAtPathis reachedfindCodebaseByNameis preserved for non-path contexts (Slack/Telegram/GitHub adapters)(name, path)pairsWhat this retires
With distinct
codebase_idper clone, the cross-clone guard code from #1186, #1198, and #1206 becomes redundant (distinct worktree namespaces → no possible collision). Those guards can be removed in a follow-up once this is validated.Test plan
bun run type-checkpasses (all 10 packages)bun run lintpasses (0 warnings)bun run testpasses (all packages, 0 failures)codebase_id🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Database
Tests