
fix(core): use composite key (name, default_cwd) for codebase identity #1236

Open
halindrome wants to merge 5 commits into coleam00:dev from halindrome:fix/codebase-identity-composite-key

Conversation

@halindrome
Contributor

@halindrome halindrome commented Apr 15, 2026

Summary

Closes #1192 — the architectural root cause behind the cascade of cross-clone issues (#1183, #1186, #1188, #1198, #1206).

Codebase identity was derived solely from the remote URL name (owner/repo), so multiple local clones of the same remote shared a single codebase_id. This caused conversations, sessions, env vars, isolation environments, and commands to leak across clones.

Changes

  • findCodebaseByNameAndPath(name, defaultCwd) — new composite lookup in packages/core/src/db/codebases.ts
  • registerRepoAtPath — uses composite dedup: same name + different local path now creates a distinct codebase row; preserves managed-to-local path upgrade (archon workspace → local checkout)
  • UNIQUE INDEX (name, default_cwd) — added to both SQLite (migrateCodebaseIdentity()) and PostgreSQL (migrations/022_codebase_composite_identity.sql)
  • Backward compatible — existing single-clone installs continue to work unchanged:
    • registerRepository calls findCodebaseByDefaultCwd first, so existing registrations where the directory name does not match the remote-derived name are found by path before registerRepoAtPath is reached
    • findCodebaseByName is preserved for non-path contexts (Slack/Telegram/GitHub adapters)
    • No schema migration needed for existing data — the unique index only prevents future duplicate (name, path) pairs
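To illustrate the composite identity described above, here is a minimal in-memory sketch. It is not the code from `packages/core/src/db/codebases.ts`: the `CodebaseStore` class, its method names, and the row shape are assumptions made for illustration only. It shows that two clones of the same remote get distinct ids, while re-registering the same path reuses the existing row.

```typescript
// Hypothetical in-memory model of the codebases table; not Archon's real schema.
interface CodebaseRow {
  id: number;
  name: string; // remote-derived name, e.g. "owner/repo"
  defaultCwd: string; // local checkout path
}

class CodebaseStore {
  private rows: CodebaseRow[] = [];
  private nextId = 1;

  // Composite lookup: both the name and the path must match.
  findByNameAndPath(name: string, defaultCwd: string): CodebaseRow | null {
    const hit = this.rows.find(
      (r) => r.name === name && r.defaultCwd === defaultCwd,
    );
    return hit ?? null;
  }

  // Mirrors the UNIQUE (name, default_cwd) rule: reuse on exact match, else insert.
  register(name: string, defaultCwd: string): CodebaseRow {
    const existing = this.findByNameAndPath(name, defaultCwd);
    if (existing) return existing;
    const row: CodebaseRow = { id: this.nextId++, name, defaultCwd };
    this.rows.push(row);
    return row;
  }
}

const store = new CodebaseStore();
const prod = store.register("coleam00/Archon", "/prod/Archon");
const dev = store.register("coleam00/Archon", "/dev/Archon");
const prodAgain = store.register("coleam00/Archon", "/prod/Archon");
```

Under the old name-only identity, `prod` and `dev` would have collapsed into one row; with the composite key only `prodAgain` reuses an existing row.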

What this retires

With distinct codebase_id per clone, the cross-clone guard code from #1186, #1198, and #1206 becomes redundant (distinct worktree namespaces → no possible collision). Those guards can be removed in a follow-up once this is validated.

Test plan

  • All 45 clone.test.ts tests pass (5 new composite identity tests)
  • Full bun run type-check passes (all 10 packages)
  • Full bun run lint passes (0 warnings)
  • Full bun run test passes (all packages, 0 failures)
  • Manual: register two local clones of the same remote → verify distinct codebase_id
  • Manual: re-register an existing single-clone project → verify it reuses the same record
  • Manual: verify existing SQLite databases get the unique index on startup

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Deduplication now uses a composite name+path identity, preventing duplicate codebase entries and improving re-registration and path-upgrade behavior; re-registration preserves existing paths when matched.
  • Database

    • Database-level unique constraint added to enforce the name+path composite identity and included in schema initialization.
  • Tests

    • Deduplication tests updated for composite behavior and include backward-compatibility scenarios; tests cover repository URL fill-in and upgrade flows.

…eam00#1192)

Codebase identity was derived solely from the remote URL name (owner/repo),
causing multiple local clones of the same remote to share a single codebase_id.
This leaked conversations, sessions, env vars, and isolation environments
across clones.

Changes:
- Add findCodebaseByNameAndPath() for composite (name, path) lookups
- Update registerRepoAtPath to use composite dedup: same name + different
  local path now creates a distinct codebase row
- Preserve managed-to-local path upgrade (archon workspace → local checkout)
- Add UNIQUE INDEX on (name, default_cwd) for both SQLite and PostgreSQL
- Backward compatible: existing single-clone installs are found by
  findCodebaseByDefaultCwd before registerRepoAtPath is reached

Closes coleam00#1192

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 58fdd4aa-2b5a-4d72-92e9-8ae552c9e01d

📥 Commits

Reviewing files that changed from the base of the PR and between 2df6235 and 2baf383.

📒 Files selected for processing (2)
  • packages/core/src/db/adapters/sqlite.ts
  • packages/core/src/handlers/clone.ts
✅ Files skipped from review due to trivial changes (1)
  • packages/core/src/db/adapters/sqlite.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/core/src/handlers/clone.ts

📝 Walkthrough

Walkthrough

Adds a composite codebase identity (name + default_cwd) enforced by a DB unique index and applied in combined schema; runs the migration during SQLite init (with logging on failure), exposes a new DB lookup by (name, path), and updates registration logic and tests to deduplicate and upgrade using that composite identity.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Database migrations**<br>`migrations/022_codebase_composite_identity.sql`, `migrations/000_combined.sql` | Add idempotent unique index `idx_codebases_name_cwd` on `remote_agent_codebases(name, default_cwd)`; update combined migration metadata to include migration 022. |
| **SQLite adapter init**<br>`packages/core/src/db/adapters/sqlite.ts` | Invoke new migration step `migrateCodebaseIdentity()` during schema init; create the composite unique index during init and migration; on migration failure log a metric (`db.sqlite_migration_codebase_identity_failed`) and rethrow a wrapped Error. |
| **Codebase lookup API**<br>`packages/core/src/db/codebases.ts` | Add exported `findCodebaseByNameAndPath(name, defaultCwd): Promise<Codebase`… |
| **Registration handler**<br>`packages/core/src/handlers/clone.ts` | `registerRepoAtPath` now first deduplicates by (name, targetPath) via `findCodebaseByNameAndPath`; on composite miss it falls back to `findCodebaseByName` and, if the existing record is archon-managed and the incoming path is local, upgrades `default_cwd` (and sets a missing `repository_url`); otherwise it creates a new codebase. |
| **Handler tests**<br>`packages/core/src/handlers/clone.test.ts` | Add `findCodebaseByNameAndPath` mock and reset; rewrite deduplication tests to assert composite-identity behavior, local-upgrade scenarios, re-registration behavior, and fill-in `repository_url` expectations; add a backward-compatibility discovery test using `findCodebaseByDefaultCwd`. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Handler as RegisterHandler
  participant DB as CodebaseDB
  participant Adapter as SQLiteAdapter

  Client->>Handler: registerRepoAtPath(name, targetPath, repositoryUrl...)
  Handler->>DB: findCodebaseByNameAndPath(name, targetPath)
  alt composite match
    DB-->>Handler: existing codebase
    Handler->>Adapter: load commands using existing.default_cwd
    Handler-->>Client: return existing registration (alreadyExisted)
  else composite miss
    Handler->>DB: findCodebaseByName(name)
    alt name-only matched AND existing.default_cwd is archon-managed AND targetPath is local
      DB-->>Handler: managed record
      Handler->>DB: updateCodebase(id, { default_cwd: targetPath, repository_url? })
      Handler->>Adapter: load commands using updated/default_cwd
      Handler-->>Client: return updated registration (alreadyExisted)
    else no suitable match
      Handler->>DB: createCodebase(...)
      DB-->>Handler: new codebase
      Handler-->>Client: return new registration
    end
  end
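The three branches in the sequence diagram can be sketched roughly as follows. This is a simplified, synchronous, in-memory approximation, not the handler in `packages/core/src/handlers/clone.ts`: the `Codebase` shape, the `Outcome` type, and the substring-based managed-path check are assumptions, and command loading is omitted.

```typescript
// Hypothetical types; field names follow the walkthrough but are not the real API.
interface Codebase {
  id: number;
  name: string;
  defaultCwd: string;
  repositoryUrl: string | null;
}

type Outcome = "reused" | "upgraded" | "created";

// Assumption: managed checkouts are recognised by this path fragment.
const MANAGED_MARKER = "/.archon/workspaces/";

function registerRepoAtPath(
  rows: Codebase[],
  name: string,
  targetPath: string,
  repositoryUrl: string | null,
): { codebase: Codebase; outcome: Outcome } {
  // Tier 1: exact (name, path) match reuses the existing row.
  const exact = rows.find((r) => r.name === name && r.defaultCwd === targetPath);
  if (exact) return { codebase: exact, outcome: "reused" };

  // Tier 2: a name-only match that is archon-managed is rebound to a local path.
  const byName = rows.find((r) => r.name === name);
  const incomingIsLocal = !targetPath.includes(MANAGED_MARKER);
  if (byName && byName.defaultCwd.includes(MANAGED_MARKER) && incomingIsLocal) {
    byName.defaultCwd = targetPath;
    byName.repositoryUrl = byName.repositoryUrl ?? repositoryUrl;
    return { codebase: byName, outcome: "upgraded" };
  }

  // Tier 3: no suitable match, so a new, distinct codebase is created.
  const created: Codebase = {
    id: rows.length + 1,
    name,
    defaultCwd: targetPath,
    repositoryUrl,
  };
  rows.push(created);
  return { codebase: created, outcome: "created" };
}

const url = "https://github.com/coleam00/Archon";
const table: Codebase[] = [
  { id: 1, name: "coleam00/Archon", defaultCwd: "/home/u/.archon/workspaces/coleam00/Archon", repositoryUrl: null },
];
const first = registerRepoAtPath(table, "coleam00/Archon", "/dev/Archon", url);
const second = registerRepoAtPath(table, "coleam00/Archon", "/prod/Archon", url);
const third = registerRepoAtPath(table, "coleam00/Archon", "/dev/Archon", url);
```

In this run the managed row is upgraded to `/dev/Archon`, the second local path creates a new row, and the repeated registration reuses the upgraded row.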

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I hopped through tables, two paths but one name,
I wove a tiny index to keep each home tame.
Now clones may flourish, each with its own ground,
I twitch, I nibble, and cheer for the new bounds. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: implementing a composite key (name, default_cwd) for codebase identity, which is the core architectural fix addressed by this PR. |
| Description check | ✅ Passed | The description covers all critical sections: problem statement, root cause, solution approach, changes made, backward compatibility assurance, test plan with specific test counts, and validation status. It directly addresses issue #1192. |
| Linked Issues check | ✅ Passed | The PR fully implements the composite identity approach (Option A) from issue #1192, including the new findCodebaseByNameAndPath function, updated registerRepoAtPath deduplication logic, and UNIQUE INDEX on (name, default_cwd) in both SQLite and PostgreSQL migrations. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the composite codebase identity feature: database migrations, the new lookup function, deduplication logic in clone handler, and corresponding test updates. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00%, which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/core/src/handlers/clone.ts (1)

77-115: ⚠️ Potential issue | 🟡 Minor

Return the backfilled repositoryUrl from the exact-match branch.

This branch persists repository_url when the existing row is missing it, but the response still returns existing.repository_url, so callers get null even after a successful backfill.

Suggested fix
     return {
       codebaseId: existing.id,
       name: existing.name,
-      repositoryUrl: existing.repository_url,
+      repositoryUrl: existing.repository_url ?? repositoryUrl,
       defaultCwd: existing.default_cwd,
       commandCount: commandsLoaded,
       alreadyExisted: true,
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/handlers/clone.ts` around lines 77 - 115, The return value
still uses existing.repository_url even when you backfill it into updates;
change the returned repositoryUrl to reflect the possibly-updated value by
returning repositoryUrl if you wrote it (i.e., when updates.repository_url or
repositoryUrl is set) otherwise fall back to existing.repository_url; locate the
backfill logic around the variables existing, updates, repositoryUrl and the
call to codebaseDb.updateCodebase and adjust the returned object so
repositoryUrl returns the new/backfilled value rather than always
existing.repository_url.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/src/db/adapters/sqlite.ts`:
- Around line 227-234: The migration function migrateCodebaseIdentity currently
swallows errors from this.db.run(...) and only logs via
getLog().warn('db.sqlite_migration_codebase_identity_failed'), which allows
startup to continue in an unsafe state; change the error handling so that after
catching the error you log the failure (including the error) and then rethrow or
terminate startup (e.g., throw a new Error) so the process fails fast instead of
continuing with a broken composite identity; update the catch block referencing
migrateCodebaseIdentity, this.db.run, and the existing log key so the intent is
clear and startup halts on migration failure.

In `@packages/core/src/handlers/clone.ts`:
- Around line 120-139: When upgrading a managed checkout to a local path (the
branch that checks nameMatch and updates default_cwd via
codebaseDb.updateCodebase and returns commandCount: 0), also invoke the same
command discovery/registration routine used by the exact-match and create-new
paths so the .archon/commands for the new local checkout are scanned and
commandCount reflects discovered commands; call that routine using the updated
codebase id (nameMatch.id) and targetPath, and return its commandCount and any
updated repositoryUrl/defaultCwd instead of always returning 0.

---

Outside diff comments:
In `@packages/core/src/handlers/clone.ts`:
- Around line 77-115: The return value still uses existing.repository_url even
when you backfill it into updates; change the returned repositoryUrl to reflect
the possibly-updated value by returning repositoryUrl if you wrote it (i.e.,
when updates.repository_url or repositoryUrl is set) otherwise fall back to
existing.repository_url; locate the backfill logic around the variables
existing, updates, repositoryUrl and the call to codebaseDb.updateCodebase and
adjust the returned object so repositoryUrl returns the new/backfilled value
rather than always existing.repository_url.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f2f0dad-0f1c-496e-abac-fbacf9d1b71e

📥 Commits

Reviewing files that changed from the base of the PR and between 3dedc22 and d029d1e.

📒 Files selected for processing (5)
  • migrations/022_codebase_composite_identity.sql
  • packages/core/src/db/adapters/sqlite.ts
  • packages/core/src/db/codebases.ts
  • packages/core/src/handlers/clone.test.ts
  • packages/core/src/handlers/clone.ts

Comment thread packages/core/src/db/adapters/sqlite.ts
Comment thread packages/core/src/handlers/clone.ts
…idance

- Update 000_combined.sql version comment (001-020 → 001-022) and add the
  new idx_codebases_name_cwd unique index for fresh PostgreSQL installs
- Add pre-check query and dedup guidance to migration 022 for databases
  that may have duplicate (name, default_cwd) rows

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@halindrome
Contributor Author

QA Report — Round 1 — PR #1236 (composite codebase identity key)

Model used: claude-opus-4-6 (1M context)

What Was Tested

  • Migration safety for SQLite (inline migrateCodebaseIdentity in SqliteAdapter)
  • Migration safety for PostgreSQL (migrations/022_codebase_composite_identity.sql)
  • Combined migration (000_combined.sql) freshness for new installs
  • findCodebaseByNameAndPath new DB function correctness
  • registerRepoAtPath composite dedup logic (3-tier: composite match, name-only upgrade, create new)
  • registerRepository backward compatibility (path-first guard via findCodebaseByDefaultCwd)
  • cloneRepository flow — both remote URL and local path delegation paths
  • Forge adapter (GitHubAdapter.getOrCreateCodebaseForRepo) interaction with new unique index
  • Race condition analysis for concurrent registrations
  • Test coverage for new composite identity logic
  • Callers of findCodebaseByName (DB-level) outside of clone.ts

Findings

Finding 1: 000_combined.sql not updated with migration 022 — Important (confirmed)

The combined migration header said "001-020" and did not include idx_codebases_name_cwd. New PostgreSQL installs using the combined migration would not get the unique constraint.

Fixed in: fix(db): address QA round 1 — update combined migration, add dedup guidance

Finding 2: PostgreSQL migration 022 should include dedup guidance — Important (hypothetical)

CREATE UNIQUE INDEX IF NOT EXISTS will fail if pre-existing duplicate (name, default_cwd) rows exist. Added pre-check query and guidance comment.

Fixed in: fix(db): address QA round 1 — update combined migration, add dedup guidance
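For readers who want to see what the pre-check looks for, here is a small sketch of the same grouping done in application code. It is an illustration only: the `Row` shape and the `findDuplicatePairs` helper are hypothetical and are not part of the migration, which uses a SQL query.

```typescript
// Hypothetical row shape for illustration; not the real table definition.
interface Row {
  id: number;
  name: string;
  defaultCwd: string;
}

// Returns groups of row ids that share the same (name, defaultCwd) pair.
// Any non-empty result means CREATE UNIQUE INDEX would fail until the extras are merged or deleted.
function findDuplicatePairs(rows: Row[]): number[][] {
  const groups = new Map<string, number[]>();
  for (const row of rows) {
    // Serialising the tuple avoids ambiguity from separator characters in names or paths.
    const key = JSON.stringify([row.name, row.defaultCwd]);
    const ids = groups.get(key) ?? [];
    ids.push(row.id);
    groups.set(key, ids);
  }
  return Array.from(groups.values()).filter((ids) => ids.length > 1);
}

const dupes = findDuplicatePairs([
  { id: 1, name: "owner/repo", defaultCwd: "/a" },
  { id: 2, name: "owner/repo", defaultCwd: "/b" },
  { id: 3, name: "owner/repo", defaultCwd: "/a" },
]);
```

Here rows 1 and 3 collide on (name, path), while row 2 is a legitimate second clone.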

Overall Recommendation

PASS with minor notes — both important findings were migration-path issues, now fixed. Core logic is sound.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
migrations/000_combined.sql (1)

316-318: Mirror duplicate pre-check guidance next to the new unique index.

The index is correct, but adding the same pre-check/cleanup note from migrations/022_codebase_composite_identity.sql here would reduce operational surprises when this combined script is applied to legacy DBs.

Suggested doc-only patch
 -- From migration 022: composite codebase identity (name + path)
+-- Pre-check for duplicates before applying on legacy DBs:
+--   SELECT name, default_cwd, COUNT(*)
+--   FROM remote_agent_codebases
+--   GROUP BY name, default_cwd
+--   HAVING COUNT(*) > 1;
+-- If duplicates exist, merge/delete extras before creating this index.
 CREATE UNIQUE INDEX IF NOT EXISTS idx_codebases_name_cwd
   ON remote_agent_codebases (name, default_cwd);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@migrations/000_combined.sql` around lines 316 - 318, Add the same
pre-check/cleanup guidance used when introducing the composite codebase identity
to the CREATE UNIQUE INDEX statement for idx_codebases_name_cwd on
remote_agent_codebases: insert a comment immediately above the index creation
that documents running a dedupe/cleanup step (or dropping conflicting duplicate
rows) before applying the unique index to avoid constraint errors on legacy DBs,
matching the text/steps from the earlier migration that introduced the composite
identity (migration 022).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@migrations/000_combined.sql`:
- Around line 316-318: Add the same pre-check/cleanup guidance used when
introducing the composite codebase identity to the CREATE UNIQUE INDEX statement
for idx_codebases_name_cwd on remote_agent_codebases: insert a comment
immediately above the index creation that documents running a dedupe/cleanup
step (or dropping conflicting duplicate rows) before applying the unique index
to avoid constraint errors on legacy DBs, matching the text/steps from the
earlier migration that introduced the composite identity (migration 022).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fcc29160-c1c2-4189-befd-6c39ee212f4d

📥 Commits

Reviewing files that changed from the base of the PR and between d029d1e and 8d269b8.

📒 Files selected for processing (2)
  • migrations/000_combined.sql
  • migrations/022_codebase_composite_identity.sql
✅ Files skipped from review due to trivial changes (1)
  • migrations/022_codebase_composite_identity.sql

- Add command loading to managed-to-local upgrade path (was returning
  commandCount: 0 without scanning .archon/commands/ at the new path)
- Batch default_cwd + repository_url into a single updateCodebase call
- Update upgrade test to verify command loading and batched update

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@halindrome
Contributor Author

QA Report — Round 2 — PR #1236 (composite codebase identity key)

Model used: Claude Opus 4.6 (1M context)

What Was Tested

  • Full diff review (2 commits at time of review)
  • registerRepoAtPath all code paths including managed-to-local upgrade
  • cloneRepository pre-existing clone path
  • Orchestrator in-memory findCodebaseByName — confirmed unaffected
  • CLI workflowRunCommand auto-registration — confirmed correct delegation
  • handleRegisterProject direct createCodebase — confirmed safe
  • Unique index violation analysis on upgrade path
  • Test mock completeness
  • findCodebaseByName ORDER BY behavior with multiple matches

Findings

Finding 1: Managed-to-local upgrade skips command loading — Moderate (confirmed)

The upgrade path returned commandCount: 0 without scanning .archon/commands/ at the new local path. This was a behavioral regression from the original code which loaded commands using effectiveCwd.

Fixed in: fix(core): address QA round 2 — restore command loading on path upgrade

Finding 2: Two separate updateCodebase calls instead of one — Low (confirmed)

The upgrade path made two DB round-trips for default_cwd and repository_url. Batched into a single call.

Fixed in: fix(core): address QA round 2 — restore command loading on path upgrade
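A rough sketch of the batching described in Finding 2, assuming an `updateCodebase(id, updates)` helper that accepts a partial update object. The stub below records calls instead of touching a database, and all names other than `default_cwd` and `repository_url` are hypothetical.

```typescript
interface CodebaseUpdate {
  default_cwd?: string;
  repository_url?: string;
}

interface ExistingRow {
  id: number;
  repository_url: string | null;
}

// Hypothetical stand-in for the DB helper; it only records what it was asked to write.
const calls: Array<{ id: number; updates: CodebaseUpdate }> = [];
function updateCodebase(id: number, updates: CodebaseUpdate): void {
  calls.push({ id, updates });
}

// Collect every changed field first, then issue a single update.
function upgradeToLocalPath(
  existing: ExistingRow,
  targetPath: string,
  repositoryUrl: string | null,
): void {
  const updates: CodebaseUpdate = { default_cwd: targetPath };
  if (existing.repository_url === null && repositoryUrl !== null) {
    updates.repository_url = repositoryUrl;
  }
  updateCodebase(existing.id, updates);
}

upgradeToLocalPath(
  { id: 7, repository_url: null },
  "/dev/Archon",
  "https://github.com/coleam00/Archon",
);
```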

Overall Recommendation

PASS with minor fix needed → both findings fixed.

Add idx_codebases_name_cwd to SQLite createSchema() so fresh installs
get the unique index from the table creation block, consistent with the
PostgreSQL combined migration. The migrateCodebaseIdentity() call
remains as a no-op safety net for existing databases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@halindrome
Contributor Author

QA Report — Round 3 — PR #1236 (composite codebase identity key)

Model used: Claude Opus 4.6 (1M context)

What Was Tested

  • Full registerRepoAtPath three-tier logic after R2 fixes
  • R2 fix correctness: command loading on managed-to-local upgrade
  • R1 fix correctness: combined migration and migration 022
  • Edge case: re-cloning after deleting workspace dir but not DB row
  • Edge case: symlink path comparison
  • Edge case: unique index violation on managed-to-local upgrade
  • cloneRepository managed-path clone flow
  • findCodebaseByName ORDER BY semantics for upgrade path
  • SQLite createSchema() index consistency with PostgreSQL combined migration
  • Test coverage completeness

Findings

Finding 1: SQLite createSchema() missing composite index — Low (confirmed)

Fresh SQLite installs got the index via migrateCodebaseIdentity() but not from createSchema(), inconsistent with the PostgreSQL combined migration. Added the index to createSchema() for consistency; migrateCodebaseIdentity() remains as a no-op safety net.

Fixed in: fix(db): address QA round 3 — add composite index to createSchema

Finding 2: Re-cloning creates duplicate row when local registration exists — Low (by design)

When a user first registers locally then clones from URL, a second row is created. This is consistent with the composite identity model (different paths = different codebases). Not a bug.

Overall Recommendation

PASS — Round 3 came back clean. Only one low-severity consistency fix and one by-design observation. Core logic, migrations, backward compatibility, and test coverage are all sound after 3 rounds of QA.

…rn backfilled URL

- SQLite migrateCodebaseIdentity now throws on failure instead of
  warn-and-continue, per fail-fast principle (duplicate rows must be
  resolved before startup)
- Exact-match return now uses `existing.repository_url ?? repositoryUrl`
  so callers see the backfilled URL immediately
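The fail-fast behavior in the first bullet might look roughly like the sketch below. The `Db` and `Log` interfaces are minimal hypothetical stand-ins, not the adapter's real types; the index name, table name, and log key are taken from this thread.

```typescript
// Hypothetical minimal interfaces; the real adapter's db and logger types may differ.
interface Db {
  run(sql: string): void;
}
interface Log {
  error(fields: Record<string, unknown>, event: string): void;
}

function migrateCodebaseIdentity(db: Db, log: Log): void {
  try {
    db.run(
      "CREATE UNIQUE INDEX IF NOT EXISTS idx_codebases_name_cwd " +
        "ON remote_agent_codebases (name, default_cwd)",
    );
  } catch (err) {
    // Log first, then rethrow so startup halts instead of running without the index.
    log.error({ err }, "db.sqlite_migration_codebase_identity_failed");
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error("Codebase identity migration failed: " + reason);
  }
}

// Usage with a fake database that simulates pre-existing duplicate rows.
const events: string[] = [];
const failingDb: Db = {
  run: () => {
    throw new Error("UNIQUE constraint failed");
  },
};
const log: Log = {
  error: (_fields, event) => {
    events.push(event);
  },
};
let thrown = "";
try {
  migrateCodebaseIdentity(failingDb, log);
} catch (e) {
  thrown = e instanceof Error ? e.message : String(e);
}
```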

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@halindrome
Contributor Author

Addressing the outside-diff CodeRabbit comment on clone.ts:77-115 (return backfilled repositoryUrl):

Fixed in 2baf383b. The exact-match return now uses existing.repository_url ?? repositoryUrl so callers see the backfilled URL immediately after a successful update, rather than getting null back.
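As a small illustration of why `??` fits here (the `Existing` type and the `resolveReturnedUrl` helper are hypothetical, introduced only for this sketch):

```typescript
interface Existing {
  id: number;
  repository_url: string | null;
}

// Prefer the stored URL; fall back to the incoming one when the row had none.
function resolveReturnedUrl(
  existing: Existing,
  repositoryUrl: string | null,
): string | null {
  return existing.repository_url ?? repositoryUrl;
}

const incoming = "https://github.com/coleam00/Archon";
const backfilled = resolveReturnedUrl({ id: 1, repository_url: null }, incoming);
const preserved = resolveReturnedUrl(
  { id: 2, repository_url: "https://example.com/stored.git" },
  incoming,
);
```

Note that `??` only falls back on `null` or `undefined`, so a stored empty string would still be returned as-is.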

All three CodeRabbit comments are now resolved:

| Comment | Status |
|---|---|
| sqlite.ts:234 (fail-fast on migration) | Fixed in 2baf383b |
| clone.ts:167 (command loading on upgrade) | Fixed in 4f4b33bc |
| clone.ts:77-115 (return backfilled repositoryUrl) | Fixed in 2baf383b |

@Wirasm
Collaborator

Wirasm commented Apr 22, 2026

Thanks for this, @halindrome. The fix is architecturally right and the migration is safe. Before we merge, I'd like your thoughts on a class of follow-up concerns, since this PR makes reachable the ambiguous state that those concerns depend on.

Concern: name-only and URL-only lookups are now silently ambiguous when N clones exist

Four call sites in the codebase still resolve codebases by name-or-URL only (not by `(name, path)`):

| Site | Lookup | Resolution when N matches |
|---|---|---|
| `packages/adapters/src/forge/github/adapter.ts:580` | `findCodebaseByRepoUrl(url)` | `ORDER BY created_at DESC LIMIT 1` (newest wins) |
| `packages/adapters/src/community/forge/gitlab/adapter.ts:546` | same | same |
| `packages/adapters/src/community/forge/gitea/adapter.ts:607` | same | same |
| `packages/core/src/orchestrator/orchestrator-agent.ts:97` (in-memory `findCodebaseByName` helper for Slack/Telegram/Discord routing) | array `.find()` | first in load order |

And one application-layer fallback this PR introduces:

  • `packages/core/src/handlers/clone.ts:156` — name-only lookup after composite miss, for the managed→local upgrade branch. If a user has N registered clones of the same name and registers a new path, this picks one via `created_at DESC` to decide whether to apply the upgrade, which could be surprising.

These aren't regressions from your PR — the lookups behaved this way before. The difference is that pre-PR the ambiguous state was unreachable (name-only dedup meant at most one row per name). Post-PR the state is reachable by design, so the ambiguity becomes observable the first time someone registers two clones and then gets a webhook.

Concrete example:

  1. User clones `coleam00/Archon` at `/prod/Archon` (for hotfixes) and `/dev/Archon` (for feature work). Two codebase rows, distinct ids. ✅
  2. GitHub webhook fires for a PR review comment on `coleam00/Archon`.
  3. `findCodebaseByRepoUrl('https://github.com/coleam00/Archon')` returns the most recently registered one. Workflow runs against that clone, regardless of which the user intended.

The natural follow-up: multi-project base folders

There's a related UX pattern that today has no first-class support and shares the same ambiguity class — service-architecture / microservices layouts:

```
~/dev/my-platform/ ← base folder (not a git repo, just a container)
├── service-api/ ← git repo (service 1)
├── service-worker/ ← git repo (service 2)
├── service-web/ ← git repo (service 3)
└── shared-infra/ ← git repo (shared stuff)
```

Your PR handles this correctly at the identity layer — each sub-dir registers as its own codebase with its own `default_cwd` and distinct id. But the adapter-side routing is still "by remote URL" — if two services share a naming pattern or someone has a mirror clone, same ambiguity. And Archon has no concept of a "project group" or "parent folder context" — each workflow run is one-codebase-one-cwd.

I suspect the right resolution is user-configured routing:

  • Per-webhook/per-channel binding: "Route GitHub webhooks for `coleam00/Archon` on this installation to codebase `~/dev/Archon`" — explicit, stored in the DB, surfaces the ambiguity to the user instead of silently picking.
  • Parent-folder awareness (later): a user can point Archon at `~/dev/my-platform/` and it discovers all sub-repos; a workflow invocation can optionally target "all services" or "one service" via a picker. Separate feature, but the composite identity from this PR is the foundation.
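To make the per-webhook binding idea concrete, here is one possible shape, strictly as a strawman. Nothing below exists in Archon: the `bindings` map, the `Resolution` type, and `resolveWebhookTarget` are all hypothetical, and a real design would presumably persist bindings in the DB and key them by installation as well as repository.

```typescript
// Hypothetical result type: callers can tell an explicit binding from a silent fallback.
type Resolution =
  | { kind: "bound"; codebaseId: number }
  | { kind: "fallback"; codebaseId: number; candidates: number[] }
  | { kind: "none" };

interface Candidate {
  id: number;
  repositoryUrl: string;
  createdAt: number; // epoch millis, stand-in for created_at
}

function resolveWebhookTarget(
  bindings: Map<string, number>, // repository URL -> explicitly chosen codebase id
  candidates: Candidate[],
  repositoryUrl: string,
): Resolution {
  // An explicit user-configured binding always wins.
  const bound = bindings.get(repositoryUrl);
  if (bound !== undefined) return { kind: "bound", codebaseId: bound };

  const matches = candidates.filter((c) => c.repositoryUrl === repositoryUrl);
  if (matches.length === 0) return { kind: "none" };

  // Interim behavior: newest registration wins, but the other candidates are surfaced.
  const newest = matches.reduce((best, c) => (c.createdAt > best.createdAt ? c : best));
  return {
    kind: "fallback",
    codebaseId: newest.id,
    candidates: matches.map((m) => m.id),
  };
}

const repo = "https://github.com/coleam00/Archon";
const clones: Candidate[] = [
  { id: 1, repositoryUrl: repo, createdAt: 200 }, // e.g. /dev/Archon, registered later
  { id: 2, repositoryUrl: repo, createdAt: 100 }, // e.g. /prod/Archon
];
const unbound = resolveWebhookTarget(new Map<string, number>(), clones, repo);
const explicit = resolveWebhookTarget(new Map<string, number>([[repo, 2]]), clones, repo);
```

Without a binding the newest clone is picked and the ambiguity is visible in the result; with a binding, the user's explicit choice is honored regardless of registration order.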

None of this is for this PR. What I want to get your view on:

Questions for you

  1. Do you see the name/URL-only lookups as in-scope for this PR, or is the separation clean? My read is clean separation — identity dedup (this PR) vs. routing disambiguation (follow-up). But you've been closer to the code path than I have on this one.

  2. Is the `created_at DESC LIMIT 1` fallback in name-only lookups good enough as the interim behavior, or worth changing to fail-loudly ("multiple clones match — specify which") before merging this PR? Leaning "interim is fine" since the state is only reachable when the user explicitly registers distinct clones, but open to your take.

  3. Any thought on the multi-project-base-folder direction? Is this something you'd want to tackle as a follow-up after this lands, or is it adjacent enough that someone else should pick it up?

  4. Anything about the `findCodebaseByName` fallback in `registerRepoAtPath` (the name-only branch after composite miss, for managed→local upgrade) that would be better done differently given the N-clone future? E.g. restrict the upgrade to only fire when the name-match's `default_cwd` is inside `getArchonWorkspacesPath()` AND no other non-managed rows with the same name exist?

Happy to file the routing-disambiguation + multi-project-folder follow-ups as separate issues after we align here, and merge this PR once your answers are in. Really solid work — the fail-fast migration and the QA-round commits in particular.

@halindrome
Contributor Author

My comments inline:

Questions for you

  1. Do you see the name/URL-only lookups as in-scope for this PR, or is the separation clean? My read is clean separation — identity dedup (this PR) vs. routing disambiguation (follow-up). But you've been closer to the code path than I have on this one.

I agree the separation is clean.

  2. Is the `created_at DESC LIMIT 1` fallback in name-only lookups good enough as the interim behavior, or worth changing to fail-loudly ("multiple clones match, specify which") before merging this PR? Leaning "interim is fine" since the state is only reachable when the user explicitly registers distinct clones, but open to your take.

This is an interesting question. In general I don't like quiet failure. Interim is fine; don't fail loudly yet. Two reasons:

  • The 99%+ case is one clone per repo. Fail-loud here forces every existing single-clone user through an error path on the first webhook after upgrade if anything about their row state is unexpected: a regression for the common path to fix an uncommon one.
  • "Multiple match, specify which" isn't actionable without the disambiguation UX that the follow-up PR will design. Telling a GitHub webhook to "specify which" has no recipient.

What I'd suggest as a cheap intermediate: a log.warn (not error) inside findCodebaseByRepoUrl / the in-memory findCodebaseByName helper when rows.length > 1, including the matched IDs and the chosen one. That makes the ambiguity visible in logs the moment it first happens, without changing behavior, which is the right preparatory step for the follow-up.
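A sketch of what that warning could look like, under stated assumptions: the `Row` shape, the `Warn` callback signature, and the `codebase.lookup_ambiguous` event name are all made up for illustration, and the real lookups sort in SQL rather than in memory.

```typescript
interface Row {
  id: number;
  createdAt: number; // epoch millis, stand-in for created_at
}

// Hypothetical logger signature: structured fields plus an event name.
type Warn = (
  fields: { matchedIds: number[]; chosenId: number },
  event: string,
) => void;

// Keeps "newest wins" but makes the ambiguity visible when more than one row matched.
function pickNewest(rows: Row[], warn: Warn): Row | null {
  if (rows.length === 0) return null;
  const chosen = rows.reduce((best, r) => (r.createdAt > best.createdAt ? r : best));
  if (rows.length > 1) {
    warn(
      { matchedIds: rows.map((r) => r.id), chosenId: chosen.id },
      "codebase.lookup_ambiguous", // assumed event name
    );
  }
  return chosen;
}

const warnings: Array<{ matchedIds: number[]; chosenId: number }> = [];
const record: Warn = (fields) => {
  warnings.push(fields);
};

const single = pickNewest([{ id: 1, createdAt: 10 }], record);
const ambiguous = pickNewest(
  [
    { id: 1, createdAt: 10 },
    { id: 2, createdAt: 20 },
  ],
  record,
);
```

The single-clone path stays silent, so the common case sees no change in behavior or log volume.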

  3. Any thought on the multi-project-base-folder direction? Is this something you'd want to tackle as a follow-up after this lands, or is it adjacent enough that someone else should pick it up?

I literally have this situation on my machine right now. I would characterize this as a monorepo where the top-level folder is not under git at all. It is just a collection of related modules that need to be considered in toto to understand the architecture. A related subtle issue is when the top-level 'project' folder IS under git, but has no remote (local-only).

However, a few things need design first:

  • How a "base folder" is registered (single config entry? auto-discovery on register? explicit archon register-folder?).
  • How invocations target one-vs-all (per-platform picker semantics, fan-out behavior, run grouping).
  • How worktree/git operations behave when "all services" is selected and one repo has uncommitted changes.

I'd land routing disambiguation first (it's the immediate user-facing issue) and treat multi-project as a separate design discussion after. Happy to file the issue and sketch a strawman, but I don't want to scope-creep this PR or the immediate follow-up. Or you can do it. Let me know!

  4. Anything about the `findCodebaseByName` fallback in `registerRepoAtPath` (the name-only branch after composite miss, for managed→local upgrade) that would be better done differently given the N-clone future? E.g. restrict the upgrade to only fire when the name-match's `default_cwd` is inside `getArchonWorkspacesPath()` AND no other non-managed rows with the same name exist?

Yes. Your suggested guard is the right one, and I'd prefer to apply it in this PR rather than ship a known-fragile path. Specifically, I'd tighten the upgrade branch (clone.ts:122-167) to require both:

  1. The name-match's default_cwd is inside getArchonWorkspacesPath() (use the helper instead of the current .includes('/.archon/workspaces/') substring check — more robust against unusual ARCHON_HOME values and Windows paths).
  2. There are no other rows with the same name whose default_cwd is not managed. If any non-managed rows exist for this name, the user has already declared "I treat these as distinct," so silently rebinding the managed row to a new local path would violate that intent.

Without (2), this sequence is broken:

  • User registers coleam00/Archon via the managed clone path → managed row created.
  • User registers ~/dev/Archon as a distinct clone → local row created (composite key allows it now, ✅).
  • User later registers ~/prod/Archon → name-only match hits, but which row? Today it picks one via created_at DESC and silently rebinds it. With (2), it falls through to createCodebase and a third distinct row is created — which matches user intent.

Implementation is a single extra query (findCodebasesByName returning all matches, count the non-managed ones) and one added condition. Happy to push that as a fixup commit on this PR if you'd like, or leave the upgrade branch alone and let it become a no-op once we add a getArchonWorkspacesPath()-anchored check in the routing-disambiguation PR. Your call.
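Roughly, the tightened guard could look like this. It is a sketch, not the fixup commit: `isInside` is a simplified stand-in for a check anchored on `getArchonWorkspacesPath()` (it normalizes separators but ignores case-insensitive filesystems and symlinks), and the `Row` shape and function names are assumptions.

```typescript
interface Row {
  id: number;
  name: string;
  defaultCwd: string;
}

// Simplified containment check: normalizes separators and compares whole path segments.
function isInside(parent: string, child: string): boolean {
  const norm = (p: string): string => p.replace(/\\/g, "/").replace(/\/+$/, "");
  const base = norm(parent) + "/";
  const candidate = norm(child) + "/";
  return candidate.length > base.length && candidate.indexOf(base) === 0;
}

// Upgrade only when (1) the matched row is managed, the incoming path is local,
// and (2) no other row with this name already points at a non-managed path.
function shouldUpgradeManagedRow(
  sameNameRows: Row[],
  matched: Row,
  workspacesPath: string,
  targetPath: string,
): boolean {
  const matchedIsManaged = isInside(workspacesPath, matched.defaultCwd);
  const targetIsLocal = !isInside(workspacesPath, targetPath);
  const hasLocalSibling = sameNameRows.some(
    (r) => !isInside(workspacesPath, r.defaultCwd),
  );
  return matchedIsManaged && targetIsLocal && !hasLocalSibling;
}

const workspaces = "/home/u/.archon/workspaces";
const managed: Row = { id: 1, name: "coleam00/Archon", defaultCwd: workspaces + "/coleam00/Archon" };
const localDev: Row = { id: 2, name: "coleam00/Archon", defaultCwd: "/home/u/dev/Archon" };

// Only the managed row exists: rebinding it to a local checkout is allowed.
const onlyManaged = shouldUpgradeManagedRow([managed], managed, workspaces, "/home/u/dev/Archon");
// A local row already exists: fall through to createCodebase instead of rebinding.
const withLocalSibling = shouldUpgradeManagedRow([managed, localDev], managed, workspaces, "/home/u/prod/Archon");
```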

Happy to file the routing-disambiguation + multi-project-folder follow-ups as separate issues after we align here, and merge this PR once your answers are in. Really solid work — the fail-fast migration and the QA-round commits in particular.



Development

Successfully merging this pull request may close these issues.

Multiple local clones of the same remote cannot be registered as separate projects

3 participants