Skip to content

feat(workflows): symmetric fallback chain — workspace + global tiers for scripts/commands#1144

Closed
dhcdata wants to merge 6 commits intocoleam00:devfrom
dhcdata:feature/workspace-tier
Closed

feat(workflows): symmetric fallback chain — workspace + global tiers for scripts/commands#1144
dhcdata wants to merge 6 commits intocoleam00:devfrom
dhcdata:feature/workspace-tier

Conversation

@dhcdata
Copy link
Copy Markdown

@dhcdata dhcdata commented Apr 12, 2026

Summary

  • Problem: Workflows already fall back from <cwd>/.archon/workflows/ to ~/.archon/.archon/workflows/ (user-global), but commands and scripts are repo-local only — there's no way to keep a command library or named script outside a specific repo.
  • Why it matters: (a) Cross-project automations can't live outside the repo. (b) Per-project automations that a team doesn't want committed have no home at all — gitignoring .archon/ breaks worktree-based runs because git worktree add only copies tracked files.
  • What changed: Symmetric three-tier fallback for scripts/commands/workflows (repo > workspace > user-global > bundled), plus a new $WORKSPACE_ARCHON_DIR substitution variable so bash/script nodes can reference workspace-tier helpers by absolute path.
  • What did NOT change: No CLI flag changes, no schema changes, no DB migrations, no changes to @archon/isolation / @archon/core / @archon/server / @archon/web. All existing workflows see identical behavior.

The fallback chain

Before this PR, only workflows had a user-global fallback. Commands and scripts were repo-local only, and there was no project-scoped tier at all.

Tier Location Scope Existed before
1 — Repo-local <cwd>/.archon/{workflows,commands,scripts}/ Team-shared, committed ✅ all three
2 — Workspace-in-userspace (new) ~/.archon/workspaces/<owner>/<repo>/.archon/{workflows,commands,scripts}/ Per-project, per-user ❌ none
3 — User-global ~/.archon/.archon/{workflows,commands,scripts}/ Per-user, applies to every project 🟡 workflows only
4 — Bundled defaults App defaults (filesystem in Bun, embedded in binary) Ship-with-app ✅ all three

Precedence: most specific wins. Repo beats workspace; workspace beats user-global; user-global beats bundled. Matches VSCode's workspace-settings-over-user-settings model and git-config precedence.

Tier 2 slots into the existing directory layout Archon already uses at ~/.archon/workspaces/<owner>/<repo>/ (which today houses source/, worktrees/, artifacts/, logs/). Owner/repo is derived from the git origin remote of cwd; if the derivation fails (bare dir, no remote, git missing), the workspace tier is silently skipped and two-tier behavior is preserved.

Motivating use case

I hit this gap building a per-project automation against a team repo where .archon/ is not welcome. Today the three available options are all bad:

  1. Commit to the team repo → teammates see it, team doesn't want it.
  2. .gitignore it in the team repogit worktree add only copies tracked files, so worktree-based runs (--branch foo) fail with "workflow not found."
  3. Drop it in ~/.archon/.archon/ → applies to every project, pollutes unrelated codebases.

Tier 2 is the missing option: per-project, per-user, zero team-repo footprint.

Resolution flow

flowchart LR
    A[archon workflow run X] --> B{Tier 1<br/>repo-local<br/>&lt;cwd&gt;/.archon/X}
    B -- hit --> Z[run X]
    B -- miss --> C{Tier 2 NEW<br/>workspace<br/>~/.archon/workspaces/O/R/.archon/X}
    C -- hit --> Z
    C -- miss --> D{Tier 3<br/>user-global<br/>~/.archon/.archon/X}
    D -- hit --> Z
    D -- miss --> E{Tier 4<br/>bundled defaults}
    E -- hit --> Z
    E -- miss --> F[not found]
Loading

Same chain applies to workflow discovery, loadCommandPrompt, and executeScriptNode. Before this PR: only workflow discovery had steps B→D→E→F; commands and scripts were B→F.

Architecture: files changed

flowchart TD
    subgraph CLI[packages/cli/src/commands/workflow.ts]
        CLI1[loadWorkflows] --> CLI2[resolveWorkspaceSearchPath ~NEW~]
    end

    subgraph GIT[packages/git/src/repo.ts]
        GIT1[parseOwnerRepoFromRemoteUrl ~NEW~ pure]
        GIT2[parseOwnerRepoFromGitRemote ~NEW~ async]
        GIT2 --> GIT1
    end

    subgraph PATHS[packages/paths/src/archon-paths.ts]
        PATHS1[getProjectArchonDir ~NEW~ one-liner]
    end

    subgraph WORKFLOWS[packages/workflows/src]
        WD[workflow-discovery<br/>~modified~ accepts workspaceSearchPath]
        ES[executor-shared::loadCommandPrompt<br/>~modified~ accepts workspaceSearchPath]
        DE[dag-executor::executeScriptNode<br/>~modified~ accepts workspaceSearchPath]
        SUB[substituteWorkflowVariables<br/>~modified~ emits $WORKSPACE_ARCHON_DIR]
    end

    CORE[packages/core/src/handlers/clone.ts<br/>~refactor~ drop inline parser]

    CLI2 --> GIT2
    CLI2 --> PATHS1
    CLI1 --> WD
    CORE --> GIT1
    DE --> SUB
    ES --> SUB
Loading

Connection inventory:

From To Status Notes
cli/commands/workflow @archon/git::parseOwnerRepoFromGitRemote new derives owner/repo once per run
cli/commands/workflow @archon/paths::getProjectRoot new builds workspace search path
cli/commands/workflow discoverWorkflowsWithConfig modified extra workspaceSearchPath option
workflow-discovery fs access() modified probes one additional directory
executor-shared::loadCommandPrompt fs readFile modified probes one additional directory
dag-executor::executeScriptNode discoverScripts modified probes one additional directory
core/handlers/clone.ts @archon/git::parseOwnerRepoFromRemoteUrl new replaces inline URL parser
executor-shared::substituteWorkflowVariables $WORKSPACE_ARCHON_DIR emission new mirrors $ARTIFACTS_DIR / $BASE_BRANCH

$WORKSPACE_ARCHON_DIR — why the variable is needed

The tier-3 fallback covers script/command/workflow resolution, but bash: and script: nodes that wrap bun run on named helper scripts via relative paths don't benefit — the Archon-managed worktree may not contain .archon/ at all. Example from the giraffe automation that exposed this gap:

# Before — breaks when .archon/ is outside the team repo:
- id: save-plan
  bash: |
    PLAN_OUTPUT=$plan-cmd.output bun run .archon/scripts/save-plan.ts

# After — absolute path via the new variable:
- id: save-plan
  bash: |
    PLAN_OUTPUT=$plan-cmd.output bun run "$WORKSPACE_ARCHON_DIR/scripts/save-plan.ts"

The variable resolves to ~/.archon/workspaces/<owner>/<repo>/.archon when the workspace tier is available, and is an empty string otherwise. Plumbed through executeBashNode, executeScriptNode, executeLoopNode (both the prompt and until_bash), executeApprovalNode's on_reject prompt, and executeNodeInternal (via buildPromptWithContext).

Label Snapshot

  • Risk: risk: low
  • Size: size: M
  • Scope: workflows
  • Module: workflows:discovery, workflows:executor-shared, workflows:dag-executor, git:repo, paths:archon-paths, cli:workflow

Change Metadata

  • Change type: feature
  • Primary scope: workflows

Linked Issue

  • Closes: (none — gap I hit while building a personal automation against a team repo that doesn't want .archon/ committed)
  • Related: N/A
  • Depends on: N/A
  • Supersedes: N/A

Validation Evidence

bun run type-check   #
bun run lint         # ✓ (--max-warnings 0)
bun run format:check #
bun run test         #

Per-package test results after rebase on latest upstream/dev:

Package / file Pass New cases
@archon/git/git.test.ts 147 11 (parseOwnerRepoFromRemoteUrl x8, parseOwnerRepoFromGitRemote x3)
@archon/workflows/dag-executor.test.ts 150 1 (mock gains getArchonHome)
@archon/workflows/dag-executor-script-global.test.ts (new) 6 all new
@archon/workflows/executor-shared-command-global.test.ts (new) 9 all new
@archon/workflows/executor-shared.test.ts 37 2 ($WORKSPACE_ARCHON_DIR substitution hit/miss)
@archon/workflows/loader.test.ts 92 2 (workspace tier in workflow discovery)

End-to-end manual validation against a real Jira ticket in a private repo: moved a real .archon/ config (workflows + commands + scripts + bundled giraffe automation) from ~/.archon/.archon/ to ~/.archon/workspaces/HireArt/hireart_main/.archon/ and ran:

archon workflow run giraffe-tick --cwd ~/code/hireart/hireart --branch giraffe-EE-474

Logs show the expected three-tier behavior:

workflow.discovery             count=1  msg=workspace_workflows_loaded
cli.workflow                   path=.../worktrees/.../task-giraffe-ee-474  msg=worktree_reused
workflow.script-discovery      count=0  dir=.../worktrees/.../.archon/scripts
workflow.script-discovery      count=12 dir=.../.archon/workspaces/HireArt/hireart_main/.archon/scripts
save-plan                      Completed (42ms)   # via $WORKSPACE_ARCHON_DIR
Workflow completed successfully.

Security Impact

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? No
  • File system access scope changed? Slightly: workflow/command/script loading probes two additional directories under ~/.archon/ per run. Reads only — no new write paths, no new privileged operations.

Compatibility / Migration

  • Backward compatible? Yes. New options are optional; omission preserves existing behavior. No config changes, no env changes, no schema changes, no migrations.
  • When the git-remote parser fails (bare dir, missing origin, git missing), the workspace tier is silently skipped — same graceful-degradation pattern as the existing user-global tier when ARCHON_HOME is unset.

Human Verification

Verified:

  • Moving a real .archon/ config from user-global to workspace-tier and running archon workflow run picked up the workflow from the new tier without changing invocation. Script discovery fell back to the workspace tier. A bash node using $WORKSPACE_ARCHON_DIR successfully invoked bun run on a workspace-tier helper from inside a worktree that had no .archon/ directory.
  • Worktree reuse semantics unchanged: --branch <name> pinned the same physical worktree across multiple ticks (worktree_reused log line on every run after the first).
  • parseOwnerRepoFromRemoteUrl via the clone.ts refactor: /clone produces the same owner/repo slug for registered codebases as before.
  • Negative case: running archon workflow run from a plain tmpdir with no git remote silently skips the workspace tier (no error, no warn-level log).

Edge cases:

  • HTTPS, SSH (git@host:), ssh:// URL forms
  • Trailing .git suffix
  • GitLab subgroup URLs (two-level slug: last two path segments)
  • Missing origin remote
  • Non-git directory
  • Empty remote URL
  • Test files excluded from tsc don't block validate even with type drift (noted and addressed in a small follow-up commit after rebase)

Not verified:

  • Binary-build path (I run from bun source). Should work since the new code only adds a fallback tier and doesn't touch bundled-defaults loading.

Side Effects / Blast Radius

  • Affected subsystems: @archon/workflows (workflow-discovery, executor-shared, dag-executor, variable substitution), @archon/git (new helpers), @archon/paths (one wrapper), @archon/cli/commands/workflow (one resolve call). @archon/core/handlers/clone.ts refactored to use the shared parser (functional parity with existing behavior).
  • Potential unintended effects: A user with files staged at ~/.archon/workspaces/<owner>/<repo>/.archon/ could start seeing those take priority over their user-global copies. Intentional; worth a changelog mention.
  • Early detection: New log events make it easy to see which tier a given resolution hit:
    • workspace_workflows_loaded
    • dag.script_loaded_workspace
    • command_loaded_workspace
    • dag.workspace_search_path_resolved
    • dag.workspace_search_path_unavailable

Rollback Plan

  • Fast rollback: Revert the 5 commits on this branch. No data migrations to undo. Users who had files under ~/.archon/workspaces/.../.archon/ would see those files stop being discovered; files remain on disk untouched.
  • Feature flags: None. The workspace tier is always on when the git remote can be parsed. A kill switch could be added via config if there's demand, but that felt like YAGNI for a read-only fallback.
  • Observable failure symptoms: workflow.discovery errors, "not found" messages mentioning additional search paths, test failures in any of the 4 new / updated test files.

Risks and Mitigations

  • Risk: parseOwnerRepoFromGitRemote misparses an exotic remote URL and routes files to the wrong directory.

    • Mitigation: 11 unit tests cover HTTPS, SSH, ssh://, trailing .git, GitLab subgroups, empty input, single-segment input, trailing slash, real git init happy path, no-origin case, and non-repo case. Returns null on any failure; callers silently skip the tier.
  • Risk: Concurrent workflow runs in different directories re-run the git-remote lookup every time (no cross-run cache).

    • Mitigation: The helper is idempotent. Its result is memoized at the executeDagWorkflow scope (once per run, not once per node). The CLI layer also computes it once per invocation. No shared state.
  • Risk: A user who genuinely wants a workspace-tier file to shadow a repo-local file will be confused by "repo wins" precedence.

    • Mitigation: Precedence is documented in the discovery modules and matches VSCode/git conventions. Inverting it would surprise a larger audience.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added workspace-tier configuration support: scripts and commands can now be organized across repo-local, workspace, and user-global levels
    • Implemented three-tier fallback resolution: scripts and commands now resolve repo-local first, then workspace, then user-global
    • Added $WORKSPACE_ARCHON_DIR variable for workflows to reference workspace configuration paths
  • Tests

    • Added comprehensive test coverage for workspace-tier script and command resolution behavior

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 21c16cad-63f4-4ac4-861f-13b49b5695ac

📥 Commits

Reviewing files that changed from the base of the PR and between c321239 and 7faf57d.

📒 Files selected for processing (2)
  • packages/git/src/git.test.ts
  • packages/git/src/repo.ts
✅ Files skipped from review due to trivial changes (1)
  • packages/git/src/repo.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/git/src/git.test.ts

📝 Walkthrough

Walkthrough

Adds a workspace-level ".archon" search tier (repo → workspace → global) across workflow discovery, command/script resolution, and execution; introduces git-remote parsing helpers to derive workspace paths; and threads an optional workspaceSearchPath from the CLI through discovery and executor call paths.

Changes

Cohort / File(s) Summary
Git Remote Parsing
packages/git/src/repo.ts, packages/git/src/index.ts, packages/git/src/git.test.ts
Add parseOwnerRepoFromRemoteUrl (pure) and parseOwnerRepoFromGitRemote (async wrapper calling git remote get-url origin), re-export them, and add thorough tests validating formats and defensive checks.
Project Path Helpers
packages/paths/src/archon-paths.ts, packages/paths/src/index.ts
Add getProjectArchonDir(owner, repo) and re-export it to compute per-project .archon directories under user archon workspaces.
CLI Workflow Command
packages/cli/src/commands/workflow.ts
Add resolveWorkspaceSearchPath(cwd) to parse git remote owner/repo and compute a workspace search path; pass workspaceSearchPath into workflow loading.
Repository Registration
packages/core/src/handlers/clone.ts
Replace inline remote-url parsing with parseOwnerRepoFromRemoteUrl, preserving fallbacks when parsing returns null.
Workflow Discovery
packages/workflows/src/workflow-discovery.ts, packages/workflows/src/loader.test.ts
Extend discoverWorkflows/discoverWorkflowsWithConfig to accept workspaceSearchPath; load workspace workflows before repo; mirror global-tier error handling; add tests for workspace-only and precedence cases.
DAG Executor & Node Handlers
packages/workflows/src/dag-executor.ts, packages/workflows/src/dag-executor.test.ts, packages/workflows/src/dag-executor-script-global.test.ts
Thread optional workspaceSearchPath through executeDagWorkflow and node handlers; add workspace-tier .archon/scripts lookup between repo and user-global, pass workspace archon dir into variable substitution, and add tests covering discovery and precedence.
Prompt & Command Loading
packages/workflows/src/executor-shared.ts, packages/workflows/src/executor-shared.test.ts, packages/workflows/src/executor-shared-command-global.test.ts
Add workspaceSearchPath to loadCommandPrompt; insert workspace-tier command lookup; extend substituteWorkflowVariables/buildPromptWithContext with workspaceArchonDir; add tests for precedence and error reasons.
Executor Substitution Tests
packages/workflows/src/executor-shared.test.ts
Add tests verifying $WORKSPACE_ARCHON_DIR substitution when present and when omitted.
Package Test Config
packages/workflows/package.json
Add two Bun test entries to package test script to run new global-script test files.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI (workflow.ts)
    participant Git as Git Utils (repo.ts)
    participant Paths as Paths (getProjectRoot / getProjectArchonDir)
    participant Discovery as Workflow Discovery
    participant Executor as DAG Executor
    CLI->>Git: parseOwnerRepoFromGitRemote(cwd)
    Git->>Git: git remote get-url origin (with timeout) -> parseOwnerRepoFromRemoteUrl
    Git-->>CLI: {owner,repo} or null
    alt owner/repo returned
        CLI->>Paths: getProjectRoot(owner,repo)
        Paths-->>CLI: workspaceSearchPath
    else null
        CLI-->>CLI: workspaceSearchPath = undefined
    end
    CLI->>Discovery: discoverWorkflowsWithConfig(cwd, { workspaceSearchPath, globalSearchPath })
    Discovery->>Discovery: load bundled/defaults -> load global -> load workspace (if provided) -> load repo (repo overrides)
    Discovery-->>CLI: merged workflows (repo > workspace > global)
    CLI->>Executor: executeDagWorkflow(..., workspaceSearchPath)
    Executor->>Executor: for each script node: try repo/.archon/scripts
    alt workspaceSearchPath provided
        Executor->>Executor: then try workspace/.archon/scripts
    end
    Executor->>Executor: finally try global getArchonHome()/.archon/scripts
    Executor-->>CLI: execution result (with source-tier info)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

Possibly related PRs

Poem

🐰 I sniffed the remotes, found owner and repo,

I dug a new tier where workspace scripts grow.
Repo first, then workspace, then global at play—
Hopping through paths to make workflows obey! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(workflows): symmetric fallback chain — workspace + global tiers for scripts/commands' clearly and specifically summarizes the main feature: adding workspace and global tier fallback support for scripts and commands in workflows.
Description check ✅ Passed The PR description comprehensively covers all template sections: problem/motivation, what changed, what didn't change, UX flow diagrams, architecture diagrams with connection inventory, validation evidence, security impact, compatibility, human verification, side effects, rollback plan, and risks/mitigations.
Docstring Coverage ✅ Passed Docstring coverage is 88.46% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

dhcdata and others added 4 commits April 12, 2026 16:37
Workflows already resolve from ~/.archon/.archon/workflows/ when not
found in the repo (via globalSearchPath in discoverWorkflowsWithConfig),
but commands and scripts are still repo-local only. That gap blocks
anyone wanting to keep a user-level library of workflow building blocks
outside their team repo.

This adds a symmetric user-global tier:

- loadCommandPrompt (executor-shared.ts) searches <archonHome>/.archon/commands/
  after the repo loop, before falling through to app defaults.
- executeScriptNode (dag-executor.ts) probes <archonHome>/.archon/scripts/
  when the repo-local script lookup misses.
- Not-found error messages updated to list global paths.

Precedence is repo → global → bundled defaults, matching the existing
workflow-discovery order. No new flags, no schema changes.

Tests live in two new files (in their own bun test batches to keep
@archon/paths mocks isolated) covering: global-only hit, repo wins when
both exist, empty global command file, and missing-from-both.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…kflows

Adds a third resolution tier between repo-local and user-global, keyed by
the `{owner}/{repo}` of the project Archon is working against:

  1. repo-local       <cwd>/.archon/
  2. workspace (NEW)  ~/.archon/workspaces/<owner>/<repo>/.archon/
  3. user-global      ~/.archon/.archon/
  4. bundled defaults

Most specific wins, matching VSCode and git-config precedence. Unlocks
per-project automations that the team repo doesn't want committed but
that shouldn't apply to every project either.

Surgical changes:
- @archon/git: new `parseOwnerRepoFromRemoteUrl` (pure) and
  `parseOwnerRepoFromGitRemote` (async) helpers. Refactors the inline
  parser in core/handlers/clone.ts to share the pure helper.
- @archon/paths: one-line `getProjectArchonDir(owner, repo)` wrapper on
  top of the existing `getProjectRoot`.
- workflow-discovery: new optional `workspaceSearchPath` option, probed
  between global and repo tiers. Overrides by filename.
- executor-shared loadCommandPrompt: new optional `workspaceArchonDir`
  parameter, probed between repo and global loops.
- dag-executor executeScriptNode + executeNodeInternal: three-tier
  script lookup; `executeDagWorkflow` resolves the workspace dir once
  per run via `parseOwnerRepoFromGitRemote(cwd)` when the caller
  doesn't pre-compute it (silent null when no origin remote).
- CLI: computes `workspaceSearchPath` once before workflow discovery
  and forwards it to `discoverWorkflowsWithConfig`.

When owner/repo derivation fails (bare dir, no origin remote, git
missing), the workspace tier is silently skipped — two-tier behavior
is preserved.

Tests:
- @archon/git: pure parser covers HTTPS, SSH, ssh://, trailing .git,
  GitLab subgroups, empty/invalid input. Async wrapper tested against
  real tmpdir git init + remote add, plus no-origin and non-repo cases.
- executor-shared-command-global.test.ts: 5 new workspace-tier cases
  (ws-only hit, repo>workspace, workspace>global, omit preserves
  two-tier, not-found lists workspace paths).
- dag-executor-script-global.test.ts: 3 new cases mirroring the above
  for script nodes. Existing not-found message assertion updated to
  "(repo, workspace, or global)".
- loader.test.ts: 2 new cases for workflow-tier workspace search path.

Full `bun run validate` green.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The initial tier-3 implementation used two inconsistent conventions for
what the workspace-tier path param should contain:

- Workflow discovery expected a PROJECT ROOT and joined `.archon/workflows`
  onto it (matching how `globalSearchPath = getArchonHome()` works).
- `loadCommandPrompt` / `executeScriptNode` expected the `.archon` DIR
  directly and used a brittle `folder.replace(/^\.archon\//, '')` hack.

This made the CLI's `getProjectArchonDir(owner, repo)` resolve to
`~/.archon/workspaces/<owner>/<repo>/.archon` which then got joined with
`.archon/workflows` → `~/.archon/workspaces/<owner>/<repo>/.archon/.archon/workflows/`.
The double `.archon` made workflow discovery silently miss, which surfaced
as "Workflow 'giraffe-tick' not found" in end-to-end testing.

Unify on the project-root convention everywhere:

- CLI calls `getProjectRoot(owner, repo)` (no trailing `.archon`)
- `loadCommandPrompt` joins `workspaceSearchPath + folder + ${name}.md`
- `executeScriptNode` resolves `workspaceSearchPath + '.archon/scripts'`
- `executeDagWorkflow`'s internal fallback uses `getProjectRoot` and
  renames the internal var to `resolvedWorkspaceSearchPath` for clarity.

All parameter names change from `workspaceArchonDir` to `workspaceSearchPath`
to reinforce the new semantic (project root, not the `.archon` dir inside it).

`getProjectArchonDir` is left in place in `@archon/paths` — no longer used
internally, but kept as a public helper in case users want it for their own
tooling, and to preserve API reversibility.

Test fixtures updated to create files at `<tmpdir>/.archon/{commands,scripts}/`
instead of `<tmpdir>/{commands,scripts}/`. No new test cases — the existing
five command-tier and three script-tier cases cover the behavior.

Verified end-to-end: moving the giraffe config from `~/.archon/.archon/`
to `~/.archon/workspaces/HireArt/hireart_main/.archon/` and running
`archon workflow run giraffe-tick --cwd ~/code/hireart/hireart --branch
giraffe-EE-474` now discovers the workflow via the workspace tier and
script loading probes the workspace path correctly:

  workflow.discovery count=1 msg=workspace_workflows_loaded
  workflow.script-discovery count=12 dir=/Users/cjhowe/.archon/workspaces/HireArt/hireart_main/.archon/scripts

Full `bun run validate` green.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a workflow lives in the workspace-in-userspace tier
(~/.archon/workspaces/<owner>/<repo>/.archon/), `bash:` and `script:` nodes
that wrap `bun run` on named helper scripts can't find them via relative
paths — the worktree they execute in may not contain `.archon/` at all.
The tier-3 fallback covers script-name resolution but not shell-level
path references inside bash scripts.

Expose the resolved workspace `.archon` directory as a substitution
variable, mirroring how $ARTIFACTS_DIR and $BASE_BRANCH work:

- bash/script nodes can now write:
    bun run "$WORKSPACE_ARCHON_DIR/scripts/my-helper.ts"
- Empty string when the workspace tier isn't available (bare dir,
  no origin remote), so existing workflows that never use the
  variable are unaffected.

Changes:
- substituteWorkflowVariables gains an optional workspaceArchonDir param;
  buildPromptWithContext forwards it.
- executeBashNode, executeScriptNode, executeLoopNode (both prompt and
  until_bash call sites), and executeApprovalNode's on_reject prompt
  now derive `resolve(workspaceSearchPath, '.archon')` and pass it
  through. executeNodeInternal does the same via buildPromptWithContext.
- Two new unit tests: variable substitution with and without the param.

Caught during end-to-end testing of the giraffe automation against a
real Jira ticket — the save-plan bash wrapper failed to find
.archon/scripts/save-plan.ts because the Archon-managed worktree at
~/.archon/worktrees/.../task-giraffe-ee-474/ has no .archon/ directory
(the user's team repo intentionally keeps .archon/ out of the tracked
tree). The new variable lets the wrapper reference the workspace-tier
scripts via an absolute path.

Full `bun run validate` green.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/cli/src/commands/workflow.ts`:
- Around line 126-130: The resolveWorkspaceSearchPath function currently
forwards owner/repo from git.parseOwnerRepoFromGitRemote into getProjectRoot
without sanitization; update resolveWorkspaceSearchPath to validate that owner
and repo consist only of safe path segments (e.g., no "..", no path separators,
and match expected regex like /^[A-Za-z0-9._-]+$/) and only call
getProjectRoot(owner, repo) when both are valid; if validation fails, return
undefined to skip the workspace tier. Ensure you reference
resolveWorkspaceSearchPath, git.parseOwnerRepoFromGitRemote, and getProjectRoot
when making the change.

In `@packages/core/src/handlers/clone.ts`:
- Around line 388-396: The parsed owner/repo from
parseOwnerRepoFromRemoteUrl(remoteUrl) must be validated before assigning
name/ownerName and before calling ensureProjectStructure() or
getProjectSourcePath(); update the logic around ownerName/name to reject or
sanitize any parsed.owner or parsed.repo that contain dot-segments (".."), path
separators ("/" or "\"), absolute roots, empty strings, or other unsafe
characters (control chars), and if validation fails fall back to the safe
defaults (e.g., keep ownerName='_local' and name=repoName) so no untrusted token
can escape the workspace; apply this check where name and ownerName are set and
before passing ownerName into ensureProjectStructure and getProjectSourcePath.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1c0eb165-5a17-4988-a46d-14317fdc24ad

📥 Commits

Reviewing files that changed from the base of the PR and between eb75ab6 and 22b5c86.

📒 Files selected for processing (16)
  • packages/cli/src/commands/workflow.ts
  • packages/core/src/handlers/clone.ts
  • packages/git/src/git.test.ts
  • packages/git/src/index.ts
  • packages/git/src/repo.ts
  • packages/paths/src/archon-paths.ts
  • packages/paths/src/index.ts
  • packages/workflows/package.json
  • packages/workflows/src/dag-executor-script-global.test.ts
  • packages/workflows/src/dag-executor.test.ts
  • packages/workflows/src/dag-executor.ts
  • packages/workflows/src/executor-shared-command-global.test.ts
  • packages/workflows/src/executor-shared.test.ts
  • packages/workflows/src/executor-shared.ts
  • packages/workflows/src/loader.test.ts
  • packages/workflows/src/workflow-discovery.ts

Comment on lines +126 to +130
async function resolveWorkspaceSearchPath(cwd: string): Promise<string | undefined> {
const ownerRepo = await git.parseOwnerRepoFromGitRemote(cwd);
if (!ownerRepo) return undefined;
return getProjectRoot(ownerRepo.owner, ownerRepo.repo);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Harden workspace path resolution against unsafe remote segments.

Line 127-129 uses remote-derived owner/repo directly in getProjectRoot(). If origin contains malformed segments (e.g., ..), discovery can probe outside the intended workspace subtree. Validate/sanitize first, then skip workspace tier when invalid.

🛡️ Proposed fix
-import { createLogger, getArchonHome, getProjectRoot } from '@archon/paths';
+import { createLogger, getArchonHome, getProjectRoot, parseOwnerRepo } from '@archon/paths';
@@
 async function resolveWorkspaceSearchPath(cwd: string): Promise<string | undefined> {
   const ownerRepo = await git.parseOwnerRepoFromGitRemote(cwd);
   if (!ownerRepo) return undefined;
-  return getProjectRoot(ownerRepo.owner, ownerRepo.repo);
+  const safe = parseOwnerRepo(`${ownerRepo.owner}/${ownerRepo.repo.replace(/\.git$/, '')}`);
+  if (!safe) return undefined;
+  return getProjectRoot(safe.owner, safe.repo);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/cli/src/commands/workflow.ts` around lines 126 - 130, The
resolveWorkspaceSearchPath function currently forwards owner/repo from
git.parseOwnerRepoFromGitRemote into getProjectRoot without sanitization; update
resolveWorkspaceSearchPath to validate that owner and repo consist only of safe
path segments (e.g., no "..", no path separators, and match expected regex like
/^[A-Za-z0-9._-]+$/) and only call getProjectRoot(owner, repo) when both are
valid; if validation fails, return undefined to skip the workspace tier. Ensure
you reference resolveWorkspaceSearchPath, git.parseOwnerRepoFromGitRemote, and
getProjectRoot when making the change.

Comment on lines +388 to 396
// Try to build owner/repo name from remote URL via the shared parser
let name = repoName;
let ownerName = '_local';
if (remoteUrl) {
const cleaned = remoteUrl.replace(/\.git$/, '').replace(/\/+$/, '');
let workingRemote = cleaned;
if (cleaned.startsWith('git@github.com:')) {
workingRemote = cleaned.replace('git@github.com:', 'https://github.com/');
}
const parts = workingRemote.split('/');
const r = parts.pop();
const o = parts.pop();
if (o && r) {
name = `${o}/${r}`;
ownerName = o;
const parsed = parseOwnerRepoFromRemoteUrl(remoteUrl);
if (parsed) {
name = `${parsed.owner}/${parsed.repo}`;
ownerName = parsed.owner;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Validate parsed remote owner/repo before using them for project paths.

Line 392-396 trusts parseOwnerRepoFromRemoteUrl(remoteUrl) output and assigns ownerName, which is then used for ensureProjectStructure()/getProjectSourcePath(). A malformed remote (.., dot segments, or other unsafe tokens) can push path creation outside the intended workspace tree.

🔒 Proposed fix
-  if (remoteUrl) {
-    const parsed = parseOwnerRepoFromRemoteUrl(remoteUrl);
-    if (parsed) {
-      name = `${parsed.owner}/${parsed.repo}`;
-      ownerName = parsed.owner;
-    }
-  }
+  if (remoteUrl) {
+    const parsedRemote = parseOwnerRepoFromRemoteUrl(remoteUrl);
+    if (parsedRemote) {
+      const normalizedRepo = parsedRemote.repo.replace(/\.git$/, '');
+      const safe = parseOwnerRepo(`${parsedRemote.owner}/${normalizedRepo}`);
+      if (safe) {
+        name = `${safe.owner}/${safe.repo}`;
+        ownerName = safe.owner;
+      } else {
+        getLog().warn({ remoteUrl }, 'remote_owner_repo_invalid');
+      }
+    }
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/handlers/clone.ts` around lines 388 - 396, The parsed
owner/repo from parseOwnerRepoFromRemoteUrl(remoteUrl) must be validated before
assigning name/ownerName and before calling ensureProjectStructure() or
getProjectSourcePath(); update the logic around ownerName/name to reject or
sanitize any parsed.owner or parsed.repo that contain dot-segments (".."), path
separators ("/" or "\"), absolute roots, empty strings, or other unsafe
characters (control chars), and if validation fails fall back to the safe
defaults (e.g., keep ownerName='_local' and name=repoName) so no untrusted token
can escape the workspace; apply this check where name and ownerName are set and
before passing ownerName into ensureProjectStructure and getProjectSourcePath.

Post-rebase cleanup: dag-executor-script-global.test.ts was introduced on
an older dev and its mock used the pre-rename interface. Upstream has
since renamed IAssistantClient → IAgentProvider (coleam00#1116). This test file
is excluded from tsc by packages/workflows/tsconfig.json and its runtime
never exercises the AI dispatch path (script nodes use bun subprocess,
not deps.getAgentProvider), so the stale name was benign — but rename
it anyway so the file reads consistently with the rest of the workflows
test suite.

No functional change. `bun run validate` green.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dhcdata dhcdata force-pushed the feature/workspace-tier branch from 22b5c86 to c321239 Compare April 12, 2026 21:43
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
packages/cli/src/commands/workflow.ts (1)

126-129: ⚠️ Potential issue | 🔴 Critical

Validate remote-derived owner/repo before calling getProjectRoot().

This still trusts parseOwnerRepoFromGitRemote() as safe path input. A malformed remote can make the workspace tier probe outside ~/.archon/workspaces/... and load workflows from an unrelated local directory. Please sanitize/normalize owner and repo first and skip the workspace tier when either segment is invalid.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/cli/src/commands/workflow.ts` around lines 126 - 129,
resolveWorkspaceSearchPath trusts parseOwnerRepoFromGitRemote output and passes
owner/repo to getProjectRoot; validate and normalize both segments (e.g. reject
or sanitize any characters like path separators, .., or absolute paths, and
allow only the expected pattern such as [A-Za-z0-9._-]+) before calling
getProjectRoot(owner, repo). If either owner or repo fails validation, return
undefined (skip workspace tier). Reference resolveWorkspaceSearchPath,
parseOwnerRepoFromGitRemote, and getProjectRoot when locating where to add the
validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 2471-2475: The parsed ownerRepo from parseOwnerRepoFromGitRemote
must be validated before being passed to getProjectRoot: ensure ownerRepo.owner
and ownerRepo.repo are single path segments (no path separators like "/" or "\"
and not "." or "..") and reject any containing traversal or separator
characters; if validation fails, do not set resolvedWorkspaceSearchPath (leave
undefined or fall back to the existing behavior) and log a debug/warn via
getLog() showing the rejected remote. Add this check immediately after obtaining
ownerRepo in the resolvedWorkspaceSearchPath branch (before calling
getProjectRoot) so only sanitized owner/repo values are used to construct
workspaceSearchPath.
- Around line 1564-1607: The workspace/global discovery catch blocks currently
just warn and continue, which allows lower-precedence scripts to run; modify the
catch handling in the workspace discovery (around
workspaceScriptsDir/discoverScripts/finalScript) and the global discovery
(around globalScriptsDir/discoverScripts/finalScript) so that if discoveryErr is
not an ENOENT (i.e., it's a real unreadable/broken error) you fail the node
immediately instead of falling through: detect non-ENOENT errors from
discoverScripts, log them with getLog().error (including nodeId and the path),
and return/raise a node failure (propagate an error or set the node result to
failed) so the DAG executor stops considering lower tiers for that node rather
than continuing silently.

In `@packages/workflows/src/workflow-discovery.ts`:
- Around line 222-247: The current catch for access(workspaceWorkflowPath) only
logs non-ENOENT errors and continues, which silently breaks precedence; instead,
when err.code !== 'ENOENT' push a structured read error into allErrors (e.g.
allErrors.push({ type: 'read_error', source: 'workspace', path:
workspaceWorkflowPath, error: err })) and then either abort resolution by
re-throwing the error or return early to prevent using lower-priority workflows;
update the catch block around access/workflow loading (symbols: access,
workspaceWorkflowPath, loadWorkflowsFromDir, workflowsByFile, allErrors) to
include this new error entry and stop further processing on non-ENOENT failures.

---

Duplicate comments:
In `@packages/cli/src/commands/workflow.ts`:
- Around line 126-129: resolveWorkspaceSearchPath trusts
parseOwnerRepoFromGitRemote output and passes owner/repo to getProjectRoot;
validate and normalize both segments (e.g. reject or sanitize any characters
like path separators, .., or absolute paths, and allow only the expected pattern
such as [A-Za-z0-9._-]+) before calling getProjectRoot(owner, repo). If either
owner or repo fails validation, return undefined (skip workspace tier).
Reference resolveWorkspaceSearchPath, parseOwnerRepoFromGitRemote, and
getProjectRoot when locating where to add the validation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 75411410-a1fa-4fe9-8400-3e777d576d82

📥 Commits

Reviewing files that changed from the base of the PR and between 22b5c86 and c321239.

📒 Files selected for processing (16)
  • packages/cli/src/commands/workflow.ts
  • packages/core/src/handlers/clone.ts
  • packages/git/src/git.test.ts
  • packages/git/src/index.ts
  • packages/git/src/repo.ts
  • packages/paths/src/archon-paths.ts
  • packages/paths/src/index.ts
  • packages/workflows/package.json
  • packages/workflows/src/dag-executor-script-global.test.ts
  • packages/workflows/src/dag-executor.test.ts
  • packages/workflows/src/dag-executor.ts
  • packages/workflows/src/executor-shared-command-global.test.ts
  • packages/workflows/src/executor-shared.test.ts
  • packages/workflows/src/executor-shared.ts
  • packages/workflows/src/loader.test.ts
  • packages/workflows/src/workflow-discovery.ts
✅ Files skipped from review due to trivial changes (9)
  • packages/workflows/package.json
  • packages/workflows/src/executor-shared.test.ts
  • packages/core/src/handlers/clone.ts
  • packages/paths/src/index.ts
  • packages/workflows/src/loader.test.ts
  • packages/git/src/git.test.ts
  • packages/workflows/src/executor-shared-command-global.test.ts
  • packages/workflows/src/dag-executor.test.ts
  • packages/paths/src/archon-paths.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/git/src/repo.ts
  • packages/workflows/src/dag-executor-script-global.test.ts

Comment on lines +1564 to +1607
// Tier 2: workspace-in-userspace (~/.archon/workspaces/<owner>/<repo>/.archon/scripts/)
if (!scriptDef && workspaceSearchPath) {
const workspaceScriptsDir = resolve(workspaceSearchPath, '.archon', 'scripts');
if (workspaceScriptsDir !== scriptsDir) {
try {
const workspaceScripts = await discoverScripts(workspaceScriptsDir);
scriptDef = workspaceScripts.get(finalScript);
if (scriptDef) {
getLog().debug(
{ nodeId: node.id, scriptName: finalScript, source: 'workspace' },
'dag.script_loaded_workspace'
);
}
} catch (discoveryErr) {
getLog().warn(
{ err: discoveryErr as Error, workspaceScriptsDir },
'dag.script_workspace_discovery_failed'
);
}
}
}

// Tier 3: user-global (~/.archon/.archon/scripts/)
if (!scriptDef) {
const errorMsg = `Script node '${node.id}': named script '${finalScript}' not found in .archon/scripts/`;
const globalScriptsDir = resolve(getArchonHome(), '.archon', 'scripts');
if (globalScriptsDir !== scriptsDir) {
try {
const globalScripts = await discoverScripts(globalScriptsDir);
scriptDef = globalScripts.get(finalScript);
if (scriptDef) {
getLog().debug(
{ nodeId: node.id, scriptName: finalScript, source: 'global' },
'dag.script_loaded_global'
);
}
} catch (discoveryErr) {
// discoverScripts swallows ENOENT; surface any other error as a warning
// but still fall through to the not-found path below.
getLog().warn(
{ err: discoveryErr as Error, globalScriptsDir },
'dag.script_global_discovery_failed'
);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail on workspace/global script discovery errors instead of falling through.

These catch blocks warn and continue, so a broken or unreadable workspace tier is treated as "no script here". That lets a lower-precedence script with the same name run from a later tier, which breaks the advertised precedence chain. Return a node failure for non-ENOENT discovery errors instead of continuing the search.

As per coding guidelines, "Prefer throwing early with clear errors for unsupported/unsafe states rather than silently swallowing errors; never silently broaden permissions or capabilities (Fail Fast + Explicit Errors)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/dag-executor.ts` around lines 1564 - 1607, The
workspace/global discovery catch blocks currently just warn and continue, which
allows lower-precedence scripts to run; modify the catch handling in the
workspace discovery (around workspaceScriptsDir/discoverScripts/finalScript) and
the global discovery (around globalScriptsDir/discoverScripts/finalScript) so
that if discoveryErr is not an ENOENT (i.e., it's a real unreadable/broken
error) you fail the node immediately instead of falling through: detect
non-ENOENT errors from discoverScripts, log them with getLog().error (including
nodeId and the path), and return/raise a node failure (propagate an error or set
the node result to failed) so the DAG executor stops considering lower tiers for
that node rather than continuing silently.

Comment on lines +2471 to +2475
if (resolvedWorkspaceSearchPath === undefined) {
const ownerRepo = await parseOwnerRepoFromGitRemote(cwd);
if (ownerRepo) {
resolvedWorkspaceSearchPath = getProjectRoot(ownerRepo.owner, ownerRepo.repo);
getLog().debug(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Sanitize the parsed remote before turning it into workspaceSearchPath.

ownerRepo is used as trusted path input here and then threaded into command/script lookups later in this file. If the remote parser returns .. or separator-containing segments, getProjectRoot() can escape the workspace base and the executor may read or execute files from an arbitrary local .archon tree. Validate the pair first and drop the workspace tier when it is not a safe single-segment owner/repo.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/dag-executor.ts` around lines 2471 - 2475, The parsed
ownerRepo from parseOwnerRepoFromGitRemote must be validated before being passed
to getProjectRoot: ensure ownerRepo.owner and ownerRepo.repo are single path
segments (no path separators like "/" or "\" and not "." or "..") and reject any
containing traversal or separator characters; if validation fails, do not set
resolvedWorkspaceSearchPath (leave undefined or fall back to the existing
behavior) and log a debug/warn via getLog() showing the rejected remote. Add
this check immediately after obtaining ownerRepo in the
resolvedWorkspaceSearchPath branch (before calling getProjectRoot) so only
sanitized owner/repo values are used to construct workspaceSearchPath.

Comment on lines +222 to +247
if (options?.workspaceSearchPath) {
const [workflowFolderName] = archonPaths.getWorkflowFolderSearchPaths();
const workspaceWorkflowPath = join(options.workspaceSearchPath, workflowFolderName);
getLog().debug({ workspaceWorkflowPath }, 'searching_workspace_workflows');
try {
await access(workspaceWorkflowPath);
const workspaceResult = await loadWorkflowsFromDir(workspaceWorkflowPath);
for (const [filename, workflow] of workspaceResult.workflows) {
if (workflowsByFile.has(filename)) {
getLog().debug({ filename }, 'workspace_workflow_overrides_global');
}
// Same scope decision as global: classified as 'project' (not a separate
// 'workspace' source badge). Can be split later if the UI needs it.
workflowsByFile.set(filename, { workflow, source: 'project' });
}
allErrors.push(...workspaceResult.errors);
getLog().info({ count: workspaceResult.workflows.size }, 'workspace_workflows_loaded');
} catch (error) {
const err = error as NodeJS.ErrnoException;
if (err.code !== 'ENOENT') {
getLog().warn({ err, workspaceWorkflowPath }, 'workspace_workflows_access_error');
} else {
getLog().debug({ workspaceWorkflowPath }, 'workspace_workflows_not_found');
}
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't silently downgrade workspace access errors to "not found".

If access(workspaceWorkflowPath) fails with EACCES or another non-ENOENT error, this only logs a warning and keeps the already-loaded lower tiers active. That breaks the repo > workspace > global precedence contract and hides an actionable error from callers. Please add a read_error entry to allErrors here (or abort resolution) instead of continuing silently.

As per coding guidelines, "Prefer throwing early with clear errors for unsupported/unsafe states rather than silently swallowing errors; never silently broaden permissions or capabilities (Fail Fast + Explicit Errors)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/workflow-discovery.ts` around lines 222 - 247, The
current catch for access(workspaceWorkflowPath) only logs non-ENOENT errors and
continues, which silently breaks precedence; instead, when err.code !== 'ENOENT'
push a structured read error into allErrors (e.g. allErrors.push({ type:
'read_error', source: 'workspace', path: workspaceWorkflowPath, error: err }))
and then either abort resolution by re-throwing the error or return early to
prevent using lower-priority workflows; update the catch block around
access/workflow loading (symbols: access, workspaceWorkflowPath,
loadWorkflowsFromDir, workflowsByFile, allErrors) to include this new error
entry and stop further processing on non-ENOENT failures.

…abbit feedback)

Addresses two CodeRabbit review threads on coleam00#1144 flagging that
parseOwnerRepoFromRemoteUrl returns owner/repo strings straight from the
origin remote without any validation. A crafted remote URL — e.g.
`git remote set-url origin git@host:../evil` — could cause the three
downstream call sites (CLI workspace-tier resolver, clone.ts project
structure creation, dag-executor workspace tier fallback) to construct
paths that escape ~/.archon/workspaces/<owner>/<repo>/.

Fix lives in the pure parser so all three call sites benefit:

- New `isSafePathSegment` helper — rejects empty strings, `.`, `..`, and
  anything outside `/^[A-Za-z0-9._-]+$/` (the GitHub/GitLab-compatible
  slug character set). Defense in depth: matches what both forges
  actually allow while explicitly excluding traversal markers.
- `parseOwnerRepoFromRemoteUrl` applies the check to both owner and
  repo before returning. Invalid input becomes `null`, which every
  existing caller already handles via graceful fallback:
    · cli/workflow.ts → returns undefined, skips workspace tier
    · dag-executor → logs `dag.workspace_search_path_unavailable`
    · core/handlers/clone.ts → falls back to `_local`/dirname

Eight new test cases cover the attack surface:
  · `..` as owner segment
  · `..` as repo segment
  · `.` (single-dot) segments in both positions
  · backslash injection
  · whitespace injection
  · shell metacharacters (`;`, `$`, `|`)
  · null bytes
  · positive case: `my-org/my_repo.js.git` still parses correctly

All 155 @archon/git tests pass; full `bun run validate` green.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dhcdata
Copy link
Copy Markdown
Author

dhcdata commented Apr 12, 2026

Thanks @coderabbitai — both review threads are addressing the same real concern, and you were right to flag it. A crafted git remote set-url origin git@host:../evil would have let an attacker-controlled remote steer workspace path construction outside ~/.archon/workspaces/<owner>/<repo>/.

Fixed in 7faf57d — pushed a new commit on top of the branch (no force-push, history preserved).

What the fix does

Validation lives in the pure parser (parseOwnerRepoFromRemoteUrl) so all three call sites you flagged benefit from a single point of enforcement:

  • A new isSafePathSegment helper rejects empty strings, ., .., and anything outside /^[A-Za-z0-9._-]+$/ (the GitHub/GitLab-compatible slug character set).
  • parseOwnerRepoFromRemoteUrl applies the check to both owner and repo before returning. Invalid input becomes null.
  • Every existing caller already handles null via graceful fallback:
    • packages/cli/src/commands/workflow.ts:resolveWorkspaceSearchPath → returns undefined, workspace tier skipped
    • packages/workflows/src/dag-executor.ts:executeDagWorkflow → logs dag.workspace_search_path_unavailable
    • packages/core/src/handlers/clone.ts → falls back to _local/dirname

So adding validation at the parser layer auto-patches all three threads without each caller needing its own guard.

Test coverage

Eight new unit tests in packages/git/src/git.test.ts covering the attack surface:

Case Input Expected
.. as owner git@host:../evil.git null
.. as repo git@host:owner/.. null
. single-dot (owner) git@host:./repo null
. single-dot (repo) git@host:owner/. null
Backslash injection git@host:ow\ner/repo null
Whitespace injection git@host:own er/repo null
Shell metachars git@host:owner/re;po, $, | null
Null bytes git@host:owner/re\x00po null
Positive case (regression) git@host:my-org/my_repo.js.git {owner: 'my-org', repo: 'my_repo.js'}

155 @archon/git tests pass; full bun run validate green on the updated branch.

Note to any future maintainer: The choice of /^[A-Za-z0-9._-]+$/ matches the union of what GitHub and GitLab allow in user/org/repo slugs (alphanumerics plus -, _, .). If a future git forge supports other characters in path segments, extend the regex explicitly rather than loosening it — the whole point is that unknown characters are untrusted until proven safe.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 12, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the current state of the fix in packages/git/src/repo.ts
cat packages/git/src/repo.ts

echo "---"
# Also check the new test cases
grep -n "isSafePathSegment\|path.traversal\|\.\.\/\|backslash\|whitespace\|null byte\|shell meta\|evil\|attack\|owner/\.\." packages/git/src/git.test.ts | head -60

Length of output: 15289


@dhcdata: The fix looks correct and well-structured. Here's a quick verification of the key implementation details:

isSafePathSegment analysis:

  • The explicit s === '.' and s === '..' pre-checks are not redundant — both . and .. do satisfy SAFE_SLUG_PATTERN (since . is in the character class), so these guards are necessary and correct.
  • The allowlist regex /^[A-Za-z0-9._-]+$/ is tight and appropriate. Shell metacharacters (;, $, |), path separators, null bytes, whitespace, and backslashes are all correctly rejected.
  • Segments like .hidden (dot-prefixed but not traversal) technically pass the regex. This is benign — neither GitHub nor GitLab permit org/repo names starting with a dot, so it can't be exploited in practice.

Single enforcement point works correctly:
The three call sites (workflow.ts:resolveWorkspaceSearchPath, dag-executor.ts:executeDagWorkflow, clone.ts) all already treat null as a graceful fallback, so the parser-layer validation patches all paths without requiring individual call-site guards. That's the right design choice.

Test coverage for the eight new cases in git.test.ts confirms all the attack vectors are blocked.

The fix is solid.


✏️ Learnings added
Learnt from: dhcdata
URL: https://github.com/coleam00/Archon/pull/1144

Timestamp: 2026-04-12T22:05:38.426Z
Learning: In `packages/git/src/repo.ts`, `parseOwnerRepoFromRemoteUrl` uses an `isSafePathSegment` helper with allowlist regex `/^[A-Za-z0-9._-]+$/` plus explicit rejection of `.` and `..` (the pre-checks are necessary because `.` and `..` both match the regex character class). Applied to owner and repo before returning, preventing path traversal via crafted git remote URLs into `~/.archon/workspaces/<owner>/<repo>/`.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-12T17:32:37.661Z
Learning: Applies to packages/adapters/src/forge/github/**/*.ts : GitHub mention detection: parse `archon` in issue/PR comments only (not descriptions); events: `issue_comment` only; descriptions often contain examples or documentation and are NOT command invocations

Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-12T17:32:37.660Z
Learning: Applies to **/*.ts : Use `archon/git` functions for git operations; use `execFileAsync` (not `exec`) when calling git directly; never run `git clean -fd` (use `git checkout .` instead to avoid permanent deletion of untracked files)

@dhcdata
Copy link
Copy Markdown
Author

dhcdata commented Apr 12, 2026

Cross-reference: related PRs and issues

Wanted to flag three open items I found while reviewing related work on the repo. Two of them expose a real gap in this PR that I want to be transparent about.

#1105"Add ~/.archon/.archon/commands/ as a global command search path (mirrors global workflow behavior)" (@dancehands)

This issue asks for exactly what commit 1 of this PR delivers (tier 2 for commands/scripts). Worth closing with this PR if it lands.

One thing #1105 mentions that this PR does not fix: configuredFolder from commands.folder in .archon/config.yaml doesn't expand ~, so commands: folder: ~/my-commands silently fails. It's a one-line expandTilde() call in loadCommandPrompt — I'll fix it in a follow-up commit unless a reviewer would rather keep this PR focused and open it as a separate change.

#1138 / #1136"Global workflows invisible to webhook commands and web UI" (@jonasvanderhaegen)

@jonasvanderhaegen's #1138 identifies that 4 callers of discoverWorkflowsWithConfig don't pass globalSearchPath, so global workflows at ~/.archon/.archon/workflows/ are invisible to webhook chat commands (@archon /workflow run X) and the Web UI workflow list. #1136 fixes it in 13 surgical lines.

This PR has the same bug pattern for the new workspaceSearchPath option. I only plumbed it through one caller — cli/commands/workflow.ts. The other five still pass no options:

Caller globalSearchPath workspaceSearchPath (this PR)
cli/commands/workflow.ts ✅ upstream ✅ this PR
cli/commands/validate.ts:88
core/orchestrator/orchestrator-agent.ts:1330
core/orchestrator/orchestrator-agent.ts:384, 428 ✅ upstream
core/handlers/command-handler.ts:566, 612, 807 ❌ → #1136
server/routes/api.ts:1770 ❌ → #1136

In practice: a user running @archon /workflow run my-workspace-tier-workflow from a GitHub comment, opening the Web UI workflow list, or running archon validate workflows won't see workspace-tier workflows even with the file in place on disk. My giraffe end-to-end test passed because I only exercised the CLI caller — the gap was masked.

Proposed path forward

I think the cleanest ordering is:

  1. Let fix: pass globalSearchPath to command handler workflow discovery #1136 merge first as-is. It's a focused 13-line fix to a real bug that's orthogonal to the new tier introduced here.
  2. I'll rebase this PR over fix: pass globalSearchPath to command handler workflow discovery #1136 and add a follow-up commit that passes workspaceSearchPath to the same 4 call sites fix: pass globalSearchPath to command handler workflow discovery #1136 touches, plus validate.ts and orchestrator-agent.ts:1330 which fix: pass globalSearchPath to command handler workflow discovery #1136 doesn't reach.
  3. That commit will also fix the expandTilde(configuredFolder) gap from Add ~/.archon/.archon/commands/ as a global command search path (mirrors global workflow behavior) #1105.

Alternatively, if the maintainer would rather have one PR that solves all of it, I can supersede #1136: extend this PR to cover all 6 call sites (both globalSearchPath where missing and workspaceSearchPath everywhere) and leave @jonasvanderhaegen's PR to be closed with credit in the commit message. I'd prefer option 1 because #1136 is a textbook example of a small focused fix and shouldn't get bundled into a larger feature PR, but happy to do whatever the maintainer prefers.

#1117 (@joelsb) and #1104 (@truck0321)

Neither directly overlaps. #1117 asks a different question ("where do worktrees physically live?") and #1104 parses git remote URLs but for a different purpose (forge detection) — it could optionally reuse parseOwnerRepoFromRemoteUrl from this PR if #1104 lands second, but there's no merge-order conflict.


Let me know which path you'd prefer for the #1136 overlap and I'll post follow-up commits accordingly.

@Wirasm
Copy link
Copy Markdown
Collaborator

Wirasm commented Apr 17, 2026

Related to #1105 — overlapping area or partial fix.

@Wirasm
Copy link
Copy Markdown
Collaborator

Wirasm commented Apr 17, 2026

Related to #1138 — overlapping area or partial fix.

@Wirasm
Copy link
Copy Markdown
Collaborator

Wirasm commented Apr 21, 2026

Thanks for the thoughtful proposal, @dhcdata — closing because most of this is now shipped and the remaining novelty solves a problem Archon already solves via an existing primitive.

What's already in dev (overlaps with this PR)

PR #1315 landed the "user-global" tier for all three resource types, and also adjusts the path to be a direct child of ~/.archon/:

  • Commands: ~/.archon/commands/ (source label: global)
  • Scripts: ~/.archon/scripts/
  • Workflows: ~/.archon/workflows/ (moved from the old ~/.archon/.archon/workflows/ — a warn-once deprecation fires with the exact mv command if the old path is detected)
  • 1-level subfolder support for all three (e.g. ~/.archon/commands/triage/foo.md)
  • GET /api/workflows and GET /api/commands return source: 'project' | 'global' | 'bundled' for Web UI badging

This covers the "Commands and scripts are repo-local only — no way to keep a library outside a specific repo" problem.

Why the workspace-in-userspace tier isn't needed

Your PR body frames the core use case as:

Gitignoring .archon/ breaks worktree-based runs because git worktree add only copies tracked files, so worktree-based runs fail with "workflow not found."

This is exactly what the existing worktree.copyFiles primitive in .archon/config.yaml is designed to solve. After git worktree add creates the worktree, copyWorktreeFiles() performs a filesystem copy of named files/dirs from the canonical repo into the worktree — including gitignored content, because it's raw fs/promises cp({ recursive: true }), not a git operation.

Crucially, .archon/ is in the default copy list — see packages/isolation/src/providers/worktree.ts:830:

```ts
const defaultCopyFiles = ['.archon'];
```

So the "gitignored + worktree" flow already works end-to-end without any config:

  1. User gitignores .archon/ locally ✅
  2. Archon copies .archon/ into every new worktree automatically ✅
  3. Workflows/commands/scripts resolve from the worktree's <cwd>/.archon/ tier-1 ✅

For other gitignored resources (.env, .vscode, data fixtures), users extend via:

```yaml

.archon/config.yaml

worktree:
copyFiles:
- .env
- .vscode
- data/fixtures/
```

See packages/docs-web/src/content/docs/reference/configuration.md:122-134.

On the layering cost

A fourth resolution tier is a durable cost — every workflow author mentally maps <cwd>/.archon/ vs ~/.archon/workspaces/<owner>/<repo>/.archon/ vs ~/.archon/ vs bundled; debug paths multiply; docs have to explain the precedence forever. The worktree.copyFiles mechanism achieves the same outcome with one existing tier + one existing knob, and keeps the mental model flat.

Credits

The three things from this PR that are still worth landing on their own:

  1. parseOwnerRepoFromRemoteUrl / parseOwnerRepoFromGitRemote in @archon/git — genuine dedup opportunity vs. the inline parser in core/handlers/clone.ts. Feel free to open a small focused refactor PR for just this piece.
  2. The path-traversal test hardening you added to packages/git/src/git.test.ts — same: valuable as a small PR.
  3. $WORKSPACE_ARCHON_DIR — not needed without the workspace tier, but the general idea of "give bash/script nodes a stable absolute path to a persistent per-project helper dir" could be revisited if demand emerges (e.g. a workspace: section in config).

Thanks again for the careful writeup and diagrams — really appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants