Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 17 additions & 1 deletion packages/cli/src/commands/workflow.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ import {
} from '@archon/core';
import { WORKFLOW_EVENT_TYPES, type WorkflowEventType } from '@archon/workflows/store';
import { configureIsolation, getIsolationProvider } from '@archon/isolation';
import { createLogger, getArchonHome } from '@archon/paths';
import { createLogger, getArchonHome, getProjectRoot } from '@archon/paths';
import { createWorkflowDeps } from '@archon/core/workflows/store-adapter';
import { discoverWorkflowsWithConfig } from '@archon/workflows/workflow-discovery';
import { resolveWorkflowName } from '@archon/workflows/router';
Expand Down Expand Up @@ -115,14 +115,30 @@ function renderWorkflowEvent(event: WorkflowEmitterEvent, verbose: boolean): voi
}
}

/**
* Compute the per-project workspace search path (~/.archon/workspaces/<owner>/<repo>)
* for a cwd, or undefined when the cwd has no recognizable origin remote.
* Silently skips the workspace tier in that case — two-tier behavior preserved.
*
* The returned path is the PROJECT ROOT, not the `.archon` subdir — matching
* how `globalSearchPath` works. Callers append `.archon/{workflows,commands,scripts}`.
*/
async function resolveWorkspaceSearchPath(cwd: string): Promise<string | undefined> {
const ownerRepo = await git.parseOwnerRepoFromGitRemote(cwd);
if (!ownerRepo) return undefined;
return getProjectRoot(ownerRepo.owner, ownerRepo.repo);
}
Comment on lines +126 to +130
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Harden workspace path resolution against unsafe remote segments.

Line 127-129 uses remote-derived owner/repo directly in getProjectRoot(). If origin contains malformed segments (e.g., ..), discovery can probe outside the intended workspace subtree. Validate/sanitize first, then skip workspace tier when invalid.

🛡️ Proposed fix
-import { createLogger, getArchonHome, getProjectRoot } from '@archon/paths';
+import { createLogger, getArchonHome, getProjectRoot, parseOwnerRepo } from '@archon/paths';
@@
 async function resolveWorkspaceSearchPath(cwd: string): Promise<string | undefined> {
   const ownerRepo = await git.parseOwnerRepoFromGitRemote(cwd);
   if (!ownerRepo) return undefined;
-  return getProjectRoot(ownerRepo.owner, ownerRepo.repo);
+  const safe = parseOwnerRepo(`${ownerRepo.owner}/${ownerRepo.repo.replace(/\.git$/, '')}`);
+  if (!safe) return undefined;
+  return getProjectRoot(safe.owner, safe.repo);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/cli/src/commands/workflow.ts` around lines 126 - 130, The
resolveWorkspaceSearchPath function currently forwards owner/repo from
git.parseOwnerRepoFromGitRemote into getProjectRoot without sanitization; update
resolveWorkspaceSearchPath to validate that owner and repo consist only of safe
path segments (e.g., no "..", no path separators, and match expected regex like
/^[A-Za-z0-9._-]+$/) and only call getProjectRoot(owner, repo) when both are
valid; if validation fails, return undefined to skip the workspace tier. Ensure
you reference resolveWorkspaceSearchPath, git.parseOwnerRepoFromGitRemote, and
getProjectRoot when making the change.


/**
* Load workflows from cwd with standardized error handling.
* Returns the WorkflowLoadResult with both workflows and errors.
*/
async function loadWorkflows(cwd: string): Promise<WorkflowLoadResult> {
try {
const workspaceSearchPath = await resolveWorkspaceSearchPath(cwd);
return await discoverWorkflowsWithConfig(cwd, loadConfig, {
globalSearchPath: getArchonHome(),
...(workspaceSearchPath ? { workspaceSearchPath } : {}),
});
} catch (error) {
const err = error as Error;
Expand Down
19 changes: 6 additions & 13 deletions packages/core/src/handlers/clone.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ import { access, rm } from 'fs/promises';
import { join, basename, resolve } from 'path';
import * as codebaseDb from '../db/codebases';
import { sanitizeError } from '../utils/credential-sanitizer';
import { execFileAsync } from '@archon/git';
import { execFileAsync, parseOwnerRepoFromRemoteUrl } from '@archon/git';
import {
expandTilde,
getCommandFolderSearchPaths,
Expand Down Expand Up @@ -385,21 +385,14 @@ export async function registerRepository(
// Extract repo name from directory name
const repoName = basename(localPath);

// Try to build owner/repo name from remote URL
// Try to build owner/repo name from remote URL via the shared parser
let name = repoName;
let ownerName = '_local';
if (remoteUrl) {
const cleaned = remoteUrl.replace(/\.git$/, '').replace(/\/+$/, '');
let workingRemote = cleaned;
if (cleaned.startsWith('git@github.com:')) {
workingRemote = cleaned.replace('git@github.com:', 'https://github.com/');
}
const parts = workingRemote.split('/');
const r = parts.pop();
const o = parts.pop();
if (o && r) {
name = `${o}/${r}`;
ownerName = o;
const parsed = parseOwnerRepoFromRemoteUrl(remoteUrl);
if (parsed) {
name = `${parsed.owner}/${parsed.repo}`;
ownerName = parsed.owner;
}
Comment on lines +388 to 396
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Validate parsed remote owner/repo before using them for project paths.

Line 392-396 trusts parseOwnerRepoFromRemoteUrl(remoteUrl) output and assigns ownerName, which is then used for ensureProjectStructure()/getProjectSourcePath(). A malformed remote (.., dot segments, or other unsafe tokens) can push path creation outside the intended workspace tree.

🔒 Proposed fix
-  if (remoteUrl) {
-    const parsed = parseOwnerRepoFromRemoteUrl(remoteUrl);
-    if (parsed) {
-      name = `${parsed.owner}/${parsed.repo}`;
-      ownerName = parsed.owner;
-    }
-  }
+  if (remoteUrl) {
+    const parsedRemote = parseOwnerRepoFromRemoteUrl(remoteUrl);
+    if (parsedRemote) {
+      const normalizedRepo = parsedRemote.repo.replace(/\.git$/, '');
+      const safe = parseOwnerRepo(`${parsedRemote.owner}/${normalizedRepo}`);
+      if (safe) {
+        name = `${safe.owner}/${safe.repo}`;
+        ownerName = safe.owner;
+      } else {
+        getLog().warn({ remoteUrl }, 'remote_owner_repo_invalid');
+      }
+    }
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/handlers/clone.ts` around lines 388 - 396, The parsed
owner/repo from parseOwnerRepoFromRemoteUrl(remoteUrl) must be validated before
assigning name/ownerName and before calling ensureProjectStructure() or
getProjectSourcePath(); update the logic around ownerName/name to reject or
sanitize any parsed.owner or parsed.repo that contain dot-segments (".."), path
separators ("/" or "\"), absolute roots, empty strings, or other unsafe
characters (control chars), and if validation fails fall back to the safe
defaults (e.g., keep ownerName='_local' and name=repoName) so no untrusted token
can escape the workspace; apply this check where name and ownerName are set and
before passing ownerName into ensureProjectStructure and getProjectSourcePath.

}

Expand Down
146 changes: 146 additions & 0 deletions packages/git/src/git.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -1894,4 +1894,150 @@ branch refs/heads/feature/auth
);
});
});

describe('parseOwnerRepoFromRemoteUrl (pure)', () => {
test('parses HTTPS GitHub URL', () => {
expect(git.parseOwnerRepoFromRemoteUrl('https://github.com/owner/repo')).toEqual({
owner: 'owner',
repo: 'repo',
});
});

test('parses HTTPS GitHub URL with trailing .git', () => {
expect(git.parseOwnerRepoFromRemoteUrl('https://github.com/owner/repo.git')).toEqual({
owner: 'owner',
repo: 'repo',
});
});

test('parses SSH GitHub URL', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@github.com:owner/repo.git')).toEqual({
owner: 'owner',
repo: 'repo',
});
});

test('parses SSH URL from non-github host', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@gitlab.example.com:team/project.git')).toEqual({
owner: 'team',
repo: 'project',
});
});

test('parses ssh:// scheme', () => {
expect(git.parseOwnerRepoFromRemoteUrl('ssh://git@github.com/owner/repo.git')).toEqual({
owner: 'owner',
repo: 'repo',
});
});

test('strips trailing slashes', () => {
expect(git.parseOwnerRepoFromRemoteUrl('https://github.com/owner/repo/')).toEqual({
owner: 'owner',
repo: 'repo',
});
});

test('returns subgroup/repo for GitLab subgroups (two-level slug)', () => {
// GitLab allows nested groups. Our workspace layout is two-level, so we
// take the last two segments. This is a deliberate simplification.
expect(git.parseOwnerRepoFromRemoteUrl('https://gitlab.com/group/subgroup/repo.git')).toEqual(
{ owner: 'subgroup', repo: 'repo' }
);
});

test('returns null for empty string', () => {
expect(git.parseOwnerRepoFromRemoteUrl('')).toBeNull();
expect(git.parseOwnerRepoFromRemoteUrl(' ')).toBeNull();
});

test('returns null for single-segment input', () => {
expect(git.parseOwnerRepoFromRemoteUrl('repo')).toBeNull();
});

// ─── Path-traversal safety (defense in depth) ─────────────────────────
// These cases simulate what a crafted `git remote set-url origin ...`
// could inject. The parser must reject any owner/repo segment that
// could steer downstream path construction outside ~/.archon/workspaces/.

test('rejects .. as owner segment', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@host:../evil.git')).toBeNull();
});

test('rejects .. as repo segment', () => {
// `git@host:owner/..` normalizes to `https://__ssh__/owner/..`, the split
// produces owner='owner' repo='..' — must be rejected.
expect(git.parseOwnerRepoFromRemoteUrl('git@host:owner/..')).toBeNull();
});

test('rejects . (single-dot) segments', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@host:./repo')).toBeNull();
expect(git.parseOwnerRepoFromRemoteUrl('git@host:owner/.')).toBeNull();
});

test('rejects segments containing backslash', () => {
// An attacker-controlled remote URL could embed a backslash which on
// Windows or weird normalization paths is a separator.
expect(git.parseOwnerRepoFromRemoteUrl('git@host:ow\\ner/repo')).toBeNull();
});

test('rejects segments containing whitespace', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@host:own er/repo')).toBeNull();
expect(git.parseOwnerRepoFromRemoteUrl('git@host:owner/re\tpo')).toBeNull();
});

test('rejects segments containing shell metacharacters', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@host:owner/re;po')).toBeNull();
expect(git.parseOwnerRepoFromRemoteUrl('git@host:owner/re$po')).toBeNull();
expect(git.parseOwnerRepoFromRemoteUrl('git@host:owner/re|po')).toBeNull();
});

test('rejects segments containing null bytes', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@host:owner/re\x00po')).toBeNull();
});

test('accepts common legitimate slug characters', () => {
expect(git.parseOwnerRepoFromRemoteUrl('git@host:my-org/my_repo.js.git')).toEqual({
owner: 'my-org',
repo: 'my_repo.js',
});
});
});

describe('parseOwnerRepoFromGitRemote (async)', () => {
test('returns {owner, repo} for a real git repo with https origin', async () => {
const { execFileAsync } = git;
await execFileAsync('git', ['-C', testDir, 'init']);
await execFileAsync('git', [
'-C',
testDir,
'remote',
'add',
'origin',
'https://github.com/test-owner/test-repo.git',
]);

const result = await git.parseOwnerRepoFromGitRemote(testDir);
expect(result).toEqual({ owner: 'test-owner', repo: 'test-repo' });
});

test('returns null for a git repo with no origin', async () => {
const { execFileAsync } = git;
await execFileAsync('git', ['-C', testDir, 'init']);
// No remote added
const result = await git.parseOwnerRepoFromGitRemote(testDir);
expect(result).toBeNull();
});

test('returns null when cwd is not a git repo', async () => {
const nonRepoDir = join(tmpdir(), `non-repo-${String(Date.now())}`);
await realMkdir(nonRepoDir, { recursive: true });
try {
const result = await git.parseOwnerRepoFromGitRemote(nonRepoDir);
expect(result).toBeNull();
} finally {
await rm(nonRepoDir, { recursive: true, force: true });
}
});
});
});
2 changes: 2 additions & 0 deletions packages/git/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,8 @@ export {
export {
findRepoRoot,
getRemoteUrl,
parseOwnerRepoFromGitRemote,
parseOwnerRepoFromRemoteUrl,
syncWorkspace,
cloneRepository,
syncRepository,
Expand Down
101 changes: 101 additions & 0 deletions packages/git/src/repo.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,107 @@ export async function findRepoRoot(startPath: string): Promise<RepoPath | null>
}
}

/**
* Safe path segment pattern for owner/repo slugs. Rejects anything that
* could escape `~/.archon/workspaces/<owner>/<repo>/` — path separators,
* null bytes, dot segments, and control characters.
*
* Matches the character set GitHub and GitLab both allow in user/org and
* repo names (alphanumerics plus `-`, `_`, `.`). Strings containing only
* dots (`.`, `..`) are explicitly rejected by the additional check in
* `isSafePathSegment` below.
*/
const SAFE_SLUG_PATTERN = /^[A-Za-z0-9._-]+$/;

/**
* Return true iff `s` is safe to use as a single path segment when
* constructing a `~/.archon/workspaces/<owner>/<repo>/` directory path.
*
* Rejects:
* - Empty strings
* - `.` and `..` (directory traversal)
* - Strings containing path separators, control characters, or other
* characters outside the GitHub/GitLab-compatible slug character set
*/
function isSafePathSegment(s: string): boolean {
if (!s) return false;
if (s === '.' || s === '..') return false;
return SAFE_SLUG_PATTERN.test(s);
}

/**
* Parse a git remote URL into an `{ owner, repo }` slug suitable for use
* under `~/.archon/workspaces/`. Pure function — no I/O, safe to reuse.
*
* Handles common URL shapes:
* - `https://github.com/owner/repo(.git)?`
* - `git@github.com:owner/repo(.git)?`
* - `ssh://git@host/owner/repo(.git)?`
* - `https://gitlab.com/group/subgroup/repo.git` (returns `subgroup/repo`;
* the leading group segment is dropped because the workspace layout is
* two-level: `~/.archon/workspaces/<owner>/<repo>/`)
*
* Returns `null` for unparseable URLs **or** for URLs whose owner/repo
* segments contain anything outside the GitHub/GitLab-compatible safe
* slug character set (see `isSafePathSegment`). This prevents a crafted
* remote URL (`git remote set-url origin git@host:../../evil`) from
* steering workspace path construction outside `~/.archon/workspaces/`.
*/
export function parseOwnerRepoFromRemoteUrl(url: string): { owner: string; repo: string } | null {
const trimmed = url.trim();
if (!trimmed) return null;

// Strip trailing .git and any trailing slashes
const cleaned = trimmed.replace(/\.git$/, '').replace(/\/+$/, '');

// SSH form `git@host:path` → normalize so a single split works
const normalized =
cleaned.startsWith('git@') && cleaned.includes(':') && !cleaned.includes('://')
? cleaned.replace(/^[^@]+@[^:]+:/, 'https://__ssh__/')
: cleaned;

const parts = normalized.split('/').filter(Boolean);
const repo = parts.pop();
const owner = parts.pop();
if (!owner || !repo) return null;

// Defense-in-depth: reject any segment that could be used for directory
// traversal or contains unsafe characters. Callers rely on the result
// being usable as literal path components without further sanitization.
if (!isSafePathSegment(owner) || !isSafePathSegment(repo)) return null;

return { owner, repo };
}

/**
* Fetch `origin` for the repo containing `cwd` and parse it into an
* `{ owner, repo }` slug suitable for use under `~/.archon/workspaces/`.
*
* Returns `null` on ANY failure (not a git repo, no origin, unparseable URL,
* git binary missing, permission denied). Callers should treat null as
* "this cwd is not a recognizable project" and fall back silently — no
* throwing, no logging at warn/error level.
*
* Unlike `getRemoteUrl`, this takes a plain string cwd (not a branded
* `RepoPath`) so it is ergonomic to call from CLI entry points and workflow
* execution where the cwd has not yet been type-validated.
*/
export async function parseOwnerRepoFromGitRemote(
cwd: string
): Promise<{ owner: string; repo: string } | null> {
let stdout: string;
try {
const result = await execFileAsync('git', ['-C', cwd, 'remote', 'get-url', 'origin'], {
timeout: 10000,
});
stdout = result.stdout;
} catch {
// Not a repo, no remote, git missing, etc. — silent null.
return null;
}
return parseOwnerRepoFromRemoteUrl(stdout);
}

/**
* Get the remote URL for origin (if it exists)
* Returns null if no remote is configured
Expand Down
14 changes: 14 additions & 0 deletions packages/paths/src/archon-paths.ts
Original file line number Diff line number Diff line change
Expand Up @@ -268,6 +268,20 @@ export function getProjectLogsPath(owner: string, repo: string): string {
return join(getProjectRoot(owner, repo), 'logs');
}

/**
* Get the per-project, per-user Archon config directory.
* Returns: ~/.archon/workspaces/owner/repo/.archon/
*
* This is the third tier in workflow/command/script resolution:
* repo-local (`<cwd>/.archon/`) > this > user-global (`~/.archon/.archon/`) > defaults.
*
* Lets a user keep per-project automation outside the team repo (which
* doesn't want it committed) without making it apply to every project.
*/
export function getProjectArchonDir(owner: string, repo: string): string {
return join(getProjectRoot(owner, repo), '.archon');
}

/**
* Get the artifacts directory for a specific workflow run.
* Returns: ~/.archon/workspaces/owner/repo/artifacts/runs/{id}/
Expand Down
1 change: 1 addition & 0 deletions packages/paths/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ export {
getProjectWorktreesPath,
getProjectArtifactsPath,
getProjectLogsPath,
getProjectArchonDir,
getRunArtifactsPath,
getRunLogPath,
resolveProjectRootFromCwd,
Expand Down
2 changes: 1 addition & 1 deletion packages/workflows/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@
"./test-utils": "./src/test-utils.ts"
},
"scripts": {
"test": "bun test src/dag-executor.test.ts && bun test src/loader.test.ts && bun test src/logger.test.ts && bun test src/condition-evaluator.test.ts && bun test src/event-emitter.test.ts && bun test src/executor-shared.test.ts && bun test src/executor.test.ts && bun test src/executor-preamble.test.ts && bun test src/defaults/ src/model-validation.test.ts src/router.test.ts src/utils/ src/hooks.test.ts && bun test src/validation-parser.test.ts src/schemas.test.ts src/command-validation.test.ts && bun test src/validator.test.ts && bun test src/script-discovery.test.ts && bun test src/runtime-check.test.ts && bun test src/script-node-deps.test.ts",
"test": "bun test src/dag-executor.test.ts && bun test src/loader.test.ts && bun test src/logger.test.ts && bun test src/condition-evaluator.test.ts && bun test src/event-emitter.test.ts && bun test src/executor-shared.test.ts && bun test src/executor-shared-command-global.test.ts && bun test src/executor.test.ts && bun test src/executor-preamble.test.ts && bun test src/defaults/ src/model-validation.test.ts src/router.test.ts src/utils/ src/hooks.test.ts && bun test src/validation-parser.test.ts src/schemas.test.ts src/command-validation.test.ts && bun test src/validator.test.ts && bun test src/script-discovery.test.ts && bun test src/runtime-check.test.ts && bun test src/script-node-deps.test.ts && bun test src/dag-executor-script-global.test.ts",
"type-check": "bun x tsc --noEmit"
},
"dependencies": {
Expand Down
Loading