26 changes: 25 additions & 1 deletion gitnexus/src/cli/clean.ts
@@ -6,7 +6,13 @@
*/

import fs from 'fs/promises';
import { findRepo, unregisterRepo, listRegisteredRepos } from '../storage/repo-manager.js';
import {
  findRepo,
  unregisterRepo,
  listRegisteredRepos,
  assertSafeStoragePath,
  UnsafeStoragePathError,
} from '../storage/repo-manager.js';

export const cleanCommand = async (options?: { force?: boolean; all?: boolean }) => {
// --all flag: clean all indexed repos
@@ -27,6 +33,24 @@ export const cleanCommand = async (options?: { force?: boolean; all?: boolean })

const entries = await listRegisteredRepos();
for (const entry of entries) {
  // Safety guard (#1003 review, @magyargergo): same rationale as
  // remove.ts. `~/.gitnexus/registry.json` is user-writable, so a
  // corrupted or hand-edited entry could point storagePath at the
  // repo root, an empty string, or anywhere else, and
  // fs.rm(recursive: true) on any of those would be catastrophic.
  // Skip poisoned entries without touching disk, but keep going
  // through the rest of the registry (preserves the existing
  // per-repo error-tolerance semantics of `clean --all`).
  try {
    assertSafeStoragePath(entry);
  } catch (err) {
    if (err instanceof UnsafeStoragePathError) {
      console.error(`Refusing to clean ${entry.name}: ${err.message}`);
      continue;
    }
    throw err;
  }

  try {
    await fs.rm(entry.storagePath, { recursive: true, force: true });
    await unregisterRepo(entry.path);
9 changes: 9 additions & 0 deletions gitnexus/src/cli/index.ts
@@ -83,6 +83,15 @@ program
  .option('--all', 'Clean all indexed repos')
  .action(createLazyAction(() => import('./clean.js'), 'cleanCommand'));

program
  .command('remove <target>')
  .description(
    'Delete the GitNexus index for a registered repo (by alias, name, or absolute path). ' +
      'Unlike `clean`, does not require being inside the repo. Idempotent on unknown targets.',
  )
  .option('-f, --force', 'Skip confirmation prompt')
  .action(createLazyAction(() => import('./remove.js'), 'removeCommand'));

program
  .command('wiki [path]')
  .description('Generate repository wiki from knowledge graph')
110 changes: 110 additions & 0 deletions gitnexus/src/cli/remove.ts
@@ -0,0 +1,110 @@
/**
* Remove Command (#664)
*
* Delete the `.gitnexus/` index for a registered repo and unregister it
* from the global registry (~/.gitnexus/registry.json). The target is
* identified by alias / basename-derived name / remote-inferred name /
* absolute path — no `--repo` flag, just a positional argument so the
* destructive-command ergonomics match `clean` (which is also
* destructive but scoped to `process.cwd()`).
*
* Compared to `clean`:
* - `clean` acts on the repo discovered by walking up from cwd.
* - `remove` acts on any registered repo identified by name or path.
*
* Behaviour notes:
* - Idempotent on unknown targets: exits 0 with a warning so that
* `remove X && analyze Y` keeps working in scripts. Per #664:
* "behave atomically and idempotently so retries are safe".
* - Atomic order mirrors `clean`: fs.rm FIRST, then unregister. A
* partial failure leaves the registry pointing at a missing dir
* (recoverable by `listRegisteredRepos({ validate: true })` on
* next read) rather than the opposite, which would orphan
* .gitnexus/ directories on disk.
* - `-f` / `--force` matches the confirmation-skip semantics of
* `clean -f`. (Distinct from `analyze --force`, which re-indexes;
* here there is no pipeline, so no conflation.)
*/

import fs from 'fs/promises';
import {
  readRegistry,
  resolveRegistryEntry,
  assertSafeStoragePath,
  unregisterRepo,
  RegistryNotFoundError,
  RegistryAmbiguousTargetError,
  UnsafeStoragePathError,
} from '../storage/repo-manager.js';

export const removeCommand = async (target: string, options?: { force?: boolean }) => {
  // Read the registry snapshot once and pass it to the resolver, which
  // lets us render the "before" state in the dry-run path without a
  // second disk read.
  const entries = await readRegistry();

  let entry;
  try {
    entry = resolveRegistryEntry(entries, target);
  } catch (err) {
    if (err instanceof RegistryNotFoundError) {
      // Idempotent: missing target is a no-op warning, not an error.
      // The `availableNames` hint comes from the error itself so users
      // can see what they might have meant.
      console.warn(`Nothing to remove: ${err.message}`);
      return;
    }
    if (err instanceof RegistryAmbiguousTargetError) {
      // Duplicate aliases are allowed via --allow-duplicate-name (#829);
      // refuse to guess which one the user meant: surface the full list
      // and exit non-zero so scripts don't silently pick the wrong repo.
      console.error(`Error: ${err.message}`);
      process.exit(1);
    }
    throw err;
  }

  // Confirmation gate, same shape as `clean`. Default is a dry-run
  // that describes what would be deleted; `--force` actually deletes.
  if (!options?.force) {
    console.log(`This will delete the GitNexus index for: ${entry.name}`);
    console.log(` Path: ${entry.path}`);
    console.log(` Storage: ${entry.storagePath}`);
    console.log('\nRun with --force to confirm deletion.');
    return;
  }

  // Safety guard (#1003 review, @magyargergo): refuse to proceed if
  // the registry entry's `storagePath` isn't the canonical
  // `<entry.path>/.gitnexus` subfolder. `~/.gitnexus/registry.json` is
  // user-writable, so a corrupted or hand-edited entry could point
  // storagePath at the repo root, an empty string (→ cwd), a parent
  // dir, or anywhere else; `fs.rm(recursive: true, force: true)` on
  // any of those would be a runtime disaster. Bail before touching
  // disk, with an actionable hint for recovering a broken registry.
  try {
    assertSafeStoragePath(entry);
  } catch (err) {
    if (err instanceof UnsafeStoragePathError) {
      console.error(`Error: ${err.message}`);
      process.exit(1);
    }
    throw err;
  }

  // Deletion order: fs.rm first, then unregister. If fs.rm fails mid-way,
  // the registry entry stays so the user can retry. If fs.rm succeeds but
  // unregister throws (e.g. ENOSPC on registry write), the entry becomes
  // orphaned; `listRegisteredRepos({ validate: true })` prunes those on
  // next read, so the failure is self-healing.
  try {
    await fs.rm(entry.storagePath, { recursive: true, force: true });
    await unregisterRepo(entry.path);
    console.log(`Removed: ${entry.name}`);
    console.log(` Path: ${entry.path}`);
    console.log(` Storage: ${entry.storagePath}`);
  } catch (err) {
    console.error(`Failed to remove ${entry.name}:`, err);
    process.exit(1);
  }
Comment on lines +100 to +109

Collaborator

Please be very careful here! We don't want to remove the physical path of the code base. We can safely remove files/folders in the .gitnexus folder, but anything outside it is prohibited. Please introduce safeguards here.

Contributor Author

Good catch, thank you. Addressed in 610ee9b9 with a guard that blocks destructive fs.rm whenever the registry entry's storagePath isn't the canonical <entry.path>/.gitnexus subfolder.

The threat model

~/.gitnexus/registry.json is a plain-text user-writable file. A corrupted or hand-edited entry could plausibly end up with:

  • storagePath === entry.path (the repo root → catastrophic: fs.rm recursively wipes the working tree)
  • storagePath === "" (path.resolve resolves to cwd → rm cwd)
  • storagePath pointing at a parent dir, sibling dir, or anywhere else

fs.rm(recursive: true, force: true) on any of those is a runtime disaster.
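
To make the failure modes above concrete, here is a minimal sketch of what a bad storagePath names once it is resolved to an absolute path. The helper `resolvedDeletionTarget` and the `my-repo` directory are hypothetical names used only for illustration; the sketch relies solely on Node's documented `path.resolve` behaviour and does not touch the disk.

```typescript
import * as path from 'node:path';

// Hypothetical helper (not part of the PR): the absolute path that a
// storagePath value names once resolved against the working directory.
function resolvedDeletionTarget(storagePath: string): string {
  return path.resolve(storagePath);
}

// An empty storagePath resolves to the current working directory.
const emptyTarget = resolvedDeletionTarget('');
console.log(emptyTarget === process.cwd()); // true

// A storagePath equal to the repo root names the whole working tree,
// not the `.gitnexus` subfolder inside it. 'my-repo' is illustrative.
const repoRoot = path.resolve('my-repo');
const canonical = path.join(repoRoot, '.gitnexus');
console.log(resolvedDeletionTarget(repoRoot) === canonical); // false
```

In both cases the resolved target is a directory the user cares about, which is why a guard has to run before any recursive delete.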

Fix shape

New UnsafeStoragePathError + exported assertSafeStoragePath() in repo-manager.ts. Pure lexical string check (Windows case-insensitive) asserting entry.storagePath === path.join(entry.path, '.gitnexus'). Does NOT depend on the paths existing — it's a structural integrity check on the registry, not a filesystem probe.
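
A check of the kind described above might look roughly like the following. This is a sketch under stated assumptions, not the code from repo-manager.ts: the `RegistryEntrySketch` shape, the `...Sketch` names, the error payload fields, and the explicit `platform` parameter are all hypothetical, introduced so the example can exercise both POSIX and Windows rules on any machine.

```typescript
import * as path from 'node:path';

// Hypothetical entry shape: only the fields this thread mentions.
interface RegistryEntrySketch {
  name: string;
  path: string;
  storagePath: string;
}

// Stand-in for the PR's UnsafeStoragePathError; the real payload may differ.
class UnsafeStoragePathErrorSketch extends Error {
  readonly expected: string;
  readonly actual: string;

  constructor(expected: string, actual: string) {
    super(`storagePath "${actual}" is not the expected "${expected}"`);
    this.name = 'UnsafeStoragePathErrorSketch';
    this.expected = expected;
    this.actual = actual;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type PlatformSketch = 'posix' | 'win32';

// Purely lexical: nothing here reads the filesystem.
function assertSafeStoragePathSketch(
  entry: RegistryEntrySketch,
  platform: PlatformSketch = process.platform === 'win32' ? 'win32' : 'posix',
): void {
  const impl = platform === 'win32' ? path.win32 : path.posix;
  const expected = impl.join(entry.path, '.gitnexus');
  // Fold case on Windows only, where paths compare case-insensitively.
  const fold = (p: string): string => (platform === 'win32' ? p.toLowerCase() : p);
  if (fold(entry.storagePath) !== fold(expected)) {
    throw new UnsafeStoragePathErrorSketch(expected, entry.storagePath);
  }
}

// Convenience wrapper for the usage lines below.
function isSafeSketch(entry: RegistryEntrySketch, platform?: PlatformSketch): boolean {
  try {
    assertSafeStoragePathSketch(entry, platform);
    return true;
  } catch (err) {
    if (err instanceof UnsafeStoragePathErrorSketch) return false;
    throw err;
  }
}

const good = { name: 'app', path: '/repos/app', storagePath: '/repos/app/.gitnexus' };
const rootAsStorage = { ...good, storagePath: '/repos/app' };
const emptyStorage = { ...good, storagePath: '' };
const windowsMixedCase = {
  name: 'app',
  path: 'C:\\Repos\\App',
  storagePath: 'c:\\repos\\app\\.gitnexus',
};

console.log(isSafeSketch(good, 'posix')); // true
console.log(isSafeSketch(rootAsStorage, 'posix')); // false
console.log(isSafeSketch(emptyStorage, 'posix')); // false
console.log(isSafeSketch(windowsMixedCase, 'win32')); // true
```

Because the comparison is a string equality against a recomputed path, the sketch rejects the empty string and the repo root without needing either path to exist, which matches the "structural integrity check" framing above.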

Audit & sibling fix

Before committing I audited every fs.rm(...storagePath...) site in the codebase:

| Site | Source of storagePath | Safety |
| --- | --- | --- |
| remove.ts | entry.storagePath from registry | guarded |
| clean.ts --all | entry.storagePath from registry (same pattern) | guarded (found during the audit; same vulnerability, fixed in the same commit) |
| clean.ts default | findRepo(cwd) → lexically recomputed | ✅ safe by construction |
| server/api.ts | getStoragePath(entry.path) → lexically recomputed | ✅ safe by construction |

clean --all skips poisoned entries with a warning and continues with the rest of the batch — preserves its existing per-repo error-tolerance semantics (one bad entry doesn't halt cleanup).

Tests

  • 8 unit tests on the guard: valid <repo>/.gitnexus, repo-root-as-storage (catastrophic case), parent-as-storage, empty storage (→ cwd), totally-unrelated path, sibling .gitnexus (right basename, wrong parent), error payload shape, Windows case-insensitive acceptance.
  • 2 integration tests: each poisons a registry entry's storagePath to point at the repo root, runs the destructive command, and asserts (a) the exit code, (b) that the working tree + .git/ + .gitnexus/ all survive on disk, and (c) that the registry state is correct (poisoned entry retained, valid siblings cleaned). One test covers remove, one covers clean --all with a mixed good/bad registry.

Local verification

  • tsc --noEmit clean
  • repo-manager.test.ts 49/49
  • cli-e2e.test.ts -t "remove|clean --all" 4/4 (3 remove + 1 new clean --all poisoned)
  • Full unit suite 4177 pass + 2 pre-existing git-utils env failures (unchanged, unrelated)

CI on 610ee9b9 should light up green shortly.

};