Skip to content

fix(host-service): make v2 worktree removal idempotent#5075

Open
saddlepaddle wants to merge 1 commit into
mainfrom
fix/workspace-deletion-race
Open

fix(host-service): make v2 worktree removal idempotent#5075
saddlepaddle wants to merge 1 commit into
mainfrom
fix/workspace-deletion-race

Conversation

@saddlepaddle
Copy link
Copy Markdown
Collaborator

@saddlepaddle saddlepaddle commented Jun 3, 2026

Problem

v2 workspace deletion fails >50% of the time with:

Failed to delete <name>: Failed to remove worktree at <path>: fatal: '<path>' is not a working tree

Reported by Mike McQuaid — "feels like perhaps a race condition based on when I close windows."

Root cause

git worktree remove emits fatal: '<path>' is not a working tree when the path is no longer a registered worktree (its .git/worktrees/<id> metadata is gone) — even though the directory still exists on disk. I confirmed this against real git:

State git worktree remove --force --force
directory deleted, metadata intact succeeds (exit 0)
directory exists, metadata deregistered fatal: '…' is not a working tree (exit 128)

The destroy saga in workspace-cleanup.ts only treated !existsSync(path) as the success-equivalent fallback. In the deregistered-but-lingering case existsSync is true, so it re-threw — and since git can never re-remove an unregistered worktree, every retry failed permanently, producing the >50% rate.

How the worktree gets deregistered while its directory lingers: workspaces.create runs an unconditional git worktree prune on every call — including the idempotent "open"/refresh that fires on window events. When that prune races a partially-completed prior delete (directory left behind, or momentarily moved), it strips the metadata, and the workspace becomes permanently undeletable via the git path. This matches the "based on when I close windows" symptom.

Fix

Make worktree removal idempotent. When git worktree remove reports the path is not a working tree and the directory still exists, treat it as already-deregistered: git worktree prune any stale metadata, remove the leftover directory directly, and continue the deletion saga (cloud delete → branch delete → sqlite cleanup). The end state — worktree gone from git, directory gone from disk, cloud/sqlite rows gone — is exactly what deletion wants, so this converges to the correct result regardless of which path deregistered the worktree first.

Genuine failures (directory persists for any other reason — permissions, locked, broken repo) still throw as before.

Tests

  • Unit (workspace-cleanup.test.ts): git worktree remove throws "is not a working tree" with the directory present → asserts prune runs, the leftover directory is removed, and the delete proceeds to cloud delete + success.
  • Integration, real git (workspace-cleanup.integration.test.ts): prunes the worktree metadata while the directory is briefly moved aside, restores the directory to reproduce the exact deregistered-but-lingering state, asserts git no longer lists the path but it exists on disk, then asserts destroy removes everything cleanly with no warnings. This test fails against the old code.

All host-service tests pass; typecheck and biome clean.


Open in Stage

Summary by cubic

Make v2 workspace deletion idempotent to stop frequent failures when a Git worktree is deregistered but its directory still exists. This removes the >50% "is not a working tree" error caused by a race with git worktree prune.

  • Bug Fixes
    • On git worktree remove failure with "is not a working tree" and the directory still present, treat it as already deregistered.
    • Run git worktree prune, remove the leftover directory directly, then continue deletion (cloud, branch, DB).
    • Genuine failures still throw; added unit and real-git integration tests to cover this path.

Written for commit da27d3b. Summary will update on new commits.

Review in cubic

Summary by CodeRabbit

  • Bug Fixes

    • Improved workspace cleanup robustness to handle deregistered worktrees that linger on disk, allowing cleanup to complete successfully instead of failing.
  • Tests

    • Added integration and unit tests to verify graceful cleanup of orphaned worktree directories.

v2 workspace deletion failed >50% of the time with
`fatal: '<path>' is not a working tree`.

git emits that error from `git worktree remove` when the path is no
longer a registered worktree (its `.git/worktrees/<id>` metadata is
gone) even though the directory still exists on disk. The destroy saga's
fallback only treated `!existsSync(path)` as success, so this state
re-threw — and because git can never re-remove an unregistered worktree,
every retry failed permanently.

The metadata gets deregistered by the unconditional `git worktree prune`
that `workspaces.create` runs on every call (including the idempotent
"open"/refresh fired on window events) racing a partially-completed
prior delete that left the directory behind. That matches the reported
"feels like a race condition based on when I close windows."

Fix: when `git worktree remove` reports the path is not a working tree
and the directory still exists, treat it as already-deregistered —
`git worktree prune` any stale metadata, then remove the leftover
directory directly and continue the saga. Genuine failures (the
directory persists for any other reason) still throw.

Regression tests: a unit test for the deregistered-but-lingering branch,
and a real-git integration test that prunes the metadata while the
directory lingers and asserts the delete now completes cleanly.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 3, 2026

📝 Walkthrough

Walkthrough

The PR adds graceful recovery for workspace cleanup when git worktree removal fails due to missing worktree metadata. When git worktree remove encounters a "not a working tree" error, the routine now calls git worktree prune and forcibly removes the lingering directory recursively. Implementation, unit tests, and integration tests are added together.

Changes

Deregistered Worktree Recovery in Workspace Cleanup

Layer / File(s) Summary
Error handling contract
packages/host-service/src/trpc/router/workspace-cleanup/workspace-cleanup.ts
Introduces a NOT_A_WORKING_TREE regex constant to identify the specific failure when worktree metadata is missing or deregistered.
Deregistered worktree cleanup implementation
packages/host-service/src/trpc/router/workspace-cleanup/workspace-cleanup.ts
Extends the worktree removal error handler: when the "not a working tree" error is detected, calls git worktree prune (best-effort) and then recursively removes the leftover directory via rm. Adds rm import from node:fs/promises to enable filesystem deletion.
Unit test harness updates
packages/host-service/test/workspace-cleanup.test.ts
Extends ContextSpec with an optional worktreePrune callback and wires it into the mocked git.raw handler for ["worktree","prune"]. Adds existsSync import for filesystem assertions.
Unit test for deregistered worktree scenario
packages/host-service/test/workspace-cleanup.test.ts
Tests the cleanup ordering when worktree removal fails: validates that destroy still succeeds, cloud deletion is called exactly once, prune is performed exactly once, and the leftover directory is removed from disk.
Integration test for deregistered worktree scenario
packages/host-service/test/integration/workspace-cleanup.integration.test.ts
End-to-end test simulating the lingering-directory scenario by temporarily renaming the worktree, pruning, and restoring it; validates complete cleanup including disk removal, branch deletion, database row clearing, and cloud API invocation.

Sequence Diagram

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • superset-sh/superset#4801: Updates the same workspace cleanup destroy workflow and associated tests to handle worktree/local cleanup edge cases before cloud deletion.
  • superset-sh/superset#4693: Modifies the same git worktree remove error handling in workspace-cleanup.ts to treat specific failure modes as non-fatal with recovery logic (locked+missing scenario).

Poem

🐰 A worktree lost its way one day,
The metadata vanished—oh what dismay!
But now we prune and gently sweep,
Remove what lingers, dark and deep,
Your workspace cleanup's robust and spry! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the primary change: making v2 worktree removal idempotent to fix >50% deletion failures.
Description check ✅ Passed The description comprehensively covers Problem, Root Cause, Fix, Tests, and provides clear context; exceeds the template requirements.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/workspace-deletion-race

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@stage-review
Copy link
Copy Markdown

stage-review Bot commented Jun 3, 2026

Ready to review this PR? Stage has broken it down into 3 individual chapters for you:

Title
1 Implement idempotent worktree removal
2 Add unit tests for idempotent cleanup
3 Add integration test with real git
Open in Stage

Chapters generated by Stage for commit da27d3b on Jun 3, 2026 3:24pm UTC.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/host-service/src/trpc/router/workspace-cleanup/workspace-cleanup.ts (1)

328-354: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Harden NOT_A_WORKING_TREE matching against localized git error text (packages/host-service/src/trpc/router/workspace-cleanup/workspace-cleanup.ts, lines 328-354)

The recovery path matches err.message against NOT_A_WORKING_TREE = /is not a working tree/i. Git diagnostic strings can be translated via gettext, and the git execution path here doesn’t pin LC_ALL/LANG/LC_MESSAGES (credential providers only set GIT_TERMINAL_PROMPT/GIT_ASKPASS, and shared simple-git options only set “unsafe” flags). If the message is localized, the regex won’t match and you’ll fall back to the INTERNAL_SERVER_ERROR throw instead of pruning/removing the leftover worktree.

Set a stable locale (e.g. run the worktree remove/recovery call with LC_ALL=C via git.env({ ... , LC_ALL: "C" }), or equivalent) so the recovery doesn’t depend on runtime locale.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5d219e0f-d16b-4c43-b463-864b9be412a8

📥 Commits

Reviewing files that changed from the base of the PR and between 30ff989 and da27d3b.

📒 Files selected for processing (3)
  • packages/host-service/src/trpc/router/workspace-cleanup/workspace-cleanup.ts
  • packages/host-service/test/integration/workspace-cleanup.integration.test.ts
  • packages/host-service/test/workspace-cleanup.test.ts

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 3, 2026

🚀 Preview Deployment

🔗 Preview Links

Service Status Link
Neon Database (Neon) View Branch
Vercel API (Vercel) Open Preview
Vercel Web (Vercel) Open Preview
Vercel Marketing (Vercel) Open Preview
Vercel Admin (Vercel) Open Preview
Vercel Docs (Vercel) Open Preview

Preview updates automatically with new commits

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 3 files

Architecture diagram
sequenceDiagram
    participant UI as Client / Window
    participant API as TRPC Router
    participant Cleanup as workspace-cleanup.ts runDestroy
    participant Git as git CLI (worktree)
    participant FS as File System
    participant DB as SQLite (workspaces)
    participant Cloud as Cloud API (v2Workspace.delete)

    Note over UI,Cloud: v2 Workspace Deletion Flow

    UI->>API: mutate(workspaceCleanup.destroy)
    API->>Cleanup: runDestroy(workspaceId)

    Cleanup->>Git: raw(["worktree", "remove", path])
    alt Worktree remove succeeds (normal case)
        Git-->>Cleanup: exit 0
        Cleanup->>FS: directory already gone or git removed it
        Cleanup->>Cloud: mutate(v2Workspace.delete.mutate)
        Cloud-->>Cleanup: result
        Cleanup->>DB: delete workspace row
        Cleanup-->>API: success
        API-->>UI: { success: true }
    else Git throws "is not a working tree" (deregistered + directory lingers)
        Git-->>Cleanup: exit 128, error message
        Cleanup->>FS: existsSync(path) = true
        alt Message matches /is not a working tree/i
            Cleanup->>Git: raw(["worktree", "prune"])
            Git-->>Cleanup: ok (stale metadata removed)
            Cleanup->>FS: rm(path, { recursive: true, force: true })
            alt rm succeeds
                Cleanup->>Cloud: mutate(v2Workspace.delete.mutate) — continues saga
                Cloud-->>Cleanup: result
                Cleanup->>DB: delete workspace row
                Cleanup-->>API: success
                API-->>UI: { success: true }
            else rm fails (permissions, locked, etc.)
                Cleanup-->>API: TRPCError INTERNAL_SERVER_ERROR
                API-->>UI: error
            end
        else Other git error (genuine failure, e.g. permissions)
            Cleanup-->>API: TRPCError INTERNAL_SERVER_ERROR
            API-->>UI: error
        end
    else Git throws but directory already gone
        Git-->>Cleanup: exit error
        Cleanup->>FS: existsSync(path) = false
        Cleanup-->>Cleanup: worktreeRemoved = true
        Cleanup->>Cloud: mutate(v2Workspace.delete.mutate)
        Cloud-->>Cleanup: result
        Cleanup->>DB: delete workspace row
        Cleanup-->>API: success
        API-->>UI: { success: true }
    end
Loading

Re-trigger cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant