Skip to content

fix: fail local hatch if base data directory already exists#9322

Merged
dvargasfuertes merged 1 commit into
mainfrom
check-base-data-dir-conflict
Feb 26, 2026
Merged

fix: fail local hatch if base data directory already exists#9322
dvargasfuertes merged 1 commit into
mainfrom
check-base-data-dir-conflict

Conversation

@dvargasfuertes
Copy link
Copy Markdown
Contributor

@dvargasfuertes dvargasfuertes commented Feb 26, 2026

Summary

  • Checks if the .vellum base data directory already exists before starting a local hatch
  • Fails early with a clear error message if the directory is taken, preventing accidental data overwrites
  • Suggests setting BASE_DATA_DIR to use a different directory
  • Moved baseDataDir computation earlier in hatchLocal() to enable the check before starting the daemon/gateway

Test plan

  • All 35 CLI tests pass
  • vellum hatch fails with clear error when ~/.vellum already exists
  • vellum hatch succeeds on fresh machine with no ~/.vellum
  • BASE_DATA_DIR=/tmp/alt vellum hatch works when ~/.vellum exists but /tmp/alt/.vellum doesn't

🤖 Generated with Claude Code


Open with Devin

Prevents accidentally overwriting an existing assistant's data by
checking if the .vellum directory exists before proceeding with hatch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dvargasfuertes dvargasfuertes merged commit 33654b2 into main Feb 26, 2026
1 check passed
@dvargasfuertes dvargasfuertes deleted the check-base-data-dir-conflict branch February 26, 2026 01:33
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1ac1f8f206

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread cli/src/commands/hatch.ts
Comment on lines +543 to +544
if (existsSync(baseDataDir)) {
throw new Error(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Narrow base-data-dir existence check before local hatch

This check fails hatch local whenever ~/.vellum already exists, but that directory is also created by unrelated setup paths (for example vellum config set creates ~/.vellum/workspace/config.json in cli/src/commands/config.ts), so users can be blocked from hatching even when no local assistant is running. In practice, a documented pre-hatch config step now causes a hard failure, so the guard should detect active/conflicting local runtime state rather than treating any preexisting directory as an overwrite risk.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 4 additional findings in Devin Review.

Open in Devin Review

Comment thread cli/src/commands/hatch.ts
Comment on lines +543 to +549
if (existsSync(baseDataDir)) {
throw new Error(
`Base data directory already exists: ${baseDataDir}\n` +
" Another assistant may already be using this directory.\n" +
" To use a different directory, set the BASE_DATA_DIR environment variable.",
);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 existsSync check on baseDataDir blocks re-hatching after any failed hatch or incomplete cleanup

The new existsSync(baseDataDir) check at line 543 throws an error if ~/.vellum already exists, but this directory is routinely created by the daemon during startLocalDaemon() and is NOT removed by stopLocalProcesses(). This makes vellum hatch permanently broken after any failed hatch attempt.

Detailed failure scenarios and root cause

Scenario 1: Failed hatch recovery is impossible

If a previous hatchLocal() call succeeds at startLocalDaemon() (line 555) but fails at startGateway() (line 559), the error handler at lines 560-565 calls stopLocalProcesses(), which only removes PID and socket files (cli/src/lib/process.ts:85-89) — it does NOT remove the ~/.vellum directory. On the next vellum hatch attempt, the existsSync(baseDataDir) check at line 543 finds the leftover directory and throws, preventing recovery without manual rm -rf ~/.vellum.

Scenario 2: Contradicts stale cleanup code above

The stale process cleanup at lines 529-539 explicitly handles the case where ~/.vellum exists with PID files but no lock file entries — it cleans up stale processes and continues. But the new check at line 543 then immediately throws because the directory still exists after cleanup. The stale cleanup code becomes dead code in practice.

Scenario 3: Desktop daemon creates the directory

In desktop mode, startLocalDaemon() calls mkdirSync(vellumDir, { recursive: true }) at cli/src/lib/local.ts:181. While this happens after the check, any previous daemon run would have created the directory, and stopLocalProcesses() doesn't remove it.

Impact: vellum hatch --remote local is broken for any user who has previously attempted a local hatch that didn't complete cleanly, or whose ~/.vellum directory exists for any reason other than an active assistant. The only workaround is manual rm -rf ~/.vellum.

Prompt for agents
The existsSync check on the entire baseDataDir (~/.vellum) is too broad. This directory is used for PID files, socket files, and other runtime state that persists across hatch/retire cycles. The check should be more targeted — for example, checking for a specific sentinel file that indicates an active assistant is using the directory (e.g., a running daemon's PID file with a live process, or a specific lock/marker file), rather than checking for the directory's mere existence.

In cli/src/commands/hatch.ts, lines 543-549, replace the existsSync(baseDataDir) check with a more targeted check. Options include:
1. Check if there's an active daemon process using isProcessAlive() on the PID file inside baseDataDir
2. Check for a specific marker file that the daemon creates when actively using the directory
3. Remove the check entirely and rely on the existing stale cleanup logic at lines 529-539

The stale cleanup code at lines 529-539 already handles the case of leftover processes with no lock file entries, so the directory existence check is redundant and harmful.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

noanflaherty pushed a commit that referenced this pull request Feb 26, 2026
Prevents accidentally overwriting an existing assistant's data by
checking if the .vellum directory exists before proceeding with hatch.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
noanflaherty added a commit that referenced this pull request Feb 26, 2026
…canonical send to selectThread (#9361)

* fix(macos): add disabled guard to SwiftUI conversation zoom commands (#9310)

The SwiftUI CommandGroup buttons bypass AppKit's validateMenuItem callback,
so conversation zoom shortcuts (Cmd+/−/0) were always active even on
non-conversation panels. Add .disabled() modifiers matching the existing
isConversationVisible guard.

Co-authored-by: Claude <noreply@anthropic.com>

* feat: reskin Mobile card to match Guardian Verification visual pattern (#9311)

Restructure the Mobile settings card so it uses the same row-based layout
as the Telegram, SMS, and Phone Calling cards:

- Connected devices now display as channelStatusRow entries (icon + name +
  Remove button) instead of the old custom list with bordered buttons
- Device Pairing row uses the 140px guardian-label layout with info tooltip,
  mirroring the Guardian Verification rows in other cards
- Status warnings (no gateway, no token, regenerating) display inline in
  the pairing row rather than as standalone blocks below the button
- "Show QR Code" button renamed to "Pair Device" for consistency
- Removed formattedDeviceDate helper (no longer needed)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: preserve text in stripHeavyContent, add early-return guard (#9313)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: guard trimForBackground against in-flight pagination (#9314)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: omit textSegments when truncated, handle legacy tool results in rehydrate (#9315)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: restore Button semantics for heartbeat run row expansion (#9304)

Replace onTapGesture with Button + contentShape(Rectangle()) to
preserve keyboard activation and VoiceOver accessibility while
keeping the full-row tap target.

* fix: clear isContentStripped after rehydration for re-trim eligibility (#9317)

Co-authored-by: Claude <noreply@anthropic.com>

* fix(macos): use macOS 14-compatible SF Symbol for refresh buttons (#9318)

Replace "arrow.trianglehead.2.counterclockwise" (SF Symbols 6 / macOS 15+)
with "arrow.triangle.2.circlepath" (SF Symbols 1 / macOS 11+) so the
refresh icons on Platform, Gateway, and Tunnel status rows actually render
on macOS 14 (Sonoma).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: positional tool call matching, cache invalidation, contentOrder fix in rehydrate (#9319)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: same-timestamp getNextMessage + separate text truncation flag (#9320)

Co-authored-by: Claude <noreply@anthropic.com>

* fix(macos): improve refresh button UX on Connect status rows (#9321)

- Move refresh icon next to status text instead of right-aligned
- Remove bounce animation when spin stops (use .identity instead of .default)
- Add 0.5s minimum spin duration so the icon always visibly spins

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix(macos): simplify Platform connection error to "Could not connect" (#9324)

Replace verbose error.localizedDescription (e.g. "Could not connect to
the server.") with a concise "Could not connect" in the Platform health
check on the Connect tab.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: fail local hatch if base data directory already exists (#9322)

Prevents accidentally overwriting an existing assistant's data by
checking if the .vellum directory exists before proceeding with hatch.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: populate cachedInputFull on expand via onChange to prevent flash (#9325)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: proper compound cursor for deterministic pagination (#9326)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: prevent pagination timeout from misclassifying late responses (#9327)

Co-authored-by: Claude <noreply@anthropic.com>

* fix(macos): simplify Telegram guardian verification help text into single line (#9328)

Co-authored-by: Claude <noreply@anthropic.com>

* fix(macos): resolve two build errors on main (#9329)

1. Replace .identity animation (unavailable in current Swift) with .default
2. Add public isConversationZoomEnabled accessor so the executable target
   can check conversation visibility without accessing internal members

Co-authored-by: Claude <noreply@anthropic.com>

* fix: remove unused parseFrontmatter function in CLI skills command (#9330)

Co-authored-by: Claude <noreply@anthropic.com>

* feat: add vellum login/logout/whoami commands (#9178)

* feat: add vellum login/logout/whoami commands

Add platform authentication commands to the CLI:
- `vellum login --token <token>` validates and stores a platform session token
- `vellum logout` clears the stored token
- `vellum whoami` shows the currently authenticated user

Token is stored at ~/.vellum/platform-token (same path used by the assistant
daemon), enabling `vel import` to filter assistants by the logged-in user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use allauth session endpoint instead of removed /v1/users/

The UserViewSet at /v1/users/ was removed from the platform. Switch to
the allauth headless session endpoint (/_allauth/app/v1/auth/session)
which returns user info via X-Session-Token auth.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: respect BASE_DATA_DIR and enforce 0600 on existing token files

- Derive platform token path from BASE_DATA_DIR (matching the assistant
  daemon's getRootDir() behavior) instead of hardcoding homedir()
- Call chmodSync after writeFileSync to ensure 0600 permissions even
  when the file already exists (writeFileSync mode only applies on creation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix(macos): fix build errors from cross-module access and invalid Animation member (#9332)

MainWindow and MainWindowState need public access since they're accessed
from the vellum-assistant-app executable target. Also fix .identity
(not a valid Animation member) to .default in SettingsConnectTab.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: outbound guardian verification via code entry in Settings UI (#9333)

Add submit_outbound_code guardian action so users receive a verification
code via SMS/Telegram/voice call and enter it in the macOS Settings UI
instead of displaying a code to copy. Voice calls now speak the code
(no DTMF required). Existing /guardian_verify command fallback preserved.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* revert #9333 + rework guardian: show code in UI, reply in channel (#9334)

* Revert "feat: outbound guardian verification via code entry in Settings UI (#9333)"

This reverts commit b651371.

* revert #9333 + rework guardian verification: show code in UI, user replies in channel

Reverts the "enter code in UI" approach from #9333 and instead:
- SMS/Telegram outbound messages no longer include the verification code
- Messages ask the user to reply with the code shown in the Vellum app
- Voice verification is unchanged (DTMF via phone call)
- The code is only visible in the trusted Settings UI, not sent over the channel

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Changes needed for Vellum Assistant on Mac Mini (#9335)

* feat: accept bare verification codes and simplify guardian verify copy (#9336)

- Accept bare hex/numeric codes as verification replies (no /guardian_verify prefix needed)
- Simplify channel message copy to 'Reply with the code you were given'
- Remove expiry time commentary from verification messages
- Remove guardian_verify from Telegram bot commands
- Update skills and architecture docs to match new flow

Co-authored-by: Claude <noreply@anthropic.com>

* feat: use 6-digit codes for all channels and update verification copy (#9338)

- All channels now use 6-digit numeric verification codes (was 64-char hex for SMS/Telegram)
- SMS/Telegram: 'Vellum assistant guardian verification requested. Reply with the 6-digit code you were given.'
- Voice: 'You are receiving a guardian verification call for your Vellum assistant. Please enter your 6-digit verification code using your keypad.'
- Dropped per-assistant name prefix from templates (now always says Vellum assistant)

Co-authored-by: Claude <noreply@anthropic.com>

* docs: update skills and architecture for 6-digit verification codes (#9339)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: prevent voice/phone call threads from leaking into desktop side nav (#9340)

Voice-bound conversations were appearing in the macOS desktop thread list
after session restore because the filter explicitly allowed sourceChannel=voice.
Guardian verification calls also leaked because they had no channel binding.
Pointer messages set userMessageChannel=voice which contaminated origin channel.

- Remove voice exception from desktop thread filters (restore + pagination)
- Add voice channel binding to guardian verification conversations
- Remove misleading "See voice thread" pointer text and channel metadata

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* docs: align guardian verification copy with create_challenge flow (#9341)

* feat: install CLI symlink during desktop hatch and add --version flag (#9342)

* feat: install CLI symlink during desktop hatch and add --version flag

Creates /usr/local/bin/vellum symlink pointing to the bundled vellum-cli
binary during hatch when VELLUM_DESKTOP_APP is set. This works for both
dev mode (DerivedData) and production (/Applications) builds, complementing
the existing AppDelegate symlink which skips dev mode.

Also adds --version/-v flag support to the CLI entry point.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use lstatSync to handle dangling symlinks during CLI symlink install

existsSync follows symlinks and returns false for broken links, causing
symlinkSync to throw EEXIST on stale symlinks. Now uses lstatSync to
detect and clean up dangling symlinks correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add non-null assertion for Drizzle or() return type in conversation-store (#9351)

Co-authored-by: Claude <noreply@anthropic.com>

* fix: remove unused existsSync import in CLI gcp.ts (#9352)

Co-authored-by: Claude <noreply@anthropic.com>

* fix(notifications): deduplicate conversation-viewed events by moving canonical send to selectThread

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Ashlee Radka <ashleeradka@gmail.com>
Co-authored-by: V <vincent@vellum.ai>
Co-authored-by: David Vargas Fuertes <vargas@vellum.ai>
Co-authored-by: siddseethepalli <siddseethepalli@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant