QA Video Automation: Recording, Finalization & File-backed Attachments by Jasonnnz · Pull Request #6935 · vellum-ai/vellum-assistant

Jasonnnz · 2026-02-23T16:53:07Z

Summary

Enables Velly to run QA/test workflows through computer-use sessions, record the screen, and return a playable + draggable video in chat. Adds QA intent detection, IPC contract extensions, daemon-side finalization handling with summary + attachment injection, macOS screen recording lifecycle, file-backed attachment storage with HTTP range-request streaming, chat video playback with drag-to-Finder support, and configurable retention cleanup.

Changes

QA Intent Detection: detectQaIntent() pattern-matches QA/test commands and natural-language requests (narrowed to avoid false positives on common words like "check")
IPC Contract Extensions: reportToSessionId, qaMode on CuSessionCreate and TaskRouted; conversationId on TaskSubmit; retentionDays, captureScope, includeAudio on TaskRouted; new CuSessionFinalized message with recording metadata
Daemon Routing: Task routing and escalation handlers detect QA intent, set qaMode + reportToSessionId, pass retention/capture config
Client QA Wiring: AppDelegate+Sessions.swift reads qaMode from task_routed, creates ScreenRecorder, passes QA params + retention + capture config to ComputerUseSession; sends conversationId in task_submit
Daemon Finalization Handler: handleCuSessionFinalized creates file-backed attachment (always, for cleanup), injects summary + attachment into source chat (when reportToSessionId present). Uses recordingTracked flag to catch all edge cases.
File-backed Attachment Schema: New storageKind, filePath, sha256, expiresAt columns; createFileBackedAttachment() for large files; backward-compatible
Content Streaming Endpoint: GET /v1/attachments/:id/content with RFC 7233 range request support
macOS Recording Lifecycle: ScreenRecorder using ScreenCaptureKit; integrated into Session.swift; configurable retention + capture scope + audio from daemon config
Capture Scope Resolution: When captureScope is "window", resolves topmost non-self CGWindowID (excludes own PID + filters to layer 0); when "display", passes CGMainDisplayID(). Falls back gracefully.
Chat Video UX: InlineVideoAttachmentView fetches from /content for file-backed attachments; .onDrag for drag-to-Finder
Retention Cleanup: Configurable qaRecording.defaultRetentionDays; background cleanup task; orphan recordings tracked in DB
Cancel Safety: Deferred abort with safety-net timer; timer properly disarmed via Task.isCancelled guard
Metadata Lifecycle: Socket close handler owns socketToCuSession cleanup; targeted cleanup on session replacement
Recording Fallback: recordingTracked flag ensures all untracked recordings get DB entries (missing conversation, empty summary, etc.)

Milestone PRs

M1: IPC Contract Extensions + Swift Codegen #6907: M1: IPC Contract Extensions + Swift Codegen
M2: Daemon Finalization Handler + CU Metadata Plumbing #6909: M2: Daemon Finalization Handler + CU Metadata Plumbing
M3: File-backed Attachment Schema + Content Streaming Endpoint #6911: M3: File-backed Attachment Schema + Content Streaming Endpoint
M4: macOS Recording Lifecycle + Chat Video UX #6916: M4: macOS Recording Lifecycle + Chat Video UX
M5: Retention Cleanup + Tests + Documentation #6919: M5: Retention Cleanup + Tests + Documentation

Fix PRs

Fix M1 review feedback: Swift init + codegen INT_PATTERNS #6910: Fix M1 review: Swift init + codegen INT_PATTERNS
Fix M2 + M1-fix review feedback: metadata race + latencyMs type #6912: Fix M2 + M1-fix review: metadata race + latencyMs type
Fix M3 feedback: legacy migration + RFC 7233 range clamping #6913: Fix M3 review: legacy migration + RFC 7233 range clamping
Fix M4 feedback: finalize ordering + recorder defer cleanup #6920: Fix M4 review: finalize ordering + recorder defer cleanup
Fix feedback: metadata leak on disconnect + storage_kind NOT NULL #6924: Fix: metadata leak on disconnect + storage_kind NOT NULL
Fix M5 feedback: bound cleanup interval to prevent overflow #6925: Fix M5 review: bound cleanup interval to prevent overflow
Fix cancel ordering: send abort after finalization for QA sessions #6926: Fix cancel ordering: send abort after finalization for QA sessions
Fix metadata race: let socket close handler own socketToCuSession cleanup #6931: Fix metadata race: socket close handler owns socketToCuSession cleanup
Fix cancel safety: deferred abort fallback for QA sessions #6932: Fix cancel safety: deferred abort fallback for QA sessions
Fix session replacement: remove stale socket-to-session mapping #6933: Fix session replacement: remove stale socket-to-session mapping
Fix cancel race: disarm safety-net timer before QA finalization #6934: Fix cancel race: disarm safety-net timer before QA finalization
Wire QA intent detection in daemon routing + fix epoch units #6941: Wire QA intent detection + routing metadata + epoch fix
Create file-backed attachment on QA finalize + video /content streaming #6943: File-backed attachment on finalize + video /content streaming
Wire QA mode in macOS client: create recorder when daemon signals QA intent #6945: Wire QA mode in macOS client
Fix direct QA reporting + configurable retention + orphan recording tracking #6951: Direct QA reporting + configurable retention + orphan recording tracking
Fix ARCHITECTURE.md: clarify QA recording conditional paths #6952: Fix ARCHITECTURE.md QA recording data flow accuracy
fix: recording fallback tracks all untracked recordings #6960: Fix recording fallback: track all untracked recordings
feat: wire QA recording captureScope and includeAudio config #6967: Wire captureScope and includeAudio config to client
fix: apply captureScope to resolve actual window/display for recording #6968: Apply captureScope to resolve actual window/display for recording
fix: exclude own PID from window capture resolution #6975: Exclude own PID from window capture resolution
fix: narrow QA intent detection + fix safety-net timer race #6978: Narrow QA intent regex + fix safety-net timer race

Known limitations (deferred to M3/M4)

QA intent detection is regex-based — Works for direct commands but misses nuanced requests ("smoke test onboarding"). LLM-based semantic classification planned for M3.
Capture config is daemon-config-only, not per-request — captureScope and includeAudio always come from daemon config. Per-request overrides planned for M3.
Chat injection requires conversationId — If absent on task_submit, recording is tracked for cleanup but not injected into chat. Product decision deferred.
Saved QA scripts (M3) and scheduled self-QA (M4) remain unimplemented.

Project issue

Closes #6899

Test plan

Generated with Claude Code

Jasonnnz · 2026-02-23T18:21:32Z

Addressed review feedback in #6978:

QA intent false positives: Removed 'check' from direct prefix regex and natural language patterns — 'check my email', 'check the weather' no longer trigger QA mode.
Safety-net timer race: Added guard !Task.isCancelled to properly disarm the cancel safety-net when run() cancels it.
Already fixed in prior PRs: Recording orphaning (fix: recording fallback tracks all untracked recordings #6960), attachment persistence (Create file-backed attachment on QA finalize + video /content streaming #6943), socketToCuSession cleanup (Fix metadata race: let socket close handler own socketToCuSession cleanup #6931, Fix session replacement: remove stale socket-to-session mapping #6933), expiresAt epoch mismatch (Wire QA intent detection in daemon routing + fix epoch units #6941).

…Create (#6907) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

…bing (#6909) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

…#6910) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…nt (#6911) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

…PCMessages (#6912) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

…r RFC 7233 (#6913) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

… drag support (#6916) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

… QA video (#6919) Co-authored-by: Vellum Assistant <assistant@vellum.ai>

…eanup (#6920) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

… to storage_kind (#6924) Co-authored-by: Vellum Assistant <assistant@vellum.ai>

Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

…tion (#6926) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

…metadata leak (#6931) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

…7456) Two fixes in mdfindApp: 1. Register process.terminationHandler BEFORE calling process.run() to prevent a race where mdfind exits before the handler is set, causing withCheckedContinuation to never resume and hang forever. 2. Escape single quotes in app names before interpolating into the mdfind Spotlight query string, preventing query breakage for names like "Kate's App". Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

- Add word boundary (\b) before action verb group in contextPattern to prevent partial matches like "login" matching "in" or "domain" matching "in" - Switch Cursor from simple \bcursor\b to contextPattern('cursor') so "move cursor to the submit button" no longer triggers a match - Switch Zoom from simple \bzoom\b to contextPattern('zoom') so "zoom in on the chart" no longer triggers a match - Reorder iTerm before Terminal in APP_HINTS array so "open terminal in iterm2" resolves to iTerm instead of Terminal - Add regression tests for all five false-positive scenarios Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

When qaMode is true and the frontmost guard hits its 3-consecutive-block limit, finalizeQARecording() and CuSessionAbortMessage were sent inside handleAction, then sent AGAIN by the post-loop code in run(). This caused duplicate cu_session_finalized and cu_session_abort messages to the daemon. Add a didFinalizeQARecording boolean flag that is: - Declared as a private var on the session - Reset to false at the start of run() - Set to true at the end of finalizeQARecording() - Checked in the post-loop to skip redundant finalization and abort Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Move QA intent/opt-out detection in handleUserMessage to after the secret-ingress and queue-rejection checks. Previously, the latch was mutated before those guards, so a message blocked by checkIngressForSecrets or rejected by enqueueMessage would still flip the QA latch — causing incorrect qaMode/requiresRecording state for subsequent escalations. Also adds two contract tests documenting the invariant that rejected and blocked messages must not mutate the latch. Addresses review feedback from #7424. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

The canonical mapping ['gmail', 'mail'] incorrectly collapsed Gmail and Mail into the same canonical key. This caused taskExplicitlyRequestsCrossApp() to record only one app mention for tasks involving both Gmail and Mail, preventing the cross-app escape path from activating. Change the mapping to ['gmail', 'gmail'] so Gmail retains its own canonical identity, allowing cross-app workflows like "copy from Gmail into Mail" to be detected correctly. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

In cross-app workflows (e.g., "copy from Chrome and paste into Safari"), the targetAppBundleId was unconditionally injected when the LLM didn't provide one. This caused the target app's bundle ID to be attached to requests for a different app, leading to wrong app activation since bundle ID takes priority in openApp resolution. Now only injects targetAppBundleId when the requested app matches the target app (or no app name is specified). Addresses review feedback from #7434. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

The constructor log was deriving qaMode from requiresRecording, but these are independent IPC fields — a session can be in QA mode without requiring recording. Pass qaMode as a separate constructor parameter from msg.qaMode and log it directly. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Add Bundle.main.bundleIdentifier fallback in Session.swift's inline isExternalTarget check so SPM builds (where bundleIdentifier is nil) correctly identify Vellum-targeting sessions instead of always treating them as external. Extend isExternalAppTarget in AppDelegate+Sessions.swift to accept a name parameter alongside bundleId. When bundleId is nil but targetAppName is set (e.g., Slack, Chrome), the name-based check prevents Vellum from stealing focus. Both call sites now pass routed.targetAppName. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Process.arguments bypasses the shell, so the shell-style single-quote escaping ('\'') was passed raw to mdfind, breaking Spotlight query parsing. Switch to double-quoted query values with proper double-quote escaping instead. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…7465) The didFinalizeQARecording flag was only set at the very end of finalizeQARecording(), missing the early return path when recording was required but no artifact was produced. This could cause the post-loop code to call finalizeQARecording() again redundantly. Use a defer block at the top of the method to guarantee the flag is always set regardless of which return path is taken. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…7466) When targetAppBundleId is nil but targetAppName is set (e.g. "Chrome"), the inline isExternalTarget check incorrectly evaluated to false, causing the Chrome accessibility NSAlert to steal focus from the app under test. Now falls back to targetAppName matching, consistent with the external app detection logic in AppDelegate+Sessions.swift. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Addresses review feedback on PR #7464: the sanitization only escaped double quotes but not backslashes. Since Spotlight's query parser treats `\` as an escape character inside double-quoted strings, an app name containing a backslash would produce a malformed query. Escape backslashes before double quotes to ensure correct nesting. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

- Add strictVisualQa flag to CU session, IPC contract, and client - Block AppleScript (except target app activation) in strict mode - Block focus-stealing keys (Cmd+H/M/Tab) during focus-lock sessions - Reject completion if target app was never observed frontmost - Add first-frame handshake to ScreenRecorder for capture pipeline validation - Validate salvaged recordings for playability before tracking - Add screen recording permission gate with polling for QA sessions - Suppress title-bar minimize during focus-lock targeting Vellum - Improve activation: use activateIgnoringOtherApps, unhide, deminiaturize - Add requireExactAppMatch to ActionExecutor to prevent fuzzy matching - Tighten QA routing: require positive GUI cue, narrow overly broad patterns - Add clearAllQaLatches and call on session reset to prevent unbounded growth - Respect QA opt-out for latch-based activation in escalation handler - Add Restart menu item to menu bar - Broaden Vellum target-app hint to match bare "vellum" mention - Fix schema migration: storage_kind column to NOT NULL Co-Authored-By: Claude <noreply@anthropic.com>

devin-ai-integration

Devin Review found 1 new potential issue.

View 38 additional findings in Devin Review.

devin-ai-integration · 2026-02-24T05:13:08Z

+export const APP_HINTS: AppHintEntry[] = [
+  // Vellum (our app — highest priority)
+  {
+    patterns: [/\b(vellum|velly)\s+(desktop\s+)?app\b/, /\b(vellum|velly)\s+assistant\b/, /\bvellum\b/i],


🟡 Overly broad /\bvellum\b/i catch-all pattern causes false-positive target-app locking

The third pattern in the Vellum APP_HINTS entry (/\bvellum\b/i) matches any task text containing the standalone word "vellum", regardless of context. Because the Vellum entry is the first in the APP_HINTS array, it takes priority over all other app entries.

Root Cause & Impact

The function's own docstring states "This is intentionally conservative: only high-confidence patterns should lock the CU session to an app." However, /\bvellum\b/i is a bare-word match that fires on any mention of "vellum" — including non-app-reference contexts like:

"test Chrome's rendering of vellum.ai pages" → locks to Vellum instead of Chrome

"QA the login flow at vellum.ai in Safari" → locks to Vellum instead of Safari

"search for vellum paper types" → locks to Vellum (no app intended)

When the session is locked to the wrong app, the daemon-side isTargetAppMatch in assistant/src/daemon/computer-use-session.ts:685-706 blocks opening the intended app, and the client-side frontmost-app guard blocks actions when the correct app is in front. In strictVisualQa mode, this causes the session to fail immediately.

The other two patterns in the Vellum entry are properly contextualized (/\b(vellum|velly)\s+(desktop\s+)?app\b/ requires "app" after the name, and /\b(vellum|velly)\s+assistant\b/ requires "assistant"). Only the third pattern is problematic.

Suggested change

patterns: [/\b(vellum|velly)\s+(desktop\s+)?app\b/, /\b(vellum|velly)\s+assistant\b/, /\bvellum\b/i],

patterns: [/\b(vellum|velly)\s+(desktop\s+)?app\b/, /\b(vellum|velly)\s+assistant\b/, contextPattern('vellum')],

Was this helpful? React with 👍 or 👎 to provide feedback.

…mation # Conflicts: # assistant/src/config/schema.ts # assistant/src/daemon/ipc-contract-inventory.json # assistant/src/daemon/ipc-contract.ts # assistant/src/tools/assets/search.ts # clients/macos/vellum-assistant/ComputerUse/ActionExecutor.swift

- Add AXElementRegistry to map element IDs to AXUIElements during AX tree enumeration - Add AXActionExecutor with AX-native click/type/focus via kAXPressAction - Add FocusManager with multi-strategy retry (unhide, activate, AX raise) - Wire AX-first path in Session execute loop with CGEvent fallback - Make openApp fail-closed in strict QA mode via focusAcquireFailed error - Replace inline frontmost guard with FocusManager.acquireVerifiedFocus - Add post-open_app and post-action focus drift detection (terminal in strict mode) - Add FocusReliabilityTests covering registry, focus, executor, and strict QA Co-Authored-By: Claude <noreply@anthropic.com>

Add a bundled screen-recording SKILL.md that teaches the model about screen recording capabilities, lifecycle, retention, and limitations. Awareness-only — no tools, just contextual knowledge for the model. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Add a bundled qa-testing skill that composes screen-recording and media-processing via includes. This is an awareness-only skill (no TOOLS.json) that teaches the model about QA workflows, strict focus management, recording lifecycle, and post-session video analysis. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Instantiate ScreenRecorder when either qaMode or requiresRecording is true, allowing screen recording to be used independently of QA mode. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…8021) Add optional `requiresRecording` field to `TaskSubmit` IPC message type so callers can explicitly request screen recording without triggering QA mode. The daemon routing in misc.ts now uses `msg.requiresRecording` as an override, falling back to the existing QA-based computation when the field is not set. Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

- Add RUNTIME_HTTP_PORT to daemon env forwarding list in cli/local.ts - Remove localHttpEnabled feature flag gate on RUNTIME_HTTP_PORT in AppDelegate.swift - Always include RUNTIME_HTTP_PORT with default 7821 in AssistantCli.swift env Co-Authored-By: Claude <noreply@anthropic.com>

devin-ai-integration

Devin Review found 1 new potential issue.

View 48 additional findings in Devin Review.

devin-ai-integration · 2026-02-24T19:05:06Z

+            sizeBytes: msg.recording.sizeBytes,
+            filePath: msg.recording.localPath,


🔴 cu_session_finalized allows arbitrary file-path attachment creation and file exfiltration

handleCuSessionFinalized persists file-backed attachments using the client-supplied msg.recording.localPath directly, without validating that the path is inside the recordings directory or otherwise trusted. The /v1/attachments/:id/content route later serves file-backed attachments by reading attachment.filePath from disk (assistant/src/runtime/routes/attachment-routes.ts:168-232).

Root Cause

The handler writes attacker-controlled paths into attachment metadata:

Injection path: filePath: msg.recording.localPath when creating linked attachment (assistant/src/daemon/handlers/computer-use.ts:270-271)

Fallback path: same unvalidated assignment in orphan-tracking path (assistant/src/daemon/handlers/computer-use.ts:353-354)

Because content serving trusts DB filePath, any authenticated client that can send cu_session_finalized can point an attachment at arbitrary readable files (for example, /etc/passwd or user secrets) and then fetch them via the new content endpoint.

Actual behavior: daemon accepts arbitrary localPath and later streams that file.

Expected behavior: daemon should only accept canonicalized paths within the managed recordings directory (or require a server-generated handle), rejecting out-of-root or non-recording paths.

Impact: local file disclosure / privilege boundary bypass through the attachment-content API.

Prompt for agents

In assistant/src/daemon/handlers/computer-use.ts, harden handleCuSessionFinalized so msg.recording.localPath cannot reference arbitrary files. In both createFileBackedAttachment call sites (around lines 266-274 and 349-356), canonicalize and validate the path before use: resolve realpath (or normalized absolute path), ensure it is inside the expected recordings root (from platform/data-dir config, e.g. <AppSupport>/vellum-assistant/recordings), reject symlink escapes and traversal, and verify the file exists and is a regular file. If validation fails, log a security warning and skip attachment creation. Also consider adding a helper function in this file for path validation to avoid duplication and keep behavior consistent between the primary and fallback attachment paths. Finally, add tests covering path traversal and out-of-root rejection in assistant/src/__tests__/cu-session-finalized.test.ts and/or attachment-content route tests.

Was this helpful? React with 👍 or 👎 to provide feedback.

Jasonnnz · 2026-02-24T22:41:44Z

This feature branch/PR has been quite the hill climb. Havent been able to get QA use case across due to CU not detecting. And trying to extract features out of this is proving to be extremely hard lol. Will cut my losses, extract the larger features out and do this smaller scale and just merge the branches sooner.

Still need to tweak the final QA usecase to see if we can get CU to properly handle it, but will work on bringing in screen recording capabilities first (as well as in chat viewing of recordings)

- Add recording-intent detector (recording-intent.ts) to trigger recording without requiring QA intent — "record my screen" now works standalone - Route recording-intent tasks to computer_use interaction type - Decouple reportToSessionId from qaMode so video attaches back to chat for both QA and standalone recording sessions - Only apply strictVisualQa guardrails when actually in QA mode - Add allowSelfEnumeration to AccessibilityTreeProviding so CU sessions targeting Vellum itself enumerate the correct window - Auto-detect self-targeted sessions in Session.swift init and configure the enumerator accordingly - Add self-target focus drift detection — warn the agent when frontmost app doesn't match the self-targeted session - Rename finalizeQARecording → finalizeRecording and use shouldFinalizeRecording computed property for clarity - Add conversationId to TaskSubmit IPC contract - Add channel_readiness IPC message types and snapshot tests - Update IPC generated Swift types and snapshot expectations Co-Authored-By: Claude <noreply@anthropic.com>

devin-ai-integration

Devin Review found 1 new potential issue.

View 49 additional findings in Devin Review.

devin-ai-integration · 2026-02-24T22:55:15Z

+  // Vellum (our app — highest priority)
+  {
+    patterns: [/\b(vellum|velly)\s+(desktop\s+)?app\b/, /\b(vellum|velly)\s+assistant\b/, /\bvellum\b/i],
+    appName: 'Vellum',


🟡 APP_HINTS returns appName 'Vellum' but tests and macOS app expect 'Vellum Assistant'

The Vellum entry in APP_HINTS at assistant/src/daemon/target-app-hints.ts:46 sets appName: 'Vellum', but the corresponding tests at assistant/src/__tests__/target-app-hints.test.ts:9,14,19 all expect appName: 'Vellum Assistant'. The function resolveComputerUseTargetAppHint returns { appName: entry.appName, bundleId: entry.bundleId } directly from the table, so the tests will fail.

Root Cause and Impact

The APP_HINTS table entry for Vellum uses appName: 'Vellum' but the actual macOS application name is "Vellum Assistant" (bundle ID com.vellum.vellum-assistant). Three tests explicitly assert appName: 'Vellum Assistant'.

Beyond failing tests, appName propagates through the system:

It's sent as targetAppName in CuSessionCreate and TaskRouted messages

It's injected into the CU system prompt via buildComputerUseSystemPrompt (assistant/src/config/computer-use-prompt.ts:26)

The Swift client's isExternalAppTarget at AppDelegate+Sessions.swift:79 checks both "vellum" and "vellum assistant" — returning 'Vellum' instead of 'Vellum Assistant' could cause incorrect external/internal classification

The ActionExecutor.appAliases maps "vellum" → "Vellum", but the actual .app bundle may be named "Vellum Assistant.app", causing openApp to fail to find the app in filesystem search paths

Suggested change

appName: 'Vellum',

appName: 'Vellum Assistant',

Was this helpful? React with 👍 or 👎 to provide feedback.

Jasonnnz requested a review from siddseethepalli as a code owner February 23, 2026 16:53

Jasonnnz self-assigned this Feb 23, 2026

This comment was marked as resolved.

Sign in to view

Jasonnnz force-pushed the feature/qa-video-automation branch 5 times, most recently from 57ff36f to cf0992c Compare February 23, 2026 21:44

Jasonnnz requested review from alex-nork and awlevin as code owners February 23, 2026 23:28

Jasonnnz and others added 14 commits February 23, 2026 18:43

feat: add CuSessionFinalized IPC message and QA metadata to CuSession…

4df1fee

…Create (#6907) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

feat: implement CU session finalization handler with QA metadata plum…

f04fe3d

…bing (#6909) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

fix: update IPCCuSessionCreate init and add Ms suffix to INT_PATTERNS (…

99d922d

…#6910) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

feat: add file-backed attachment storage and content streaming endpoi…

7a02a9c

…nt (#6911) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

fix: preserve CU metadata until finalization, fix latencyMs type in I…

6de8215

…PCMessages (#6912) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

fix: preserve new columns in legacy migration and clamp range ends pe…

7b4b32d

…r RFC 7233 (#6913) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

feat: add macOS screen recorder, CU session QA integration, and video…

e02dbe8

… drag support (#6916) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

feat: add retention cleanup, config, tests, and architecture docs for…

f955aa4

… QA video (#6919) Co-authored-by: Vellum Assistant <assistant@vellum.ai>

fix: finalize QA recording before abort and use defer for recorder cl…

00041b8

…eanup (#6920) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

fix: clean up cuSessionMetadata on socket disconnect and add NOT NULL…

57595f5

… to storage_kind (#6924) Co-authored-by: Vellum Assistant <assistant@vellum.ai>

fix: clamp cleanup interval to setInterval-safe maximum (#6925)

2a89a60

Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

fix: defer abort message in cancel() for QA mode until after finaliza…

ba6b6b9

…tion (#6926) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

fix: keep session in socketToCuSession until socket close to prevent …

7b3874e

…metadata leak (#6931) Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

fix: add safety-net abort in cancel() for QA mode with delay (#6932)

2cc5a0b

Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude <noreply@anthropic.com>

This comment was marked as resolved.

Sign in to view

Jasonnnz and others added 11 commits February 23, 2026 21:35

This comment was marked as resolved.

Sign in to view

Jasonnnz and others added 3 commits February 23, 2026 22:25

fix: escape backslashes in mdfind Spotlight query sanitization (#7515)

01c86e8

Co-authored-by: Vellum Assistant <assistant@vellum.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

devin-ai-integration Bot reviewed Feb 24, 2026

View reviewed changes

Vellum Assistant and others added 6 commits February 24, 2026 10:30

This comment was marked as resolved.

Sign in to view

devin-ai-integration Bot reviewed Feb 24, 2026

View reviewed changes

Jasonnnz closed this Feb 25, 2026

	patterns: [/\b(vellum\|velly)\s+(desktop\s+)?app\b/, /\b(vellum\|velly)\s+assistant\b/, /\bvellum\b/i],
	patterns: [/\b(vellum\|velly)\s+(desktop\s+)?app\b/, /\b(vellum\|velly)\s+assistant\b/, contextPattern('vellum')],

		sizeBytes: msg.recording.sizeBytes,
		filePath: msg.recording.localPath,

Conversation

Jasonnnz commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Milestone PRs

Fix PRs

Known limitations (deferred to M3/M4)

Project issue

Test plan

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

Jasonnnz commented Feb 23, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

This comment was marked as resolved.

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Jasonnnz commented Feb 24, 2026

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Jasonnnz commented Feb 23, 2026 •

edited

Loading