feat: strict target-app guard + hardened openApp resolution by Jasonnnz · Pull Request #7281 · vellum-ai/vellum-assistant

Jasonnnz · 2026-02-23T23:30:08Z

Summary

Replace narrow Slack/Notion confusion blocking with fail-closed non-target blocking in CU sessions
Add cross-app escape hatch for tasks that explicitly request cross-app workflows
Harden openApp resolution: bundle-id first, then fuzzy name, then alias, then filesystem
Return structured errors to prevent click-fallback drift
Add Vellum naming variant aliases
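
The resolution order listed above (bundle-id, then fuzzy name, then alias, then filesystem) could be sketched roughly as follows. This is a TypeScript illustration only: the real logic lives in ActionExecutor.swift and is not shown in this PR page, so `resolveApp`, `AppRecord`, `Resolution`, the normalizer, and the `com.example.vellum` bundle id are all assumptions made for the example, and the filesystem step is replaced by a plain list of installed apps.

```typescript
// Illustrative sketch only; every name below is hypothetical, not from the PR.
type ResolutionPath = "bundle_id" | "fuzzy_name" | "alias" | "filesystem";

type Resolution =
  | { ok: true; appName: string; via: ResolutionPath }
  | { ok: false; error: "app_not_found"; requested: string };

interface AppRecord {
  bundleId: string;
  name: string;
}

// Assumed normalizer: lowercase and drop everything except a-z and 0-9.
function normalizeAppName(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

function resolveApp(
  requested: string,
  running: AppRecord[],
  aliases: Map<string, string>,
  installed: AppRecord[],
): Resolution {
  const found = (app: AppRecord, via: ResolutionPath): Resolution => ({
    ok: true,
    appName: app.name,
    via,
  });

  // 1. Exact bundle-id match among running apps.
  const byBundle = running.find((a) => a.bundleId === requested);
  if (byBundle) return found(byBundle, "bundle_id");

  const wanted = normalizeAppName(requested);
  if (wanted.length > 0) {
    // 2. Fuzzy (substring) name match among running apps.
    const fuzzy = running.find((a) => {
      const n = normalizeAppName(a.name);
      return n.length > 0 && (n.includes(wanted) || wanted.includes(n));
    });
    if (fuzzy) return found(fuzzy, "fuzzy_name");

    // 3. Alias table: normalized alias -> canonical app name.
    const canonical = aliases.get(wanted);
    const aliased = installed.find((a) => a.name === canonical);
    if (aliased) return found(aliased, "alias");

    // 4. Stand-in for the filesystem search: exact normalized name on disk.
    const onDisk = installed.find((a) => normalizeAppName(a.name) === wanted);
    if (onDisk) return found(onDisk, "filesystem");
  }

  // Structured failure instead of a silent fallback.
  return { ok: false, error: "app_not_found", requested };
}

const installedApps: AppRecord[] = [
  { bundleId: "com.example.vellum", name: "Vellum Assistant" },
];
const aliasTable = new Map<string, string>([["velly", "Vellum Assistant"]]);
console.log(resolveApp("Velly", [], aliasTable, installedApps));
```

Returning the path taken (`via`) alongside the result is one way to support the "log resolution path" goal mentioned in the commit message; the actual logging format is not shown in the source.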

Files changed

assistant/src/daemon/computer-use-session.ts — strict target guard
clients/macos/vellum-assistant/ComputerUse/ActionExecutor.swift — hardened openApp

Test plan

Unit: target guard blocks non-target apps
Unit: openApp bundle-id/name/alias resolution paths
Manual: "test Vellum desktop app" stays in Vellum or fails with structured error

🤖 Generated with Claude Code

Replace narrow Slack/Notion confusion blocking with fail-closed non-target blocking in CU sessions.

When a target app is set, ALL non-matching open_app and run_applescript activations are blocked unless the user's original task text explicitly requests cross-app work (e.g. "copy from Chrome and paste into Vellum").

Harden openApp resolution: bundle-id first, then fuzzy name match, then alias table, then filesystem search.
Add Vellum naming variant aliases ("Vellum" / "Velly" -> "Vellum Assistant").
Return structured error messages (app_not_found / app_mismatch) to prevent click-fallback drift.
Log resolution path for observability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
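
The fail-closed rule in the commit message can be pictured with a small TypeScript sketch. The function name `checkAppActivation`, the `GuardResult` shape, and the case-insensitive name comparison are assumptions for illustration; only the `app_mismatch` error label is borrowed from the commit message, and the source does not show which component actually emits it.

```typescript
// Hypothetical guard; names and result shape are not taken from the PR diff.
type GuardResult =
  | { allowed: true }
  | { allowed: false; error: "app_mismatch"; target: string; requested: string };

function checkAppActivation(
  requestedApp: string,
  targetApp: string | undefined,
  taskRequestsCrossApp: boolean,
): GuardResult {
  // No target app set: the guard does not apply.
  if (targetApp === undefined) return { allowed: true };

  // Escape hatch: the original task text explicitly asked for cross-app work.
  if (taskRequestsCrossApp) return { allowed: true };

  // Assumed comparison: trimmed, case-insensitive name equality.
  const same =
    requestedApp.trim().toLowerCase() === targetApp.trim().toLowerCase();
  if (same) return { allowed: true };

  // Fail closed: anything that is not the target is blocked with a
  // structured error rather than silently ignored.
  return {
    allowed: false,
    error: "app_mismatch",
    target: targetApp,
    requested: requestedApp,
  };
}

console.log(checkAppActivation("Slack", "Vellum Assistant", false));
```

The point of the structured result is that the caller can report a specific reason to the model instead of letting it fall back to clicking around, which is the "click-fallback drift" the PR description refers to.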

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6cc809adb8

chatgpt-codex-connector · 2026-02-23T23:33:12Z

+      /\bopen\s+\w+.*\band\s+(then\s+)?open\b/,
+      /\bdrag\s+from\s+\w+.*\bto\s+\w+/,
+      /\bmove\s+.*\bto\s+\w+/,
+      /\bfrom\s+\w+.*\b(into|to)\s+\w+/,


Tighten cross-app escape matching to explicit app switches

The new escape hatch pattern from ... to ... treats many single-app tasks as cross-app (for example, moving content “from column A to column B”), so taskExplicitlyRequestsCrossApp() can return true even when the user never asked to leave the target app. When that happens, the non-target open_app/AppleScript guard is disabled and the model can drift into other apps despite target scoping. Restrict this heuristic to explicit app-switch intent (or explicit app identifiers) before bypassing the guard.
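
A quick check makes the false positive concrete. The pattern is copied from the diff hunk above; the two task strings are made up for illustration.

```typescript
// Pattern copied from the diff hunk; the task strings are illustrative.
const fromToPattern = /\bfrom\s+\w+.*\b(into|to)\s+\w+/;

const singleAppTask = "move the totals from column A to column B";
const crossAppTask = "copy from Chrome and paste into Vellum";

// Both tasks match, so the pattern alone cannot tell a spreadsheet edit
// apart from a genuine cross-app request.
console.log(fromToPattern.test(singleAppTask)); // true
console.log(fromToPattern.test(crossAppTask)); // true
```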

chatgpt-codex-connector · 2026-02-23T23:33:12Z

+                || normalized.contains(normalizedName)
+                || normalizedName.contains(normalized)


Guard fuzzy app matching against empty normalized input

The fuzzy matcher compares normalized.contains(normalizedName), which is always true when normalizedName is empty (e.g., app_name is whitespace or punctuation). In that case openApp activates the first running app instead of failing with app_not_found, creating arbitrary app switches from malformed tool input. Add a non-empty check before running substring-based fuzzy matches.
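
The same pitfall exists in JavaScript, where `String.prototype.includes("")` is always true, so the issue and the suggested guard can be sketched in TypeScript. The normalizer below is an assumption (the body of the Swift `normalizeAppName` is not shown in this PR page), and the function names are hypothetical.

```typescript
// Assumed normalizer: lowercase and drop everything except a-z and 0-9.
function normalizeName(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Mirrors the unguarded substring logic described in the review.
function fuzzyMatchUnguarded(
  requested: string,
  runningNames: string[],
): string | undefined {
  const wanted = normalizeName(requested);
  return runningNames.find((name) => {
    const candidate = normalizeName(name);
    return (
      candidate === wanted ||
      candidate.includes(wanted) || // "finder".includes("") is true
      wanted.includes(candidate)
    );
  });
}

// Guarded variant: refuse to substring-match on an empty needle.
function fuzzyMatchGuarded(
  requested: string,
  runningNames: string[],
): string | undefined {
  const wanted = normalizeName(requested);
  if (wanted.length === 0) return undefined;
  return runningNames.find((name) => {
    const candidate = normalizeName(name);
    // Extra precaution not mentioned in the review: an empty candidate
    // would make wanted.includes(candidate) true as well.
    if (candidate.length === 0) return false;
    return candidate.includes(wanted) || wanted.includes(candidate);
  });
}

const runningApps = ["Finder", "Slack", "Vellum Assistant"];
console.log(fuzzyMatchUnguarded("---", runningApps)); // "Finder"
console.log(fuzzyMatchGuarded("---", runningApps)); // undefined
```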

devin-ai-integration

Devin Review found 2 potential issues.

devin-ai-integration · 2026-02-23T23:33:15Z

+    const crossAppPatterns = [
+      /\bcopy\s+from\s+\w+.*\bpaste\s+(in|into|to)\b/,
+      /\bswitch\s+to\s+\w+/,
+      /\bopen\s+\w+.*\band\s+(then\s+)?open\b/,
+      /\bdrag\s+from\s+\w+.*\bto\s+\w+/,
+      /\bmove\s+.*\bto\s+\w+/,
+      /\bfrom\s+\w+.*\b(into|to)\s+\w+/,
+      /\buse\s+\w+.*\band\s+\w+/,
+    ];
+    return crossAppPatterns.some((p) => p.test(t));


🔴 Cross-app escape hatch patterns are overly broad, defeating the fail-closed guard

The taskExplicitlyRequestsCrossApp() method uses regex patterns that match extremely common single-app task descriptions, effectively disabling the fail-closed non-target app guard for most real-world tasks.

Root Cause and Impact

Several patterns are far too broad:

/\bswitch\s+to\s+\w+/ matches "switch to dark mode", "switch to the settings tab", "switch to the compose view"

/\bmove\s+.*\bto\s+\w+/ matches "move the file to trash", "move the cursor to the end", "move the window to the right"

/\buse\s+\w+.*\band\s+\w+/ matches "use bold and italic", "use the dropdown and select"

/\bfrom\s+\w+.*\b(into|to)\s+\w+/ matches "from the menu go to preferences", "from settings navigate to general"

Since taskExplicitlyRequestsCrossApp() is the only escape hatch checked before blocking non-target app activations (lines 485 and 507), and it returns true for most natural-language task descriptions, the fail-closed guard is effectively a no-op for the majority of tasks. This means the model can still switch to non-target apps (the exact behavior this PR is supposed to prevent).

Impact: The core security improvement of this PR — blocking all non-target app activations — is undermined. The guard will only work for very simple task descriptions that don't contain common verbs like "switch", "move", "use", or "from".

Prompt for agents

In assistant/src/daemon/computer-use-session.ts, the crossAppPatterns array at lines 259-267 in taskExplicitlyRequestsCrossApp() needs to be significantly tightened. The patterns should require explicit mention of app names or at least app-specific context, not just generic verbs. For example:

1. Remove or heavily restrict the /\bswitch\s+to\s+\w+/ pattern: it matches 'switch to dark mode'. Consider requiring it to match known app names.
2. Remove or heavily restrict /\bmove\s+.*\bto\s+\w+/: it matches 'move the file to trash'.
3. Remove or heavily restrict /\buse\s+\w+.*\band\s+\w+/: it matches 'use bold and italic'.
4. Remove or heavily restrict /\bfrom\s+\w+.*\b(into|to)\s+\w+/: it matches 'from the menu go to preferences'.

A better approach would be to look for explicit app name mentions (e.g., 'copy from Chrome and paste into Vellum') by maintaining a list of known app names and checking if the task mentions two different ones. Alternatively, require patterns to include app-like proper nouns (capitalized words) in the cross-app positions.
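
The "two different known app names" idea could look something like the sketch below. The `KNOWN_APPS` list and both function names are hypothetical, and the sketch only handles single-word app names; it is not the fix that landed in #7292, whose contents are not shown here.

```typescript
// Hypothetical allowlist; a real list would come from configuration or
// from the set of installed apps.
const KNOWN_APPS: string[] = [
  "chrome",
  "safari",
  "slack",
  "notion",
  "vellum",
  "finder",
];

// Collect the known app names that appear as whole words in the task.
function mentionedApps(
  task: string,
  knownApps: string[] = KNOWN_APPS,
): Set<string> {
  const words = new Set(
    task
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((w) => w.length > 0),
  );
  return new Set(knownApps.filter((app) => words.has(app)));
}

// Treat a task as cross-app only when it names two different known apps.
function taskMentionsTwoApps(task: string): boolean {
  return mentionedApps(task).size >= 2;
}

console.log(taskMentionsTwoApps("copy from Chrome and paste into Vellum")); // true
console.log(taskMentionsTwoApps("switch to dark mode")); // false
console.log(taskMentionsTwoApps("move the file to trash")); // false
```

Compared with the verb-based regexes, this errs toward keeping the guard on: a task that names an unlisted app is treated as single-app, which matches the fail-closed intent at the cost of occasionally blocking a legitimate cross-app request.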

devin-ai-integration · 2026-02-23T23:33:16Z

+        let normalizedName = Self.normalizeAppName(name)
+        if let runningApp = workspace.runningApplications.first(where: { app in
+            guard let localizedName = app.localizedName else { return false }
+            let normalized = Self.normalizeAppName(localizedName)
+            return normalized == normalizedName
+                || normalized.contains(normalizedName)
+                || normalizedName.contains(normalized)
        }) {


🟡 Fuzzy name matching activates wrong app when normalized name is empty

In openApp, if the name parameter contains only non-alphanumeric characters (e.g. "---"), normalizeAppName returns an empty string, and String.contains("") returns true in Swift, causing the first running application to be activated.

Detailed Explanation

At clients/macos/vellum-assistant/ComputerUse/ActionExecutor.swift:374-381, the fuzzy matching logic is:

let normalizedName = Self.normalizeAppName(name)  // could be ""
if let runningApp = workspace.runningApplications.first(where: { app in
    guard let localizedName = app.localizedName else { return false }
    let normalized = Self.normalizeAppName(localizedName)
    return normalized == normalizedName
        || normalized.contains(normalizedName)  // "anything".contains("") == true
        || normalizedName.contains(normalized)
}) {

When normalizedName is "", the expression normalized.contains(normalizedName) evaluates to true for every running app with a non-nil localizedName. This means the first running application in the list gets activated, which is arbitrary and incorrect.

Impact: While unlikely to be triggered by normal model output, if the model sends a malformed app name, an arbitrary running application would be brought to the foreground instead of returning an error.

Suggested change

-        let normalizedName = Self.normalizeAppName(name)
-        if let runningApp = workspace.runningApplications.first(where: { app in
-            guard let localizedName = app.localizedName else { return false }
-            let normalized = Self.normalizeAppName(localizedName)
-            return normalized == normalizedName
-                || normalized.contains(normalizedName)
-                || normalizedName.contains(normalized)
-        }) {
+        // 2. Normalized/fuzzy name matching against running apps
+        let normalizedName = Self.normalizeAppName(name)
+        if !normalizedName.isEmpty, let runningApp = workspace.runningApplications.first(where: { app in
+            guard let localizedName = app.localizedName else { return false }
+            let normalized = Self.normalizeAppName(localizedName)
+            return normalized == normalizedName
+                || normalized.contains(normalizedName)
+                || normalizedName.contains(normalized)
+        }) {

Jasonnnz · 2026-02-23T23:42:11Z

Addressed in #7292

Replace narrow Slack/Notion confusion blocking with fail-closed non-target blocking in CU sessions.

When a target app is set, ALL non-matching open_app and run_applescript activations are blocked unless the user's original task text explicitly requests cross-app work (e.g. "copy from Chrome and paste into Vellum").

Harden openApp resolution: bundle-id first, then fuzzy name match, then alias table, then filesystem search.
Add Vellum naming variant aliases ("Vellum" / "Velly" -> "Vellum Assistant").
Return structured error messages (app_not_found / app_mismatch) to prevent click-fallback drift.
Log resolution path for observability.

Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Jasonnnz self-assigned this Feb 23, 2026

Jasonnnz merged commit cef1c9b into feature/qa-video-automation Feb 23, 2026

Jasonnnz deleted the pr1-app-target-integrity branch February 23, 2026 23:30

chatgpt-codex-connector Bot reviewed Feb 23, 2026

devin-ai-integration Bot reviewed Feb 23, 2026

Jasonnnz mentioned this pull request Feb 23, 2026

fix: tighten cross-app escape hatch + guard empty fuzzy match #7292

Merged

4 tasks

Jasonnnz merged 1 commit into feature/qa-video-automation from pr1-app-target-integrity
