feat: strict target-app guard + hardened openApp resolution #7281
Replace narrow Slack/Notion confusion blocking with fail-closed
non-target blocking in CU sessions. When a target app is set, ALL
non-matching open_app and run_applescript activations are blocked
unless the user's original task text explicitly requests cross-app
work (e.g. "copy from Chrome and paste into Vellum").
Harden openApp resolution: bundle-id first, then fuzzy name match,
then alias table, then filesystem search. Add Vellum naming variant
aliases ("Vellum" / "Velly" -> "Vellum Assistant"). Return structured
error messages (app_not_found / app_mismatch) to prevent click-fallback
drift. Log resolution path for observability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
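The resolution order described in the commit message (bundle id, then fuzzy name match, then alias table, then filesystem search) could look roughly like the sketch below. This is a TypeScript illustration only: the real openApp lives in Swift, and every name here (`resolveApp`, `InstalledApp`, `ALIASES`, the `normalize` rule) is hypothetical. The meaning given to `app_mismatch` (the resolved app differs from the session target) is also an assumption, and the "filesystem" step is stood in for by an exact-name lookup over a list of installed apps.

```typescript
// Hypothetical types and names; none of these come from the PR itself.
type ResolutionPath = "bundle_id" | "fuzzy_name" | "alias" | "filesystem";

type OpenAppResult =
  | { ok: true; app: string; path: ResolutionPath }
  | { ok: false; error: "app_not_found" | "app_mismatch"; message: string };

interface InstalledApp {
  name: string;
  bundleId: string;
  running: boolean;
}

// Alias table keyed by normalized name (the variants named in the commit message).
const ALIASES = new Map<string, string>([
  ["vellum", "Vellum Assistant"],
  ["velly", "Vellum Assistant"],
]);

// Assumed normalization: lowercase and drop everything except letters and digits.
function normalize(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

function resolveApp(query: string, apps: InstalledApp[], target?: string): OpenAppResult {
  const q = normalize(query);
  const notFound: OpenAppResult = {
    ok: false,
    error: "app_not_found",
    message: `No application matches "${query}"`,
  };
  // An empty normalized query must never reach the substring match below.
  if (q === "") return notFound;

  const steps: Array<[ResolutionPath, () => InstalledApp | undefined]> = [
    ["bundle_id", () => apps.find((a) => a.bundleId === query)],
    ["fuzzy_name", () => apps.find((a) => a.running && normalize(a.name).includes(q))],
    ["alias", () => {
      const full = ALIASES.get(q);
      return full === undefined ? undefined : apps.find((a) => a.name === full);
    }],
    // Stand-in for a filesystem search: exact normalized match over installed apps.
    ["filesystem", () => apps.find((a) => normalize(a.name) === q)],
  ];

  for (const [path, step] of steps) {
    const hit = step();
    if (hit === undefined) continue;
    // Assumed semantics: report a mismatch instead of silently opening another app.
    if (target !== undefined && normalize(hit.name) !== normalize(target)) {
      return {
        ok: false,
        error: "app_mismatch",
        message: `"${hit.name}" is not the target app "${target}"`,
      };
    }
    // The path is returned so the caller can log which step resolved the app.
    return { ok: true, app: hit.name, path };
  }
  return notFound;
}
```

Returning the error as a tagged value rather than a free-form string gives the caller something to branch on, which is presumably what lets the model stop instead of falling back to clicking.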
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6cc809adb8
    /\bopen\s+\w+.*\band\s+(then\s+)?open\b/,
    /\bdrag\s+from\s+\w+.*\bto\s+\w+/,
    /\bmove\s+.*\bto\s+\w+/,
    /\bfrom\s+\w+.*\b(into|to)\s+\w+/,
Tighten cross-app escape matching to explicit app switches
The new escape hatch pattern from ... to ... treats many single-app tasks as cross-app (for example, moving content “from column A to column B”), so taskExplicitlyRequestsCrossApp() can return true even when the user never asked to leave the target app. When that happens, the non-target open_app/AppleScript guard is disabled and the model can drift into other apps despite target scoping. Restrict this heuristic to explicit app-switch intent (or explicit app identifiers) before bypassing the guard.
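To make the interaction concrete, here is a minimal sketch of a fail-closed guard in which the cross-app check is the only bypass. The function and type names (`checkActivation`, `Activation`, `GuardDecision`) are hypothetical, and the behavior when no target app is set is an assumption drawn from the commit message rather than from the actual code.

```typescript
// Hypothetical shapes; the real session code may differ.
type Activation = { tool: "open_app" | "run_applescript"; app: string };
type GuardDecision = { allowed: true } | { allowed: false; reason: string };

function sameApp(a: string, b: string): boolean {
  return a.trim().toLowerCase() === b.trim().toLowerCase();
}

function checkActivation(
  activation: Activation,
  targetApp: string | undefined,
  taskRequestsCrossApp: boolean,
): GuardDecision {
  // Assumption: with no target app set, the guard is not in effect.
  if (targetApp === undefined) return { allowed: true };
  // Activating the target itself is always fine.
  if (sameApp(activation.app, targetApp)) return { allowed: true };
  // The single escape hatch: if this flag is true too often, the guard is a no-op.
  if (taskRequestsCrossApp) return { allowed: true };
  // Fail closed: anything else is blocked.
  return {
    allowed: false,
    reason: `${activation.tool} for "${activation.app}" blocked: session is scoped to "${targetApp}"`,
  };
}
```

Because every non-target activation funnels through the one boolean, a false positive in the cross-app heuristic translates directly into an unguarded app switch.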
        || normalized.contains(normalizedName)
        || normalizedName.contains(normalized)
Guard fuzzy app matching against empty normalized input
The fuzzy matcher compares normalized.contains(normalizedName), which is always true when normalizedName is empty (e.g., app_name is whitespace or punctuation). In that case openApp activates the first running app instead of failing with app_not_found, creating arbitrary app switches from malformed tool input. Add a non-empty check before running substring-based fuzzy matches.
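The same pitfall exists in JavaScript and TypeScript, where `"finder".includes("")` is `true`, so a TypeScript analog can illustrate the guard. This is not the PR's Swift code: `normalizeAppName` here is an assumed implementation (lowercase, alphanumerics only) and `fuzzyMatch` is a hypothetical helper. The sketch also guards the reverse direction, since a running app whose own name normalizes to empty would satisfy the second substring check.

```typescript
// Assumed normalization rule; the real Swift helper may behave differently.
function normalizeAppName(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Returns the first running app name that fuzzily matches, or undefined.
function fuzzyMatch(runningNames: string[], requested: string): string | undefined {
  const wanted = normalizeAppName(requested);
  // Without this check, includes("") below would match every running app.
  if (wanted === "") return undefined;
  return runningNames.find((name) => {
    const candidate = normalizeAppName(name);
    // An empty candidate would make wanted.includes(candidate) trivially true.
    if (candidate === "") return false;
    return (
      candidate === wanted ||
      candidate.includes(wanted) ||
      wanted.includes(candidate)
    );
  });
}
```

With the check in place, a request such as `"---"` falls through to the later resolution steps (and eventually a not-found error) instead of activating whichever app happens to be first in the list.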
    const crossAppPatterns = [
      /\bcopy\s+from\s+\w+.*\bpaste\s+(in|into|to)\b/,
      /\bswitch\s+to\s+\w+/,
      /\bopen\s+\w+.*\band\s+(then\s+)?open\b/,
      /\bdrag\s+from\s+\w+.*\bto\s+\w+/,
      /\bmove\s+.*\bto\s+\w+/,
      /\bfrom\s+\w+.*\b(into|to)\s+\w+/,
      /\buse\s+\w+.*\band\s+\w+/,
    ];
    return crossAppPatterns.some((p) => p.test(t));
🔴 Cross-app escape hatch patterns are overly broad, defeating the fail-closed guard
The taskExplicitlyRequestsCrossApp() method uses regex patterns that match extremely common single-app task descriptions, effectively disabling the fail-closed non-target app guard for most real-world tasks.
Root Cause and Impact
Several patterns are far too broad:
- /\bswitch\s+to\s+\w+/ matches "switch to dark mode", "switch to the settings tab", "switch to the compose view"
- /\bmove\s+.*\bto\s+\w+/ matches "move the file to trash", "move the cursor to the end", "move the window to the right"
- /\buse\s+\w+.*\band\s+\w+/ matches "use bold and italic", "use the dropdown and select"
- /\bfrom\s+\w+.*\b(into|to)\s+\w+/ matches "from the menu go to preferences", "from settings navigate to general"
Since taskExplicitlyRequestsCrossApp() is the only escape hatch checked before blocking non-target app activations (lines 485 and 507), and it returns true for most natural-language task descriptions, the fail-closed guard is effectively a no-op for the majority of tasks. This means the model can still switch to non-target apps (the exact behavior this PR is supposed to prevent).
Impact: The core security improvement of this PR — blocking all non-target app activations — is undermined. The guard will only work for very simple task descriptions that don't contain common verbs like "switch", "move", "use", or "from".
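The false positives are easy to reproduce. The sketch below copies the seven patterns from the diff into a standalone function and runs them against single-app task descriptions taken from this review. Lowercasing the task before matching is an assumption about how `t` is derived in the real method.

```typescript
// Patterns copied from the diff under review.
const crossAppPatterns: RegExp[] = [
  /\bcopy\s+from\s+\w+.*\bpaste\s+(in|into|to)\b/,
  /\bswitch\s+to\s+\w+/,
  /\bopen\s+\w+.*\band\s+(then\s+)?open\b/,
  /\bdrag\s+from\s+\w+.*\bto\s+\w+/,
  /\bmove\s+.*\bto\s+\w+/,
  /\bfrom\s+\w+.*\b(into|to)\s+\w+/,
  /\buse\s+\w+.*\band\s+\w+/,
];

function taskExplicitlyRequestsCrossApp(task: string): boolean {
  // Assumption: the real method normalizes the task text in a similar way.
  const t = task.toLowerCase();
  return crossAppPatterns.some((p) => p.test(t));
}

// Single-app tasks that never ask to leave the target app.
const singleAppTasks = [
  "switch to dark mode",
  "move the file to trash",
  "use bold and italic",
  "from the menu go to preferences",
  "moving content from column A to column B",
];

for (const task of singleAppTasks) {
  console.log(`${task} -> ${taskExplicitlyRequestsCrossApp(task)}`);
}
```

Every task in the list is reported as cross-app, so for each of them the non-target guard would be bypassed.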
Prompt for agents
In assistant/src/daemon/computer-use-session.ts, the crossAppPatterns array at lines 259-267 in taskExplicitlyRequestsCrossApp() needs to be significantly tightened. The patterns should require explicit mention of app names or at least app-specific context, not just generic verbs. For example:
1. Remove or heavily restrict the /\bswitch\s+to\s+\w+/ pattern — it matches 'switch to dark mode'. Consider requiring it to match known app names.
2. Remove or heavily restrict /\bmove\s+.*\bto\s+\w+/ — it matches 'move the file to trash'.
3. Remove or heavily restrict /\buse\s+\w+.*\band\s+\w+/ — it matches 'use bold and italic'.
4. Remove or heavily restrict /\bfrom\s+\w+.*\b(into|to)\s+\w+/ — it matches 'from the menu go to preferences'.
A better approach would be to look for explicit app name mentions (e.g., 'copy from Chrome and paste into Vellum') by maintaining a list of known app names and checking if the task mentions two different ones. Alternatively, require patterns to include app-like proper nouns (capitalized words) in the cross-app positions.
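A minimal sketch of the known-app-names idea follows. The app list, the alias-to-canonical mapping, and the function names are all hypothetical; a real implementation would presumably draw the names from installed or running applications instead of a hard-coded table.

```typescript
// Hypothetical table: each mention maps to a canonical app, so aliases count once.
const KNOWN_APP_MENTIONS = new Map<string, string>([
  ["chrome", "Google Chrome"],
  ["safari", "Safari"],
  ["slack", "Slack"],
  ["notion", "Notion"],
  ["vellum", "Vellum Assistant"],
  ["velly", "Vellum Assistant"],
]);

// Escape regex metacharacters so app names with "." or "+" are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Collect the distinct canonical apps mentioned as whole words in the task.
function mentionedApps(task: string): Set<string> {
  const t = task.toLowerCase();
  const found = new Set<string>();
  for (const [mention, canonical] of KNOWN_APP_MENTIONS) {
    if (new RegExp(`\\b${escapeRegExp(mention)}\\b`).test(t)) {
      found.add(canonical);
    }
  }
  return found;
}

// Cross-app intent requires at least two different apps to be named.
function taskNamesTwoApps(task: string): boolean {
  return mentionedApps(task).size >= 2;
}
```

This keeps "copy from Chrome and paste into Vellum" as a cross-app task while rejecting "switch to dark mode". App names that double as ordinary words (for example "Notes" or "Mail") would still need extra care.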
    let normalizedName = Self.normalizeAppName(name)
    if let runningApp = workspace.runningApplications.first(where: { app in
        guard let localizedName = app.localizedName else { return false }
        let normalized = Self.normalizeAppName(localizedName)
        return normalized == normalizedName
            || normalized.contains(normalizedName)
            || normalizedName.contains(normalized)
    }) {
🟡 Fuzzy name matching activates wrong app when normalized name is empty
In openApp, if the name parameter contains only non-alphanumeric characters (e.g. "---"), normalizeAppName returns an empty string, and String.contains("") returns true in Swift, causing the first running application to be activated.
Detailed Explanation
At clients/macos/vellum-assistant/ComputerUse/ActionExecutor.swift:374-381, the fuzzy matching logic is:
    let normalizedName = Self.normalizeAppName(name)  // could be ""
    if let runningApp = workspace.runningApplications.first(where: { app in
        guard let localizedName = app.localizedName else { return false }
        let normalized = Self.normalizeAppName(localizedName)
        return normalized == normalizedName
            || normalized.contains(normalizedName)  // "anything".contains("") == true
            || normalizedName.contains(normalized)
    }) {

When normalizedName is "", the expression normalized.contains(normalizedName) evaluates to true for every running app with a non-nil localizedName. This means the first running application in the list gets activated, which is arbitrary and incorrect.
Impact: While unlikely to be triggered by normal model output, if the model sends a malformed app name, an arbitrary running application would be brought to the foreground instead of returning an error.
    - let normalizedName = Self.normalizeAppName(name)
    - if let runningApp = workspace.runningApplications.first(where: { app in
    -     guard let localizedName = app.localizedName else { return false }
    -     let normalized = Self.normalizeAppName(localizedName)
    -     return normalized == normalizedName
    -         || normalized.contains(normalizedName)
    -         || normalizedName.contains(normalized)
    - }) {
    + // 2. Normalized/fuzzy name matching against running apps
    + let normalizedName = Self.normalizeAppName(name)
    + if !normalizedName.isEmpty, let runningApp = workspace.runningApplications.first(where: { app in
    +     guard let localizedName = app.localizedName else { return false }
    +     let normalized = Self.normalizeAppName(localizedName)
    +     return normalized == normalizedName
    +         || normalized.contains(normalizedName)
    +         || normalizedName.contains(normalized)
    + }) {
Addressed in #7292
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Files changed
- assistant/src/daemon/computer-use-session.ts: strict target guard
- clients/macos/vellum-assistant/ComputerUse/ActionExecutor.swift: hardened openApp
Test plan
🤖 Generated with Claude Code