Merged
4 changes: 4 additions & 0 deletions assistant/src/config/bundled-skills/computer-use/TOOLS.json
@@ -250,6 +250,10 @@
"type": "string",
"description": "The name of the application to open (e.g. \"Slack\", \"Safari\", \"Google Chrome\", \"VS Code\")"
},
"app_bundle_id": {
"type": "string",
"description": "Bundle identifier of the app (e.g. com.apple.Safari). If provided, used for precise app activation."
},
"reasoning": {
"type": "string",
"description": "Explanation of why you need to open or switch to this app"
5 changes: 5 additions & 0 deletions assistant/src/daemon/computer-use-session.ts
@@ -684,6 +684,11 @@ export class ComputerUseSession {
isError: true,
};
}

// Inject targetAppBundleId when the LLM didn't provide one
if (!input.app_bundle_id && this.targetAppBundleId) {
input = { ...input, app_bundle_id: this.targetAppBundleId };
Comment on lines +689 to +690

[P1] Skip target bundle injection for allowed cross-app steps

This defaulting logic runs whenever app_bundle_id is missing, even when taskExplicitlyRequestsCrossApp() has already allowed switching to a different app_name. In cross-app tasks, that means a step like “open Slack” can be rewritten with the target app’s bundle ID, and because openApp resolves bundle IDs before names, the executor activates the wrong app and the workflow gets stuck. Restrict this injection to cases where the requested app is still the target app.

}
Comment on lines +688 to +691

🔴 Target app bundle ID injected for non-target app in cross-app workflows

When a user's task explicitly requests a cross-app workflow (e.g. "Copy from Chrome and paste into Safari"), the guard at lines 670-686 is bypassed because taskExplicitlyRequestsCrossApp() returns true. However, the injection at lines 689-690 still adds this.targetAppBundleId to the input whenever the LLM omitted app_bundle_id, regardless of which app was requested. As a result, the target app's bundle ID is attached to a request to open a different app.

Root Cause and Impact

Consider this scenario:

  • Target app is Safari (com.apple.Safari)
  • User task: "Copy text from Chrome and paste into Safari"
  • LLM calls computer_use_open_app with app_name: "Google Chrome" (no app_bundle_id)
  • The guard is bypassed because taskExplicitlyRequestsCrossApp() is true
  • Line 689: !input.app_bundle_id is true, this.targetAppBundleId is "com.apple.Safari"
  • Line 690: injects app_bundle_id: "com.apple.Safari" into the input
  • On the Swift side at ActionExecutor.swift:350-371, openApp(name: "Google Chrome", bundleId: "com.apple.Safari") is called — bundle ID resolution takes priority, so Safari is activated instead of Chrome.

The fix should only inject the bundle ID when the requested app actually matches the target app (i.e., this.isTargetAppMatch(requestedApp) is true).

Suggested change
- // Inject targetAppBundleId when the LLM didn't provide one
- if (!input.app_bundle_id && this.targetAppBundleId) {
-   input = { ...input, app_bundle_id: this.targetAppBundleId };
- }
+ // Inject targetAppBundleId only when the requested app matches the target app
+ if (!input.app_bundle_id && this.targetAppBundleId && (!requestedApp || this.isTargetAppMatch(requestedApp))) {
+   input = { ...input, app_bundle_id: this.targetAppBundleId };
+ }
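
The Safari/Chrome scenario above can be reproduced outside the session class. The following is a minimal sketch, not the project's implementation: `OpenAppInput`, `TargetApp`, `injectTargetBundleId`, and the case-insensitive name comparison inside `isTargetAppMatch` are hypothetical stand-ins, and the real matcher may behave differently. Only the field names `app_name` and `app_bundle_id` and the shape of the guarded condition come from the suggested change.

```typescript
// Hypothetical input shape; only app_name and app_bundle_id come from the PR.
interface OpenAppInput {
  app_name?: string;
  app_bundle_id?: string;
}

// Hypothetical description of the session's target app.
interface TargetApp {
  name: string;
  bundleId?: string;
}

// Assumption: a case-insensitive name comparison. The real isTargetAppMatch
// in the session class may use different matching rules.
function isTargetAppMatch(requestedApp: string, target: TargetApp): boolean {
  return requestedApp.trim().toLowerCase() === target.name.trim().toLowerCase();
}

// Mirrors the guarded condition from the suggested change: inject the target
// bundle ID only when none was provided AND the requested app is absent or
// matches the target app.
function injectTargetBundleId(input: OpenAppInput, target: TargetApp): OpenAppInput {
  const requestedApp = input.app_name;
  if (
    !input.app_bundle_id &&
    target.bundleId &&
    (!requestedApp || isTargetAppMatch(requestedApp, target))
  ) {
    return { ...input, app_bundle_id: target.bundleId };
  }
  return input;
}

const safari: TargetApp = { name: "Safari", bundleId: "com.apple.Safari" };

// Cross-app step: Chrome is requested, so the Safari bundle ID is not attached.
const chromeStep = injectTargetBundleId({ app_name: "Google Chrome" }, safari);

// Target-app step: the Safari bundle ID is filled in.
const safariStep = injectTargetBundleId({ app_name: "Safari" }, safari);

// An explicitly provided bundle ID is left untouched.
const explicitStep = injectTargetBundleId(
  { app_name: "Safari", app_bundle_id: "com.example.Other" },
  safari,
);
```

With the unguarded version, `chromeStep` would carry `com.apple.Safari`, which is the mismatch the executor then resolves in favor of the bundle ID.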

}

if (toolName === 'computer_use_run_applescript') {
4 changes: 4 additions & 0 deletions assistant/src/tools/computer-use/definitions.ts
@@ -302,6 +302,10 @@ export const computerUseOpenAppTool: Tool = {
type: 'string',
description: 'The name of the application to open (e.g. "Slack", "Safari", "Google Chrome", "VS Code")',
},
app_bundle_id: {
type: 'string',
description: 'Bundle identifier of the app (e.g. com.apple.Safari). If provided, used for precise app activation.',
},
reasoning: {
type: 'string',
description: 'Explanation of why you need to open or switch to this app',
@@ -235,7 +235,7 @@ final class ActionExecutor: ActionExecuting {
try drag(from: CGPoint(x: fromX, y: fromY), to: CGPoint(x: endX, y: endY))
case .openApp:
guard let appName = action.appName else { throw ExecutorError.appNotFound("(no name)") }
- try await openApp(name: appName)
+ try await openApp(name: appName, bundleId: action.appBundleId)

[P1] Enforce target-app guard on bundle ID activations

Passing action.appBundleId directly into openApp introduces a scope bypass: the daemon’s fail-closed check only validates app_name, but bundle ID is the first resolver in openApp. A model step can therefore use a target-matching app_name plus a different app_bundle_id to activate another app in sessions that are supposed to be single-app constrained.
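
One way to close this gap, sketched here under assumptions, is to have the daemon-side check validate `app_bundle_id` as well as `app_name` and reject any mismatch. `OpenAppRequest`, `SessionScope`, `GuardResult`, and `checkOpenAppScope` are hypothetical names, and the exact-match comparison of bundle IDs is an assumption rather than the project's actual rule.

```typescript
// Hypothetical request shape; only app_name and app_bundle_id come from the PR.
interface OpenAppRequest {
  app_name?: string;
  app_bundle_id?: string;
}

// Hypothetical view of the session's single-app constraint.
interface SessionScope {
  targetAppName: string;
  targetAppBundleId?: string;
  crossAppAllowed: boolean; // stands in for taskExplicitlyRequestsCrossApp()
}

type GuardResult = { allowed: true } | { allowed: false; reason: string };

// Fail-closed check: in a single-app session, both the name and any supplied
// bundle ID must point at the target app.
function checkOpenAppScope(req: OpenAppRequest, scope: SessionScope): GuardResult {
  if (scope.crossAppAllowed) {
    return { allowed: true };
  }

  // Assumption: case-insensitive name comparison.
  const nameMatches =
    req.app_name !== undefined &&
    req.app_name.trim().toLowerCase() === scope.targetAppName.trim().toLowerCase();
  if (!nameMatches) {
    return { allowed: false, reason: "app_name does not match the target app" };
  }

  // A supplied bundle ID must equal the known target bundle ID. If the target
  // bundle ID is unknown, a supplied one cannot be verified, so it is rejected.
  if (req.app_bundle_id !== undefined && req.app_bundle_id !== scope.targetAppBundleId) {
    return { allowed: false, reason: "app_bundle_id does not match the target app" };
  }

  return { allowed: true };
}

const singleApp: SessionScope = {
  targetAppName: "Safari",
  targetAppBundleId: "com.apple.Safari",
  crossAppAllowed: false,
};

// Matching name plus a different bundle ID: the bypass described above.
const bypassAttempt = checkOpenAppScope(
  { app_name: "Safari", app_bundle_id: "com.google.Chrome" },
  singleApp,
);

// Matching name and matching bundle ID.
const legitimate = checkOpenAppScope(
  { app_name: "Safari", app_bundle_id: "com.apple.Safari" },
  singleApp,
);
```

An alternative with the same effect would be to drop a model-supplied `app_bundle_id` in single-app sessions and always substitute the session's own value.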

case .runAppleScript:
guard let source = action.script else { throw ExecutorError.appleScriptMissingScript }
return try await runAppleScript(source)
3 changes: 3 additions & 0 deletions clients/macos/vellum-assistant/ComputerUse/ActionTypes.swift
@@ -36,6 +36,7 @@ struct AgentAction: Codable {
var summary: String?
var waitDuration: Int?
var appName: String?
var appBundleId: String?
var script: String?
var reasoning: String
var resolvedFromElementId: Int?
@@ -56,6 +57,7 @@ struct AgentAction: Codable {
summary: String? = nil,
waitDuration: Int? = nil,
appName: String? = nil,
appBundleId: String? = nil,
script: String? = nil,
resolvedFromElementId: Int? = nil,
resolvedToElementId: Int? = nil,
@@ -74,6 +76,7 @@ struct AgentAction: Codable {
self.summary = summary
self.waitDuration = waitDuration
self.appName = appName
self.appBundleId = appBundleId
self.script = script
self.resolvedFromElementId = resolvedFromElementId
self.resolvedToElementId = resolvedToElementId
3 changes: 3 additions & 0 deletions clients/macos/vellum-assistant/ComputerUse/Session.swift
@@ -912,6 +912,8 @@ final class ComputerUseSession: ObservableObject {
?? extractInt(from: msg.input, key: "wait_duration")
let appName = msg.input["app_name"]?.value as? String
?? msg.input["appName"]?.value as? String
let appBundleId = msg.input["app_bundle_id"]?.value as? String
?? msg.input["appBundleId"]?.value as? String
let script = msg.input["script"]?.value as? String
let elementId = extractInt(from: msg.input, key: "element_id")
?? extractInt(from: msg.input, key: "elementId")
@@ -934,6 +936,7 @@ final class ComputerUseSession: ObservableObject {
summary: summary,
waitDuration: waitDuration,
appName: appName,
appBundleId: appBundleId,
script: script,
resolvedFromElementId: elementId,
resolvedToElementId: toElementId,