You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
fix(browser_action): align tool definition with implementation
- Change coordinate from object to string format 'x,y@WIDTHxHEIGHT'
- Change size from object to string format 'WIDTHxHEIGHT'
- Update required params to only include 'action' (others are conditionally required)
- Add missing 'press' action to enum
- Update text parameter description to clarify usage for both type and press actions
Fixes issue where assistant was correctly following schema but receiving 'Missing value for required parameter' errors.
Copy file name to clipboardExpand all lines: src/core/prompts/tools/native-tools/browser_action.ts
+10-32Lines changed: 10 additions & 32 deletions
Original file line number
Diff line number
Diff line change
@@ -5,59 +5,37 @@ export default {
5
5
function: {
6
6
name: "browser_action",
7
7
description:
8
-
"Interact with a Puppeteer-controlled browser session. Always start by launching at a URL and always finish by closing the browser. While the browser is active, do not call any other tools. Use coordinates within the viewport to hover or click, provide text for typing, and ensure actions are grounded in the latest screenshot and console logs.",
8
+
"Interact with a browser session. Always start by launching at a URL and always finish by closing the browser. While the browser is active, do not call any other tools. Use coordinates within the viewport to hover or click, provide text for typing, and ensure actions are grounded in the latest screenshot and console logs.",
description: "URL to open when performing the launch action; must include protocol",
21
21
},
22
22
coordinate: {
23
-
type: ["object","null"],
23
+
type: ["string","null"],
24
24
description:
25
-
"Screen coordinate for hover or click actions; target the center of the desired element",
26
-
properties: {
27
-
x: {
28
-
type: "number",
29
-
description: "Horizontal pixel position within the current viewport",
30
-
},
31
-
y: {
32
-
type: "number",
33
-
description: "Vertical pixel position within the current viewport",
34
-
},
35
-
},
36
-
required: ["x","y"],
37
-
additionalProperties: false,
25
+
"Screen coordinate for hover or click actions in format 'x,y@WIDTHxHEIGHT' where x,y is the target position on the screenshot image and WIDTHxHEIGHT is the exact pixel dimensions of the screenshot image (not the browser viewport). Example: '450,203@900x600' means click at (450,203) on a 900x600 screenshot. The coordinates will be automatically scaled to match the actual viewport dimensions.",
38
26
},
39
27
size: {
40
-
type: ["object","null"],
41
-
description: "Viewport dimensions to apply when performing the resize action",
42
-
properties: {
43
-
width: {
44
-
type: "number",
45
-
description: "Viewport width in pixels",
46
-
},
47
-
height: {
48
-
type: "number",
49
-
description: "Viewport height in pixels",
50
-
},
51
-
},
52
-
required: ["width","height"],
53
-
additionalProperties: false,
28
+
type: ["string","null"],
29
+
description:
30
+
"Viewport dimensions for the resize action in format 'WIDTHxHEIGHT' or 'WIDTH,HEIGHT'. Example: '1280x800' or '1280,800'",
54
31
},
55
32
text: {
56
33
type: ["string","null"],
57
-
description: "Text to type when performing the type action",
34
+
description:
35
+
"Text to type when performing the type action, or key name to press when performing the press action (e.g., 'Enter', 'Tab', 'Escape')",
0 commit comments