Member

@sameelarif sameelarif commented Mar 16, 2025

why

Naturally, we want to provide an agent that uses Stagehand's native capabilities. This PR adds a default agent that doesn't rely on computer-use models.

what changed

Added a StagehandOperatorHandler class which provides the logic for action planning and execution.
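
For readers who haven't opened the diff, here is a rough sketch of what a plan-then-execute loop of this kind can look like. It is not the code from this PR: `Step`, `Planner`, `Executor`, `runOperatorLoop`, and the `maxSteps` default are all hypothetical names and values chosen for illustration, and the planner and executor are stand-ins for the LLM call and the browser primitives.

```typescript
// Hypothetical sketch only: none of these names come from the PR's diff.
type Step =
  | { method: "act"; instruction: string }
  | { method: "extract"; instruction: string }
  | { method: "close"; reason: string };

interface StepRecord {
  step: Step;
  result: string;
}

// Stand-in for the LLM call that picks the next step from the goal and history.
type Planner = (goal: string, history: StepRecord[]) => Promise<Step>;
// Stand-in for running a step against the page and returning its result.
type Executor = (step: Step) => Promise<string>;

interface LoopResult {
  completed: boolean;
  message: string;
  history: StepRecord[];
}

async function runOperatorLoop(
  goal: string,
  plan: Planner,
  execute: Executor,
  maxSteps = 10, // assumed default, not taken from the PR
): Promise<LoopResult> {
  const history: StepRecord[] = [];
  for (let i = 0; i < maxSteps; i++) {
    // The planner sees every earlier step together with its result.
    const step = await plan(goal, history);
    if (step.method === "close") {
      return { completed: true, message: step.reason, history };
    }
    const result = await execute(step);
    history.push({ step, result });
  }
  // The step budget ran out before the planner chose to close.
  return { completed: false, message: "Reached the step limit", history };
}
```

Passing the accumulated history back into the planner is one way to give each prompt the result of the previous method, and the `close` step gives the model an explicit way to end the run with a final message.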

test plan

E2E

TODO

  • Export response schemas from types.
  • Finalize which primitives are exposed to the API.
  • If the goal of the agent is data extraction, return the extracted data in the AgentResult["message"] field.
  • Provide the result of the previous method call in the next prompt.

changeset-bot bot commented Mar 16, 2025

🦋 Changeset detected

Latest commit: f2acb14

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@browserbasehq/stagehand Minor

@sameelarif sameelarif marked this pull request as ready for review March 16, 2025 17:22
Contributor

@kamath kamath left a comment

Code generally lgtm from reviewing on my phone, will review properly later

private stagehandPage: StagehandPage;
private logger: (message: LogLine) => void;
private llmClient: LLMClient;
messages: ChatMessage[];
Contributor

In a fast follow PR, let's use stagehand.history here

Collaborator

@miguelg719 miguelg719 left a comment

This is great

@kamath kamath merged commit c57dc19 into main Mar 18, 2025
15 checks passed
@kamath kamath deleted the sarif/stg-182-native-oo-agent-loop-in-addition-to-cua branch March 18, 2025 07:31
@github-actions github-actions bot mentioned this pull request Mar 17, 2025
},
{
type: "image_url",
image_url: { url: `data:image/png;base64,${base64Image}` },
Collaborator

anthropic's format:

messages: [
      {
          "role": "user",
          "content": [
              {
                  "type": "image",
                  "source": {
                      "type": "base64",
                      "media_type": image_media_type,
                      "data": image_data,
                  },
              },
...
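
A minimal sketch of bridging the two shapes, assuming the screenshot is always passed as a base64 data URL as in the snippet under review. `OpenAIImagePart`, `AnthropicImageBlock`, and `toAnthropicImage` are hypothetical names, not part of Stagehand or either SDK; only the field layout is taken from the two snippets above.

```typescript
// Hypothetical helper: converts an OpenAI-style image_url part into the
// Anthropic-style base64 image block quoted in the review comment.
interface OpenAIImagePart {
  type: "image_url";
  image_url: { url: string };
}

interface AnthropicImageBlock {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
}

function toAnthropicImage(part: OpenAIImagePart): AnthropicImageBlock {
  const prefix = "data:";
  const marker = ";base64,";
  const url = part.image_url.url;
  const markerIndex = url.indexOf(marker);
  // Only data URLs of the form data:<media type>;base64,<payload> are handled.
  if (!url.startsWith(prefix) || markerIndex === -1) {
    throw new Error("Expected a base64 data URL");
  }
  return {
    type: "image",
    source: {
      type: "base64",
      media_type: url.slice(prefix.length, markerIndex),
      data: url.slice(markerIndex + marker.length),
    },
  };
}
```

Keeping the conversion in one small function would let the handler build messages in a single format and adapt them per provider at the client boundary, though where that adaptation belongs is a design choice for the follow-up.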

kamath added a commit that referenced this pull request Mar 25, 2025
* operator handler

* changeset

* Update young-dots-fry.md

* better task memory & cleaner code

* provide extraction result in reasoning

* remove action log

* make agent config optional

* increase max steps

* update close logic

* add operator example

* made handler messages private

* update operator (#596)

---------

Co-authored-by: Anirudh Kamath
4 participants