docs: update browser-automation guide with agent-browser as default#61

Merged: marcusquinn merged 1 commit into main from chore/browser-automation-guide-update on Jan 12, 2026

@marcusquinn marcusquinn commented Jan 12, 2026

Summary

  • Change default tool from dev-browser to agent-browser
  • Add visual debugging section for AI self-diagnosis
  • Add comprehensive session persistence guide
  • Add debugging checklist

Key Changes

Agent-Browser as Default

agent-browser is now the recommended first choice because:

  • Zero setup (no daemon to start)
  • AI-optimized snapshot + ref pattern
  • Multi-session isolation built-in
  • CLI-first for shell scripts and CI/CD

Visual Debugging (Don't Ask User)

New section emphasizing AI should check itself before asking user:

agent-browser screenshot /tmp/debug.png  # See current state
agent-browser errors                      # Check for errors
agent-browser console                     # View console messages
agent-browser is visible @e5              # Check element state

Session Persistence

Comprehensive guide for:

  • Saving/loading auth state
  • Cookie management
  • LocalStorage/SessionStorage
  • Multi-session with shared auth
  • Injecting tokens from environment variables
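The last item, injecting a token from an environment variable, might look roughly like the sketch below. It uses only the plain `cookies set <name> <value>` form quoted later in this review thread. The variable `APP_AUTH_TOKEN`, the default cookie name, and the function `inject_token` are hypothetical names chosen for illustration, not part of the guide.

```shell
#!/bin/sh
# Sketch: inject an auth token from the environment as a cookie.
# APP_AUTH_TOKEN, the cookie name, and inject_token are hypothetical.
AB="${AB:-agent-browser}"

inject_token() {
  cookie_name="${1:-auth_token}"
  # Stop early when the token is missing instead of setting an empty cookie.
  if [ -z "${APP_AUTH_TOKEN:-}" ]; then
    echo "APP_AUTH_TOKEN is not set" >&2
    return 1
  fi
  # Plain name/value form only; no extra flags are assumed here.
  "$AB" cookies set "$cookie_name" "$APP_AUTH_TOKEN"
}
```

Failing fast on an unset variable keeps a missing secret from turning into a confusing "logged out" state later in the session.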

Debugging Checklist

Ordered checklist for self-diagnosis before asking user:

  1. Screenshot
  2. Errors
  3. Console
  4. URL check
  5. Snapshot
  6. Visibility
  7. Headed mode
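The ordered checklist above could be scripted roughly as follows. This is a minimal sketch: `screenshot`, `errors`, `console`, and `is visible` are the commands shown earlier in this description, while `get url` and `snapshot` are assumptions about the agent-browser CLI that should be checked against its help output. The headed-mode invocation is not shown in this PR, so step 7 only prints a reminder. `run_checklist`, `step`, `AB`, and `DRY_RUN` are hypothetical names.

```shell
#!/bin/sh
# Sketch of the self-diagnosis checklist. AB and DRY_RUN are hypothetical knobs.
AB="${AB:-agent-browser}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

step() {
  # $1 is a label; the remaining arguments are the command to run.
  label="$1"; shift
  echo "== $label: $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@" || echo "   (step '$label' failed; continuing)"
  fi
}

run_checklist() {
  ref="${1:-@e5}"   # element ref taken from a snapshot
  step "1 screenshot" "$AB" screenshot /tmp/debug.png
  step "2 errors"     "$AB" errors
  step "3 console"    "$AB" console
  step "4 url"        "$AB" get url      # assumed subcommand
  step "5 snapshot"   "$AB" snapshot     # assumed subcommand
  step "6 visibility" "$AB" is visible "$ref"
  echo "== 7 headed: if still stuck, re-run the failing step in headed mode"
}
```

With the default `DRY_RUN=1`, calling `run_checklist @e5` only prints the seven steps in order, which makes it easy to review the sequence before running it against a live session.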

Summary by CodeRabbit

  • Documentation
    • Enhanced browser automation guides with expanded coverage of session management, authentication state persistence, and cookie handling.
    • Improved workflow examples for multi-page interactions and form handling.
    • Added ethical guidelines section.
    • Updated command references and quick reference materials.

- Change default tool from dev-browser to agent-browser
- Add visual debugging section (screenshots, errors, console)
- Add session persistence guide (cookies, localStorage, auth state)
- Add multi-session with shared auth patterns
- Add debugging checklist for self-diagnosis
- Emphasize checking yourself before asking user
@gemini-code-assist

Summary of Changes

This pull request significantly updates the browser automation guide by establishing agent-browser as the primary tool, highlighting its advantages for AI agents and CLI-based workflows. It introduces new, crucial sections designed to enhance the autonomy and debugging capabilities of AI agents, specifically focusing on visual self-diagnosis, robust session management, and a structured troubleshooting checklist. These changes aim to streamline automation processes and reduce the need for human intervention during common issues.

Highlights

  • Default Tool Change: The recommended default browser automation tool has been switched from dev-browser to agent-browser. This change emphasizes agent-browser's benefits, including zero setup, AI-optimized snapshot/ref patterns, multi-session isolation, and a CLI-first approach.
  • Visual Debugging Guide for AI: A new section titled 'Visual Debugging (Don't Ask User - Check Yourself)' has been added. It provides AI agents with commands for self-diagnosis, such as taking screenshots, checking errors, viewing console messages, and inspecting element states, to reduce reliance on user input for troubleshooting.
  • Comprehensive Session Persistence: Detailed documentation on session persistence has been introduced, covering how to save and load authentication states, manage cookies, handle LocalStorage and SessionStorage, utilize multi-session with shared authentication, and inject tokens from environment variables.
  • Debugging Checklist: An ordered checklist is now included to guide AI agents through a structured self-diagnosis process when automation tasks fail, promoting efficient troubleshooting before escalating issues.

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 309 code smells

[INFO] Recent monitoring activity:
Mon Jan 12 01:04:50 UTC 2026: Code review monitoring started
Mon Jan 12 01:04:51 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 309
Mon Jan 12 01:04:51 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Mon Jan 12 01:04:53 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 309
  • VULNERABILITIES: 0

Generated on: Mon Jan 12 01:05:32 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request does a great job of updating the browser automation guide to establish agent-browser as the new default tool. The addition of detailed sections on visual debugging, session persistence, and a debugging checklist significantly improves the documentation's utility for developers. The refactoring of the alternative tools sections also enhances clarity and consistency. My review includes a few suggestions to improve the code examples for consistency and clarity.

````diff
 ```bash
 # Setup (one-time)
-bash ~/.aidevops/agents/scripts/dev-browser-helper.sh setup
+~/.aidevops/agents/scripts/agent-browser-helper.sh setup
````

medium

For consistency with other script executions in this document (e.g., on lines 307, 350, 370) and to ensure the script runs correctly even if it doesn't have execute permissions, it's best to explicitly call it with bash.

Suggested change:
-~/.aidevops/agents/scripts/agent-browser-helper.sh setup
+bash ~/.aidevops/agents/scripts/agent-browser-helper.sh setup
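
The reviewer's point about execute permissions can be checked with a throwaway script. The sketch below is illustrative only; `demo_exec_bit` is a hypothetical helper and the temporary file stands in for the real setup script.

```shell
#!/bin/sh
# Sketch: a script without the execute bit fails when invoked by path,
# but still runs when it is passed to bash explicitly.
demo_exec_bit() {
  tmp="$(mktemp)" || return 1
  printf '#!/bin/sh\necho setup-ok\n' > "$tmp"
  chmod 644 "$tmp"            # readable, but not executable

  if "$tmp" >/dev/null 2>&1; then
    echo "direct: ran"
  else
    echo "direct: failed"
  fi
  bash "$tmp"                 # bash only needs read access to the file
  rm -f "$tmp"
}
```

Calling the script through `bash` therefore trades a dependency on file permissions for a dependency on `bash` being on the `PATH`, which is usually the safer assumption in documentation examples.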

-# 1. Start server (if not running)
-bash ~/.aidevops/agents/scripts/dev-browser-helper.sh start
+# 1. Setup (one-time)
+~/.aidevops/agents/scripts/agent-browser-helper.sh setup

medium

Similar to other script calls in this file, it's recommended to explicitly use bash to execute this setup script for consistency and robustness.

Suggested change:
-~/.aidevops/agents/scripts/agent-browser-helper.sh setup
+bash ~/.aidevops/agents/scripts/agent-browser-helper.sh setup

Comment on lines 348 to 358
````diff
 ```bash
-# Quick setup
-bash .agent/scripts/stagehand-python-helper.sh setup
+# Setup
+bash ~/.aidevops/agents/scripts/stagehand-helper.sh setup
 
-# MCP integration
-bash .agent/scripts/setup-mcp-integrations.sh stagehand-python
+# Natural language actions
+await stagehand.act("click the login button")
+await stagehand.act("fill in the email field with user@example.com")
 
-# Run examples
-source ~/.aidevops/stagehand-python/.venv/bin/activate
-python examples/basic_example.py
-python examples/ecommerce_automation.py "wireless headphones"
+# Structured extraction
+const data = await stagehand.extract("get product prices", z.array(z.number()))
 ```
````

medium

The code block is marked as bash, but lines 353, 354, and 357 contain JavaScript syntax (await, const). This is confusing as these are not valid shell commands. To improve clarity, I suggest splitting this into two separate code blocks: a bash block for the setup command, and a javascript block to showcase the API usage examples.
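
A mismatch like this can be caught mechanically before review. The sketch below scans a Markdown file for lines inside bash-labelled fences that look like JavaScript. The `await`/`const` patterns are a rough heuristic rather than a parser, and `find_js_in_bash_fences` is a hypothetical helper, not something in the repository.

```shell
#!/bin/sh
# Sketch: report lines inside bash-labelled fences that look like JavaScript.
# The "await"/"const" patterns are a rough heuristic, not a parser.
find_js_in_bash_fences() {
  fence="$(printf '\140\140\140')"   # three backticks, spelled in octal
  awk -v f="$fence" '
    index($0, f) == 1 {
      # A fence line toggles state; remember whether the opener is a shell fence.
      if (in_fence) { in_fence = 0; is_bash = 0 }
      else { in_fence = 1; is_bash = ($0 ~ ("^" f "(bash|sh|shell)[ \t]*$")) }
      next
    }
    in_fence && is_bash && /^[ \t]*(await|const)[ \t]/ {
      printf "%d: %s\n", FNR, $0
    }
  ' "$1"
}
```

Run against the guide, for example `find_js_in_bash_fences .agent/tools/browser/browser-automation.md`, it prints the line number and text of each suspicious line and prints nothing when the fences are consistent.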

@marcusquinn marcusquinn merged commit fb83249 into main Jan 12, 2026
10 of 12 checks passed
coderabbitai bot commented Jan 12, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

Documentation overhaul for browser automation tooling, promoting agent-browser as the default tool. Adds comprehensive sections on session persistence, authentication state management, cookies, and multi-session workflows while updating all setup commands and example code blocks accordingly.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Browser Automation Documentation: `.agent/tools/browser/browser-automation.md` | Shifts default tool from dev-browser to agent-browser; restructures Visual Debugging and Self-diagnosis workflows; expands auth state management (save/load), cookies, LocalStorage/SessionStorage, and multi-session usage; replaces sample commands and code blocks with agent-browser equivalents; adds Ethical Guidelines section and multi-page workflow examples with sanitized forms/navigation patterns |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🌐 A browser reborn, agent-wise and keen,
Sessions persist through auth's green screen,
Cookies and storage, now first in line,
From dev to agent, the docs align. ✨


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca4d3c6 and f17cce8.

📒 Files selected for processing (1)
  • .agent/tools/browser/browser-automation.md

augmentcode bot commented Jan 12, 2026

🤖 Augment PR Summary

Summary: Updates the browser automation guide to make agent-browser the default tool for most automation workflows.

Changes:

  • Switches the default recommendation from dev-browser to agent-browser and updates the decision tree + quick reference accordingly.
  • Adds a “Visual Debugging” section encouraging AI self-diagnosis via screenshots, errors, console logs, and element state checks.
  • Introduces a new “Session Persistence” guide covering auth state save/load, cookies, and storage usage (including multi-session patterns).
  • Reorganizes examples to emphasize the snapshot + ref workflow and adds a step-by-step debugging checklist.

Technical Notes: Examples use --session for isolation and assume helper scripts are available under ~/.aidevops/agents/scripts/.

@augmentcode augmentcode bot left a comment

Review completed. 3 suggestions posted.

agent-browser cookies set "session_id" "abc123"

# Set cookie with options
agent-browser cookies set "auth_token" "xyz789" --domain ".example.com" --path "/" --secure

This example uses agent-browser cookies set with --domain/--path/--secure flags, but elsewhere in our docs it’s shown as cookies set <name> <val> only; if these flags aren’t supported they’ll be ignored and make debugging auth harder (also applies to the env-var cookie example below).
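
One way to guard against silently ignored options is to read the cookie list back after setting a cookie. The sketch below assumes that `agent-browser cookies` with no further arguments prints the current cookies; that behaviour is not shown in this PR and should be verified. `set_cookie_checked` is a hypothetical helper, and it checks only that the cookie name appears, not its domain, path, or secure attributes.

```shell
#!/bin/sh
# Sketch: set a cookie, then read the cookie list back to confirm it landed.
# Assumes `agent-browser cookies` (no further arguments) lists current cookies.
AB="${AB:-agent-browser}"

set_cookie_checked() {
  name="$1"; value="$2"
  "$AB" cookies set "$name" "$value" || return 1
  # Fixed-string match on the cookie name in the listing.
  if "$AB" cookies | grep -q -F -- "$name"; then
    echo "cookie '$name' present"
  else
    echo "cookie '$name' missing after set" >&2
    return 1
  fi
}
```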

````markdown
```bash
# Start new session with saved auth
agent-browser open https://app.example.com
agent-browser state load ~/.aidevops/.agent-workspace/auth/example-com.json
````

agent-browser state load is presented as applying to the current session after open + reload; if state loading actually needs to happen at session/browser creation, this workflow won’t restore auth as written (also applies to the multi-session examples).

````markdown
#### **Python Version** 🐍 **NEW**
**AI-powered browser automation with natural language control**

```bash
````

This fenced block is labeled bash but contains JS/TS (await stagehand.act(...)), so copying it into a shell will fail; consider using a javascript/ts fence to match the other Stagehand docs.
