feat: add agent-browser support for headless browser automation CLI#59
feat: add agent-browser support for headless browser automation CLI#59marcusquinn merged 1 commit intomainfrom
Conversation
- Add agent-browser.md subagent documentation with full CLI reference - Add agent-browser-helper.sh for setup and management - Update browser-automation.md tool selection guide - Update AGENTS.md subagent folder references agent-browser is a Vercel Labs project providing: - Fast Rust CLI with Node.js fallback - Snapshot + ref pattern optimized for AI agents - Multi-session browser isolation - Cross-platform support (macOS, Linux, Windows)
WalkthroughThis pull request introduces agent-browser tooling documentation and a Bash helper script. Changes include adding agent-browser to the browser tools roster in AGENTS.md, creating comprehensive CLI documentation for agent-browser in a new markdown file, updating the browser automation guide to reference agent-browser, and introducing a setup/management script with installation, verification, and operational utilities. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @marcusquinn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly expands the project's browser automation capabilities by integrating Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Mon Jan 12 00:19:58 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Code Review
This pull request introduces support for agent-browser, a headless browser automation tool, by adding a new helper script, comprehensive documentation, and updating existing guides. The changes are well-structured and the documentation is clear. My review focuses on improving the new agent-browser-helper.sh script's robustness and maintainability by addressing potential issues flagged by ShellCheck, such as unquoted variables and unsafe loop constructs. I've also suggested a correction in the browser-automation.md file to ensure the setup instructions for agent-browser are complete and consistent with other tools.
| @@ -0,0 +1,311 @@ | |||
| #!/bin/bash | |||
| # shellcheck disable=SC2034,SC2155,SC2317,SC2329,SC2016,SC2181,SC1091,SC2154,SC2015,SC2086,SC2129,SC2030,SC2031,SC2119,SC2120,SC2001,SC2162,SC2088,SC2089,SC2090,SC2029,SC2006,SC2153 | |||
There was a problem hiding this comment.
Disabling a large number of ShellCheck warnings globally can hide real bugs and makes the script harder to maintain. This is especially concerning for warnings like SC2086 (unquoted variables) which point to potential robustness issues. It's better to disable specific warnings on a case-by-case basis with a comment explaining why, or preferably, to fix the underlying code to be compliant.
| local with_deps="" | ||
| if [[ "$(uname)" == "Linux" ]]; then | ||
| print_info "Linux detected - installing with system dependencies" | ||
| with_deps="--with-deps" | ||
| fi | ||
|
|
||
| if agent-browser install $with_deps; then |
There was a problem hiding this comment.
The $with_deps variable is unquoted, which can lead to unexpected behavior due to word splitting and globbing if the variable contained spaces. While it works in this specific case, it's not a robust practice. Using an array for optional arguments is a safer way to build commands.
| local with_deps="" | |
| if [[ "$(uname)" == "Linux" ]]; then | |
| print_info "Linux detected - installing with system dependencies" | |
| with_deps="--with-deps" | |
| fi | |
| if agent-browser install $with_deps; then | |
| local install_args=() | |
| if [[ "$(uname)" == "Linux" ]]; then | |
| print_info "Linux detected - installing with system dependencies" | |
| install_args+=("--with-deps") | |
| fi | |
| if agent-browser install "${install_args[@]}"; then |
| local sessions | ||
| sessions=$(agent-browser session list 2>/dev/null | grep -E '^\s*\w+' | awk '{print $1}') | ||
|
|
||
| if [[ -z "$sessions" ]]; then | ||
| print_info "No active sessions to close" | ||
| return 0 | ||
| fi | ||
|
|
||
| for session in $sessions; do | ||
| print_info "Closing session: $session" | ||
| AGENT_BROWSER_SESSION="$session" agent-browser close 2>/dev/null || true | ||
| done |
There was a problem hiding this comment.
The current method of parsing session names and iterating over them is not robust and will fail if session names contain spaces or other special characters. Using mapfile to read the session names into an array and then iterating over the array is a safer and more robust approach. This also allows for combining the grep and awk commands for better efficiency.
| local sessions | |
| sessions=$(agent-browser session list 2>/dev/null | grep -E '^\s*\w+' | awk '{print $1}') | |
| if [[ -z "$sessions" ]]; then | |
| print_info "No active sessions to close" | |
| return 0 | |
| fi | |
| for session in $sessions; do | |
| print_info "Closing session: $session" | |
| AGENT_BROWSER_SESSION="$session" agent-browser close 2>/dev/null || true | |
| done | |
| local sessions_array=() | |
| mapfile -t sessions_array < <(agent-browser session list 2>/dev/null | awk '/^\s*\w+/ {print $1}') | |
| if [[ ${#sessions_array[@]} -eq 0 ]]; then | |
| print_info "No active sessions to close" | |
| return 0 | |
| fi | |
| for session in "${sessions_array[@]}"; do | |
| print_info "Closing session: $session" | |
| AGENT_BROWSER_SESSION="$session" agent-browser close 2>/dev/null || true | |
| done |
| | Tool | Best For | Setup | | ||
| |------|----------|-------| | ||
| | **dev-browser** (DEFAULT) | Dev testing, multi-step workflows | `dev-browser-helper.sh setup` | | ||
| | **agent-browser** | CLI-first, shell scripts, CI/CD, multi-session | `npm i -g agent-browser` | |
There was a problem hiding this comment.
The setup command for agent-browser is incomplete. It only installs the CLI via npm but misses the step to install the required Chromium browser. For consistency with other tools in this table (like dev-browser), it would be better to point to the new helper script's setup command, which performs a full installation.
| | **agent-browser** | CLI-first, shell scripts, CI/CD, multi-session | `npm i -g agent-browser` | | |
| | **agent-browser** | CLI-first, shell scripts, CI/CD, multi-session | `agent-browser-helper.sh setup` | |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.agent/scripts/agent-browser-helper.sh:
- Around line 8-10: The script currently unconditionally sources
shared-constants.sh using SCRIPT_DIR and source, which will produce an unclear
error if the file is missing; update the block to verify the file exists (test
-f or [[ -f ]]) at "${SCRIPT_DIR}/shared-constants.sh" before calling source,
and if it does not exist emit a clear error via echo or logger and exit
non‑zero; keep references to SCRIPT_DIR and the filename so the check and error
message point to the exact missing file.
- Around line 57-66: The check_chromium function currently calls the
agent-browser CLI unconditionally; first verify the agent-browser command exists
(e.g., via command -v or type) and if missing call print_warning like
"agent-browser CLI not found" and return non-zero, otherwise proceed to run
agent-browser install --check as before and handle its exit status; update
check_chromium to perform this pre-check so missing CLI yields a clear warning
instead of a "command not found" error.
🧹 Nitpick comments (4)
.agent/scripts/agent-browser-helper.sh (4)
1-2: Blanket ShellCheck suppressions mask potential issues.Disabling 18+ ShellCheck rules globally is excessive. Several of these rules catch real bugs (SC2086 for word splitting, SC2181 for exit code anti-patterns, SC1091 for missing source files). Consider using targeted inline directives only where genuinely needed, rather than a blanket suppression at the top.
98-105: Consider explicit handling for optional flag.The unquoted
$with_depson line 105 relies on word-splitting to handle the empty case. While intentional, this pattern is clearer with explicit conditional invocation.♻️ Optional improvement
- # Detect platform for deps - local with_deps="" - if [[ "$(uname)" == "Linux" ]]; then - print_info "Linux detected - installing with system dependencies" - with_deps="--with-deps" - fi - - if agent-browser install $with_deps; then + # Detect platform for deps + local install_cmd=("agent-browser" "install") + if [[ "$(uname)" == "Linux" ]]; then + print_info "Linux detected - installing with system dependencies" + install_cmd+=("--with-deps") + fi + + if "${install_cmd[@]}"; then
166-186: Session parsing is fragile and failures are silently swallowed.The grep/awk pipeline on line 172 depends on the exact output format of
agent-browser session list. If this format changes, sessions won't be closed. Additionally, using|| trueon line 181 silently ignores close failures—consider at least logging when a close fails.♻️ Suggested improvement
for session in $sessions; do print_info "Closing session: $session" - AGENT_BROWSER_SESSION="$session" agent-browser close 2>/dev/null || true + if ! AGENT_BROWSER_SESSION="$session" agent-browser close 2>/dev/null; then + print_warning "Failed to close session: $session" + fi done
188-221: Demo lacks error handling for intermediate steps.If
agent-browser open(line 199) fails, subsequent commands will produce confusing errors. For a robust demo experience, check each step's exit status and bail early with clear feedback.♻️ Suggested improvement
print_info "1. Opening example.com..." - agent-browser open https://example.com + if ! agent-browser open https://example.com; then + print_error "Failed to open page - aborting demo" + return 1 + fi echo "" print_info "2. Getting snapshot (accessibility tree)..." - agent-browser snapshot -i + if ! agent-browser snapshot -i; then + print_warning "Snapshot failed - continuing demo" + fiApply similar checks to subsequent steps as appropriate.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.agent/AGENTS.md.agent/scripts/agent-browser-helper.sh.agent/tools/browser/agent-browser.md.agent/tools/browser/browser-automation.md
🧰 Additional context used
📓 Path-based instructions (1)
.agent/scripts/*.sh
⚙️ CodeRabbit configuration file
.agent/scripts/*.sh: Automation scripts - focus on:
- Reliability and robustness
- Clear logging and feedback
- Proper exit codes
- Error recovery mechanisms
Files:
.agent/scripts/agent-browser-helper.sh
🧠 Learnings (10)
📓 Common learnings
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T04:34:27.158Z
Learning: All instructions, documentation, and operational guidance should be maintained in AGENTS.md as the single source of truth
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-29T04:34:30.742Z
Learning: Maintain all instructions, documentation, and operational guidance in AGENTS.md as the single source of truth
📚 Learning: 2025-11-29T04:34:42.033Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: AGENT.md:0-0
Timestamp: 2025-11-29T04:34:42.033Z
Learning: Reference AGENTS.md (authoritative) instead of AGENT.md for AI assistant guidance
Applied to files:
.agent/tools/browser/browser-automation.md.agent/AGENTS.md.agent/tools/browser/agent-browser.md
📚 Learning: 2025-11-29T04:34:30.742Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-29T04:34:30.742Z
Learning: Reference AGENTS.md for authoritative AI assistant guidance instead of GEMINI.md
Applied to files:
.agent/tools/browser/browser-automation.md.agent/AGENTS.md.agent/tools/browser/agent-browser.md
📚 Learning: 2026-01-06T15:57:56.008Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T15:57:56.008Z
Learning: Applies to **/AGENTS.md : Use progressive disclosure in AGENTS.md with pointers to subagents rather than inline content
Applied to files:
.agent/AGENTS.md
📚 Learning: 2025-11-29T04:34:27.158Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T04:34:27.158Z
Learning: All instructions, documentation, and operational guidance should be maintained in AGENTS.md as the single source of truth
Applied to files:
.agent/tools/browser/agent-browser.md
📚 Learning: 2025-11-29T04:34:30.742Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: GEMINI.md:0-0
Timestamp: 2025-11-29T04:34:30.742Z
Learning: Maintain all instructions, documentation, and operational guidance in AGENTS.md as the single source of truth
Applied to files:
.agent/tools/browser/agent-browser.md
📚 Learning: 2025-11-29T04:34:42.033Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: AGENT.md:0-0
Timestamp: 2025-11-29T04:34:42.033Z
Learning: Maintain all AI assistant instructions, documentation, and operational guidance in AGENTS.md as the single source of truth
Applied to files:
.agent/tools/browser/agent-browser.md
📚 Learning: 2026-01-06T15:57:56.007Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T15:57:56.007Z
Learning: Applies to **/setup.sh : Deploy agents locally using ./setup.sh script
Applied to files:
.agent/scripts/agent-browser-helper.sh
📚 Learning: 2026-01-06T15:57:56.007Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T15:57:56.007Z
Learning: Applies to **/.agent/scripts/version-manager.sh : Use .agent/scripts/version-manager.sh release [major|minor|patch] for releases
Applied to files:
.agent/scripts/agent-browser-helper.sh
📚 Learning: 2026-01-06T15:57:56.007Z
Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T15:57:56.007Z
Learning: Applies to **/.agent/scripts/linters-local.sh : Run quality checks before committing using .agent/scripts/linters-local.sh
Applied to files:
.agent/scripts/agent-browser-helper.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (19)
.agent/tools/browser/browser-automation.md (3)
46-47: LGTM - Clean integration into decision tree.The CLI-first branch is appropriately positioned and aligns with agent-browser's primary use case. The placement between dev-browser and Playwriter follows a logical hierarchy (default → CLI → existing sessions).
60-60: LGTM - Quick reference entry is complete and accurate.The table entry concisely captures agent-browser's key characteristics (CLI-first, shell scripts, CI/CD, multi-session) and provides the correct installation command.
66-66: LGTM - Documentation reference updated correctly.The full docs line now includes
agent-browser.md, maintaining consistency with the existing pattern and providing users a clear path to detailed documentation..agent/AGENTS.md (1)
383-383: LGTM - Subagent roster updated correctly.agent-browser is now listed as a key subagent for browser automation. The placement at the beginning of the list aligns with the decision tree prioritization shown in browser-automation.md (dev-browser as default, agent-browser as CLI-first alternative).
.agent/tools/browser/agent-browser.md (11)
1-13: LGTM - Front matter configuration is correct.The subagent metadata follows the established pattern with appropriate tool permissions for a documentation-focused subagent (read-only with bash for CLI execution).
46-70: LGTM - Installation instructions follow Playwright patterns.The installation steps are well-structured with npm, Linux dependencies, and source build options. The
npx playwright install-deps chromiumcommand is valid Playwright syntax. Verification of theagent-browser installcommand is covered by the previous verification request.
72-118: LGTM - AI workflow pattern is well-documented.The snapshot + ref pattern is a strong approach for LLM-driven automation, providing deterministic element targeting similar to dev-browser's ARIA snapshots. The progressive disclosure of snapshot options (-i, -c, -d, -s) with clear examples aids adoption.
120-342: LGTM - Comprehensive CLI reference with clear examples.The command documentation is thorough and well-organized into logical categories. The syntax follows consistent patterns, and examples demonstrate practical usage. The commands align with Playwright's API surface, which is expected for a Playwright-based daemon.
344-362: LGTM - Agent and headed modes documented clearly.The
--jsonflag for machine-readable output and--headedfor debugging are standard patterns in headless browser tools. The JSON structure example is helpful for integration.
364-372: Architecture description is clear and logical.The client-daemon architecture with Rust CLI and Node.js daemon is well-explained. The persistent daemon design makes sense for fast subsequent operations. Verification of these architectural claims is covered by the earlier repository check.
374-382: LGTM - Platform support table is clear.The platform support matrix explicitly documents native Rust binary availability and Node.js fallback strategy. Windows using Node.js-only is clearly indicated, setting appropriate expectations.
384-406: LGTM - Tool comparison provides clear differentiation.The comparison table and "When to Use" guidance help users select the right tool for their use case. The characterizations align with the documentation for dev-browser, Playwriter, and Stagehand referenced elsewhere in the framework.
408-458: LGTM - Common patterns demonstrate practical workflows.The four examples (login, form submission, data extraction, multi-session) cover typical automation scenarios. The consistent use of snapshot + ref pattern reinforces the recommended workflow throughout the documentation.
460-464: LGTM - Resource links and metadata provided.The resources section provides the GitHub repository, license, and language composition. The language breakdown (74% TypeScript, 22% Rust) aligns with the documented architecture. Repository verification was requested earlier.
19-24: All documentation references are accurate and current. The GitHub repository (https://github.com/vercel-labs/agent-browser) exists and is active, the npm package (agent-browser v0.4.0) is available, and the installation instructions are correct. The architecture description ("Fast Rust CLI with Node.js fallback, Playwright-based daemon") is confirmed in the official README..agent/scripts/agent-browser-helper.sh (4)
19-42: LGTM!Clean helper functions with consistent formatting. Good practice directing error output to stderr.
115-136: LGTM!Solid setup flow with proper error propagation and helpful quick-start guidance. Clean implementation.
223-271: LGTM!Comprehensive and well-structured help documentation with clear examples covering both helper commands and direct CLI usage.
273-311: LGTM!Clean command dispatch pattern with sensible defaults and proper exit code on unknown commands.
| # Source shared constants and functions | ||
| SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | ||
| source "${SCRIPT_DIR}/shared-constants.sh" |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add existence check before sourcing external file.
If shared-constants.sh is missing, the script will fail with an unclear error. Add a guard for better error recovery and user feedback.
🛠️ Proposed fix
# Source shared constants and functions
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+if [[ ! -f "${SCRIPT_DIR}/shared-constants.sh" ]]; then
+ echo "[ERROR] Missing required file: ${SCRIPT_DIR}/shared-constants.sh" >&2
+ exit 1
+fi
source "${SCRIPT_DIR}/shared-constants.sh"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Source shared constants and functions | |
| SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | |
| source "${SCRIPT_DIR}/shared-constants.sh" | |
| # Source shared constants and functions | |
| SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | |
| if [[ ! -f "${SCRIPT_DIR}/shared-constants.sh" ]]; then | |
| echo "[ERROR] Missing required file: ${SCRIPT_DIR}/shared-constants.sh" >&2 | |
| exit 1 | |
| fi | |
| source "${SCRIPT_DIR}/shared-constants.sh" |
🤖 Prompt for AI Agents
In @.agent/scripts/agent-browser-helper.sh around lines 8 - 10, The script
currently unconditionally sources shared-constants.sh using SCRIPT_DIR and
source, which will produce an unclear error if the file is missing; update the
block to verify the file exists (test -f or [[ -f ]]) at
"${SCRIPT_DIR}/shared-constants.sh" before calling source, and if it does not
exist emit a clear error via echo or logger and exit non‑zero; keep references
to SCRIPT_DIR and the filename so the check and error message point to the exact
missing file.
| # Check if Chromium is installed | ||
| check_chromium() { | ||
| if agent-browser install --check 2>/dev/null; then | ||
| print_success "Chromium browser is installed" | ||
| return 0 | ||
| else | ||
| print_warning "Chromium browser not found" | ||
| return 1 | ||
| fi | ||
| } |
There was a problem hiding this comment.
check_chromium assumes agent-browser is installed.
This function calls agent-browser install --check without first verifying the CLI is present. If agent-browser isn't installed, the user gets a confusing "command not found" error rather than a helpful message.
🛠️ Proposed fix
# Check if Chromium is installed
check_chromium() {
+ if ! command -v agent-browser &> /dev/null; then
+ print_warning "Cannot check Chromium - agent-browser not installed"
+ return 1
+ fi
if agent-browser install --check 2>/dev/null; then
print_success "Chromium browser is installed"
return 0🤖 Prompt for AI Agents
In @.agent/scripts/agent-browser-helper.sh around lines 57 - 66, The
check_chromium function currently calls the agent-browser CLI unconditionally;
first verify the agent-browser command exists (e.g., via command -v or type) and
if missing call print_warning like "agent-browser CLI not found" and return
non-zero, otherwise proceed to run agent-browser install --check as before and
handle its exit status; update check_chromium to perform this pre-check so
missing CLI yields a clear warning instead of a "command not found" error.
🤖 Augment PR SummarySummary: Adds Changes:
Technical Notes: Highlights the snapshot + ref pattern for deterministic element targeting, multi-session isolation via 🤖 Was this summary useful? React with 👍 or 👎 |
|
|
||
| # Check if Chromium is installed | ||
| check_chromium() { | ||
| if agent-browser install --check 2>/dev/null; then |
| print_info "5. Closing browser..." | ||
| agent-browser close | ||
|
|
||
| print_success "Demo complete!" |
There was a problem hiding this comment.
| | Tool | Best For | Setup | | ||
| |------|----------|-------| | ||
| | **dev-browser** (DEFAULT) | Dev testing, multi-step workflows | `dev-browser-helper.sh setup` | | ||
| | **agent-browser** | CLI-first, shell scripts, CI/CD, multi-session | `npm i -g agent-browser` | |



Summary
What is agent-browser?
agent-browser is a Vercel Labs project providing headless browser automation CLI optimized for AI agents:
--session--jsonoutput for machine parsingKey Features
Files Changed
.agent/tools/browser/agent-browser.md.agent/scripts/agent-browser-helper.sh.agent/tools/browser/browser-automation.md.agent/AGENTS.mdTesting
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.