This project provides a bridge between GPT-4 and a headless Chromium browser, allowing you to automate actions simply by describing them to the program. It takes the form of a Rust CLI, but also exports most of the internals as a library for others to use.
browser-agent is built using Rust, so you'll need to install the Rust toolchain. You can do this by following the instructions at rustup.rs.
Once you have Rust installed, you can install browser-agent by running:
cargo install browser-agent
You should also place your OpenAI API key in the OPENAI_API_KEY environment variable. This key should have access to the gpt-4 model.
You can copy the contents of the .env.example file to a .env file in the root of the project and fill in the OPENAI_API_KEY variable. The .env file is ignored by git, so you don't have to worry about accidentally committing your API key. Note, though, that .env.example is not ignored, so you should not put your key in that file.
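The two ways of supplying the key described above can be sketched as follows. This is a minimal sketch, not the project's official setup script: the key value is a placeholder, and it assumes the example file is named .env.example and that you run the commands from the project root.

```shell
# Option 1: export the key for the current shell session only.
# "sk-your-key-here" is a placeholder, not a real key.
export OPENAI_API_KEY="sk-your-key-here"

# Option 2: keep the key in a .env file in the project root.
# Assumes the example file is named .env.example; adjust if yours differs.
# The copy is skipped if .env already exists, so an existing key is not overwritten.
if [ -f .env.example ] && [ ! -f .env ]; then
    cp .env.example .env
fi
# Then open .env in an editor and fill in OPENAI_API_KEY.
```

Either option is enough on its own. Exporting the variable is convenient for a quick try, while the .env file persists across shell sessions.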
Usage: browser-agent [OPTIONS] <GOAL>

Arguments:
  <GOAL>  The goal for the agent to achieve

Options:
      --visual                Whether to show the browser window. Warning: this makes the agent more unreliable
  -v...                       Set the verbosity level, can be used multiple times
      --include-page-content  Whether to include text from the page in the prompt
  -h, --help                  Print help
  -V, --version               Print version
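As an illustration of how the options above combine, here is one possible invocation. The goal text is a made-up example, and the guard around the call is only there so the snippet does nothing when browser-agent is not on your PATH. Only flags listed in the help output are used.

```shell
# Hypothetical goal, described in plain language and passed as a single quoted argument.
GOAL="Find the title of the top story on Hacker News"

# Skip the call if the binary has not been installed yet.
if command -v browser-agent >/dev/null 2>&1; then
    # --visual shows the browser window (less reliable, per the help text),
    # -vv raises verbosity by two levels,
    # --include-page-content adds page text to the prompt.
    browser-agent --visual -vv --include-page-content "$GOAL"
fi
```

Quoting the goal matters: without the quotes, the shell would split it into several arguments and only the first word would be taken as <GOAL>.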
This project was inspired by, and builds on top of, Nat Friedman's natbot experiment.
This project is licensed under the MIT license. See LICENSE for more details.