Implementation of the WebShop environment and search agents for the paper:
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Shunyu Yao*, Howard Chen*, John Yang, Karthik Narasimhan
This repository contains code for reproducing results. If you find this work useful in your research, please cite:
@inproceedings{yao2022webshop,
bibtex_show = {true},
title = {WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents},
author = {Yao, Shunyu and Chen, Howard and Yang, John and Narasimhan, Karthik},
booktitle = {ArXiv},
year = {preprint},
html = {https://arxiv.org/abs/2207.01206},
tag = {NLP}
}
WebShop is a simulated e-commerce website environment with 1.18 million real-world products and 12,087 crowd-sourced text instructions. In this environment, an agent needs to navigate multiple types of webpages and issue diverse actions to find, customize, and purchase a product given an instruction. WebShop provides several challenges including understanding compositional instructions, query (re-)formulation, dealing with noisy text in webpages, and performing strategic exploration.
Hugging Face Demo: Devise your own natural language query for a product and ask an agent trained with WebShop to find it on Amazon or eBay, deployed as a 🤗 Hugging Face Space here!
Our code is implemented in Python. To set up, do the following:
- Install Python 3.8.13
- Install Java
- Download the source code:
> git clone https://github.com/princeton-nlp/webshop.git webshop
- Create a virtual environment using Anaconda and activate it
> conda create -n webshop python=3.8.13
> conda activate webshop
- Install requirements into the webshop virtual environment via the setup.sh script
> ./setup.sh [-d small|all]
The setup script performs several actions in the following order:
- Installs Python dependencies listed in requirements.txt
- Downloads product and instruction data for populating WebShop
- Downloads the spaCy en_core_web_lg model
- Constructs the search engine index from the product and instruction data
- Downloads 50 randomly chosen trajectories generated by MTurk workers
The -d flag argument allows you to specify whether you would like to pull the entire product + instruction data set (-d all) or a subset of 1,000 random products (-d small).
- By default, WebShop loads only 1,000 products for a faster environment preview. To load all products, change web_agent_site/utils.py:
# DEFAULT_ATTR_PATH = join(BASE_DIR, '../data/items_ins_v2_1000.json')
# DEFAULT_FILE_PATH = join(BASE_DIR, '../data/items_shuffle_1000.json')
DEFAULT_ATTR_PATH = join(BASE_DIR, '../data/items_ins_v2.json')
DEFAULT_FILE_PATH = join(BASE_DIR, '../data/items_shuffle.json')
- (Optional) Download the ResNet image feature files here and put them into data/ to run models that require image features.
- (Optional) Human demonstration data can be downloaded here.
The WebShop environment can be rendered in two modes, html and simple, each of which offers a different observation space. The simple mode strips away the extraneous metadata that the html mode includes, which makes model training and evaluation easier.
Launch the WebShop webpage:
> ./run_dev.sh
The site should then be viewable in the browser. Go to http://localhost:3000/ABC, where you should land on the search home page with a random instruction.
Navigating the website will automatically generate a corresponding trajectory file in the user_session_logs/mturk folder. Each file corresponds to a single instruction/web session, and each step of the file corresponds to a single action (i.e. search[...], click[...]).
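Actions are written as a name followed by an argument in square brackets, such as search[...] and click[...]. When post-processing actions it can help to split them into these two parts. The following is a minimal sketch of one way to do that; the names Action, parse_action, and format_action are illustrative and are not part of the WebShop code base.

```python
import re
from typing import NamedTuple, Optional

# Matches strings of the form name[argument], e.g. "search[shoes]".
_ACTION_RE = re.compile(r"^(?P<name>\w+)\[(?P<arg>.*)\]$", re.DOTALL)


class Action(NamedTuple):
    """Hypothetical container for a parsed action string."""
    name: str      # e.g. "search" or "click"
    argument: str  # text between the square brackets


def parse_action(text: str) -> Optional[Action]:
    """Return the parsed action, or None if text is not name[argument]."""
    match = _ACTION_RE.match(text.strip())
    if match is None:
        return None
    return Action(match.group("name"), match.group("arg"))


def format_action(name: str, argument: str) -> str:
    """Build an action string such as search[shoes]."""
    return f"{name}[{argument}]"


if __name__ == "__main__":
    print(parse_action("search[shoes]"))
    print(format_action("click", "search"))
```

Returning None for malformed input, rather than raising, lets a caller skip lines that are not actions.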
The current WebShop build comes with two flags:
- --log: Include this flag to create a trajectory.jsonl log file of actions on WebShop
- --attrs: Include this flag to display an Attributes tab on the item_page of WebShop
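To inspect the trajectory.jsonl file produced by the --log flag, a generic reader is often enough. The sketch below assumes only that the file is in JSON Lines format (one JSON value per line), as its extension suggests. The schema of each step is not documented here, so the code makes no assumption about field names and simply reports which top-level keys occur. The helper names iter_steps and summarize are illustrative.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def iter_steps(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from error


def summarize(path: str) -> Dict[str, int]:
    """Count how often each top-level key appears across all steps."""
    counts: Dict[str, int] = {}
    for step in iter_steps(path):
        if isinstance(step, dict):
            for key in step:
                counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python inspect_log.py path/to/trajectory.jsonl
    if len(sys.argv) > 1:
        print(summarize(sys.argv[1]))
```

Running summarize on a log is a quick way to discover the fields that are actually present before writing any code that depends on them.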
The simple mode of the WebShop environment is packaged and readily available as an OpenAI gym environment. The gym definitions of the text environment can be found in the web_agent_site/envs folder.
To start using the gym and building agents that interact with the WebShop environment, include the following statements in your Python file:
import gym
from web_agent_site.envs import WebAgentTextEnv
env = gym.make('WebAgentTextEnv-v0', observation_mode='text', num_products=...)
Now, you can write your own agent that interacts with the environment via the standard OpenAI gym interface.
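As a rough illustration of such an agent, the sketch below runs one episode with a simple policy. It rests on several assumptions. The environment is assumed to follow the classic gym convention in which step returns (observation, reward, done, info). It is also assumed to expose a method, here called get_available_actions, that returns a dict with has_search_bar and clickables keys. StubEnv is a hypothetical stand-in, with an invented "buy now" button and invented rewards, so that the example runs without WebShop installed. It is not part of the repository.

```python
import random
from typing import Any, Dict, Tuple


def choose_action(available: Dict[str, Any], rng: random.Random) -> str:
    """Search when a search bar is present, otherwise click a random element."""
    if available.get("has_search_bar"):
        # Placeholder query; a real agent would derive it from the instruction.
        return "search[shoes]"
    clickables = list(available.get("clickables", []))
    if not clickables:
        raise ValueError("no action available")
    return f"click[{rng.choice(clickables)}]"


class StubEnv:
    """Hypothetical two-page environment used only to make the sketch runnable."""

    def __init__(self) -> None:
        self._page = "search"

    def reset(self) -> str:
        self._page = "search"
        return "search page"

    def get_available_actions(self) -> Dict[str, Any]:
        if self._page == "search":
            return {"has_search_bar": True, "clickables": ["search"]}
        return {"has_search_bar": False, "clickables": ["buy now"]}  # invented label

    def step(self, action: str) -> Tuple[str, float, bool, Dict[str, Any]]:
        if self._page == "search":
            self._page = "results"
            return "results page", 0.0, False, {}
        return "episode over", 1.0, True, {}  # invented reward


def run_episode(env: Any, max_steps: int = 100, seed: int = 0) -> float:
    """Act until the episode ends or max_steps is reached; return total reward."""
    rng = random.Random(seed)
    env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = choose_action(env.get_available_actions(), rng)
        _, reward, done, _ = env.step(action)
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    print(run_episode(StubEnv()))
```

To try this against the real environment, pass the env object created by gym.make above in place of StubEnv(), after checking that its method names and return values match these assumptions.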
Examples of a RandomPolicy agent interacting with the WebShop environment in both html and simple mode can be found in the run_envs folder. To run these examples locally, run the run_web_agent_text_env.sh or run_web_agent_site_env.sh script:
> ./run_web_agent_text_env.sh
Products loaded.
Keys Cleaned.
Attributes Loaded.
100%|██████████████████| 1000/1000
Loaded 6910 goals.
Amazon Shopping Game [SEP] Instruction: [SEP] Find me slim f...
Available actions: {'has_search_bar': True, 'clickables': ['search']}
Taking action "search[shoes]" -> Reward = 0.0
...
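In the sample output above, the text observation appears as segments joined by [SEP]. The following sketch splits such a string and pulls out the segment that follows "Instruction:". This layout is inferred from the single truncated log line shown here, so treat it as an assumption rather than a specification of the observation format. The function names are illustrative.

```python
from typing import List, Optional


def split_observation(observation: str, separator: str = "[SEP]") -> List[str]:
    """Split a text observation on the separator and trim each segment."""
    segments = (segment.strip() for segment in observation.split(separator))
    return [segment for segment in segments if segment]


def extract_instruction(observation: str) -> Optional[str]:
    """Return the segment after 'Instruction:', or None if it is absent."""
    segments = split_observation(observation)
    for index, segment in enumerate(segments[:-1]):
        if segment.rstrip(":").lower() == "instruction":
            return segments[index + 1]
    return None


if __name__ == "__main__":
    sample = "Amazon Shopping Game [SEP] Instruction: [SEP] Find me slim f..."
    print(split_observation(sample))
    print(extract_instruction(sample))
```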
To run the run_web_agent_site_env.sh script, you must download a version of ChromeDriver compatible with your Chrome browser version. Once you have downloaded and unzipped the executable, rename it chromedriver and place it in the webshop/envs folder.
To run baseline models (rule, IL, RL, IL+RL) from the paper, please refer to the README.md in the baseline_models folder.
To read more about how the sim-to-real transfer of agents trained on WebShop to other environments works, please refer to the README.md in the transfer folder.
We would love to hear from the broader NLP and Machine Learning community, and we welcome any contributions, pull requests, or issues! To do so, please either file a new pull request or issue and fill in the corresponding templates accordingly. We'll be sure to follow up shortly!
Check LICENSE.md