This is the repository for our arXiv preprint Agent-Driver [Project Page].
Note: Running Agent-Driver requires an OpenAI API account.
Human-level driving is an ultimate goal of autonomous driving. Conventional approaches formulate autonomous driving as a perception-prediction-planning framework, yet their systems do not capitalize on the inherent reasoning ability and experiential knowledge of humans. In this paper, we propose a fundamental paradigm shift from current pipelines, exploiting Large Language Models (LLMs) as a cognitive agent to integrate human-like intelligence into autonomous driving systems. Our approach, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library accessible via function calls, a cognitive memory of common sense and experiential knowledge for decision-making, and a reasoning engine capable of chain-of-thought reasoning, task planning, motion planning, and self-reflection. Powered by LLMs, our Agent-Driver is endowed with intuitive common sense and robust reasoning capabilities, thus enabling a more nuanced, human-like approach to autonomous driving. We evaluate our approach on the large-scale nuScenes benchmark, and extensive experiments substantiate that our Agent-Driver outperforms state-of-the-art driving methods by a large margin. Our approach also demonstrates superior interpretability and few-shot learning ability compared with these methods.
a. Clone this repository.
git clone https://github.com/PointsCoder/Agent-Driver.git
b. Install the dependent libraries as follows:
pip install -r requirements.txt
a. We use pre-cached data from the nuScenes dataset. The data can be downloaded from Google Drive.
b. Put the downloaded data in the following layout:
Agent-Driver
├── data
│   ├── finetune
│   │   ├── data_samples_train.json
│   │   └── data_samples_val.json
│   ├── memory
│   │   └── database.pkl
│   ├── metrics
│   │   ├── gt_traj.pkl
│   │   ├── gt_traj_mask.pkl
│   │   ├── stp3_gt_seg.pkl
│   │   └── uniad_gt_seg.pkl
│   ├── train
│   │   ├── [token].pkl
│   │   └── ...
│   ├── val
│   │   ├── [token].pkl
│   │   └── ...
│   └── split.json
├── agentdriver
└── scripts
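Before running anything, it can help to confirm that the download landed in the right place. The following is a minimal sketch, not part of the repository: the helper name `check_data_layout` is hypothetical, and the file list is copied from the directory tree above. The `[token].pkl` files are only checked by looking for at least one `.pkl` file in `train` and `val`.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "finetune/data_samples_train.json",
    "finetune/data_samples_val.json",
    "memory/database.pkl",
    "metrics/gt_traj.pkl",
    "metrics/gt_traj_mask.pkl",
    "metrics/stp3_gt_seg.pkl",
    "metrics/uniad_gt_seg.pkl",
    "split.json",
]
# Folders that should hold one [token].pkl file per sample.
EXPECTED_DIRS = ["train", "val"]


def check_data_layout(data_root):
    """Return a list of problems; an empty list means the layout looks complete."""
    root = Path(data_root)
    problems = []
    for rel in EXPECTED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing file: {rel}")
    for rel in EXPECTED_DIRS:
        folder = root / rel
        if not folder.is_dir():
            problems.append(f"missing folder: {rel}")
        elif not any(folder.glob("*.pkl")):
            problems.append(f"no .pkl files in: {rel}")
    return problems


if __name__ == "__main__":
    # Assumes the script is run from the Agent-Driver root folder.
    for problem in check_data_layout("data"):
        print(problem)
```

If the script prints nothing, every file listed in the tree was found.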
a. Before we start, we need to fine-tune a GPT-based motion planner (as in the reasoning engine). To do so, you first need to register an OpenAI API account.
b. After registration, you can generate your API key and your organization key in your account settings. Here is an example:
openai.api_key = "sk-**"
openai.organization = "org-**"
c. Specify your own keys in agentdriver/llm_core/api_keys.py; they will be used when running Agent-Driver.
Please note that these are your own keys and are linked to your billing, so keep them confidential and do not share them with others!
d. For fine-tuning a motion planner, simply run
sh scripts/run_finetune.sh
This will automatically collect data and send fine-tuning jobs to OpenAI. More details can be found in agentdriver/execution/fine_tune.py.
Note: Fine-tuning costs money. Please refer to the pricing page. To save money, we use 10% of the full training data for fine-tuning by default, which should give decent results for less than 10 USD. You can get better results by using 100% of the data; in that case, set sample_ratio=1.0 in agentdriver/execution/fine_tune.py.
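To illustrate what a sampling ratio of this kind typically does, here is a minimal sketch of subsampling a list of training samples. It is an assumption about the general idea, not the code in agentdriver/execution/fine_tune.py: the function `subsample` and its `seed` parameter are hypothetical, and the real script may select samples differently.

```python
import random


def subsample(samples, sample_ratio=0.1, seed=0):
    """Return a reproducible random subset holding about sample_ratio of the samples."""
    if not 0.0 < sample_ratio <= 1.0:
        raise ValueError("sample_ratio must be in (0, 1]")
    samples = list(samples)
    if sample_ratio == 1.0:
        # Use the full training set, keeping the original order.
        return samples
    # Keep at least one sample, but never more than are available.
    count = min(len(samples), max(1, round(len(samples) * sample_ratio)))
    # A fixed seed makes the selected subset the same on every run.
    rng = random.Random(seed)
    return rng.sample(samples, count)


if __name__ == "__main__":
    subset = subsample(range(1000), sample_ratio=0.1)
    print(len(subset))
```

A smaller ratio sends fewer samples to the fine-tuning job, which is why it lowers the cost.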
e. When your fine-tuning job completes successfully, you will receive an email with your fine-tuned GPT model ID, like this:
ft:gpt-3.5-turbo-0613:**::**
This model ID identifies your own GPT-based motion planner. Set it as FINETUNE_PLANNER_NAME in agentdriver/llm_core/api_keys.py.
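Since the keys are tied to your billing, one option is to read them from environment variables instead of writing them into a file. The sketch below shows that idea together with a quick sanity check. It is an illustration, not the content of agentdriver/llm_core/api_keys.py: the functions `load_settings` and `find_problems` and the environment variable names are hypothetical, and the expected prefixes ("sk-", "org-", "ft:") are inferred only from the examples above, so they may not hold for every account.

```python
import os


def load_settings(env=None):
    """Collect the three settings from environment variables (names are hypothetical)."""
    env = os.environ if env is None else env
    return {
        "api_key": env.get("OPENAI_API_KEY", ""),
        "organization": env.get("OPENAI_ORGANIZATION", ""),
        "finetune_planner_name": env.get("FINETUNE_PLANNER_NAME", ""),
    }


def find_problems(settings):
    """Return a list of settings that do not match the example formats."""
    # Prefixes inferred from the examples in this README, not from an official spec.
    expected_prefixes = {
        "api_key": "sk-",
        "organization": "org-",
        "finetune_planner_name": "ft:",
    }
    problems = []
    for name, prefix in expected_prefixes.items():
        if not settings[name].startswith(prefix):
            problems.append(f'{name} should start with "{prefix}"')
    return problems


if __name__ == "__main__":
    for problem in find_problems(load_settings()):
        print(problem)
```

The loaded values could then be assigned to `openai.api_key`, `openai.organization`, and `FINETUNE_PLANNER_NAME` as shown in the steps above.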
a. Once all keys in agentdriver/llm_core/api_keys.py have been set up correctly, you can run inference with the whole Agent-Driver pipeline in agentdriver/unit_test/test_lanuage_agent.ipynb.
b. You can also find the usage of individual components (tool library, cognitive memory, reasoning engine) in the folder agentdriver/unit_test.
a. If you want to evaluate the planning performance on the nuScenes validation set, first collect the motion planning results by running
sh scripts/run_inference.sh
You will get a pred_trajs_dict.pkl in the experiments folder.
b. For evaluation, you can run
sh scripts/run_evaluation.sh uniad YOUR_PRED_DICT_PATH
replacing YOUR_PRED_DICT_PATH with the path to your pred_trajs_dict.pkl file.
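Before passing the file to the evaluation script, you may want to confirm that it loads and is not empty. The following is a minimal sketch under one assumption: that pred_trajs_dict.pkl holds a pickled Python dictionary, as its name suggests. The helper `summarize_predictions` is hypothetical, and the format of the keys and values is not described here, so the sketch does not inspect them. Only unpickle files you generated yourself, because loading a pickle can execute arbitrary code.

```python
import pickle


def summarize_predictions(path):
    """Load a pickled dictionary and report its size and one example key."""
    with open(path, "rb") as f:
        preds = pickle.load(f)
    # Assumption: the file stores a dict; fail loudly if it does not.
    if not isinstance(preds, dict):
        raise TypeError(f"expected a dict, got {type(preds).__name__}")
    example_key = next(iter(preds), None)
    return {"num_entries": len(preds), "example_key": example_key}


if __name__ == "__main__":
    import sys

    # Usage: python check_preds.py YOUR_PRED_DICT_PATH
    print(summarize_predictions(sys.argv[1]))
```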
If you find this project useful in your research, please consider citing:
@article{agentdriver,
title={A Language Agent for Autonomous Driving},
author={Mao, Jiageng and Ye, Junjie and Qian, Yuxi and Pavone, Marco and Wang, Yue},
year={2023}
}