
Digital Brain 1 (DB1): A large-scale multi-modal multi-task pre-trained decision model

This repo implements DB1, a multi-modal Transformer pretrained on multiple tasks, including natural language modeling, image captioning, and single-agent decision-making tasks (such as pixel-input video games, continuous control, and TSP problems).

DB1 is also a reproduction of GATO and achieves comparable performance on all of the tasks mentioned above. Specifically, DB1 reaches $\ge$ 50% of expert performance on 76% of the 870 simulated decision-making tasks.
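As a concrete illustration of this success criterion (the function and baseline names below are our own; following GATO, scores are reported as a fraction of per-task expert performance), the check can be sketched as:

```python
def expert_fraction(score, random_score, expert_score):
    """Fraction of expert performance achieved, with a random policy at 0.
    A task counts toward the 76% figure when this reaches at least 0.5."""
    return (score - random_score) / (expert_score - random_score)

# illustrative numbers only: an agent scoring 80 on a task where a random
# policy scores 10 and the expert scores 150 reaches 50% of expert performance
assert expert_fraction(80, 10, 150) == 0.5
```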

Pretraining scripts, model checkpoints, and training data will be released soon.

Environment setup

This assumes you have already installed CUDA and the NVIDIA drivers.

Download the following files from this site:

  • DB1's model checkpoint db1_870task_checkpoint/
  • Python libraries to install external_libs.tar.gz
  • Minimal data for evaluation minimal_expert_data.tar.gz

conda create -n db1 python=3.9 -y
conda activate db1

# use the version compatible with your environments.
conda install pytorch=1.12.1 cudatoolkit=11.3 -c pytorch -y 
pip install -r requirements.txt

sudo apt update && sudo apt-get install libglfw3-dev libgl1-mesa-dev libglu1-mesa-dev libglew-dev patchelf gcc -y

pip install 'gym[atari]' autorom
AutoROM --accept-license  # downloads the Atari ROMs

# mujoco
wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz
tar xzvf mujoco210-linux-x86_64.tar.gz
mkdir -p ~/.mujoco && mv mujoco210 ~/.mujoco/

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia
# single quotes keep $LD_LIBRARY_PATH unexpanded until ~/.bashrc is sourced
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia' >> ~/.bashrc
pip3 install -U 'mujoco-py<2.2,>=2.1'

# D4RL
tar xzvf external_libs.tar.gz

pip install -e d4rl
pip install -e metaworld

git clone https://github.com/digital-brain-sh/mycocoevalcap.git
pip install -e mycocoevalcap

# Dataset index building functions
cd src/data
make
cd -

DMLab installation

sudo apt-get install pkg-config g++ zlib1g-dev unzip libsdl2-2.0 libffi-dev gettext freeglut3-dev libsdl2-dev python3 zip libosmesa6-dev python-dev python-numpy python-pil python-enum34 python3-dev python3-numpy python3-pil

Install bazel.

Clone our DMLAB repo and build it.

git clone https://github.com/digital-brain-sh/lab
cd lab
bazel build --cxxopt="--std=c++14" -c opt --python_version=PY3 //python/pip_package:build_pip_package --verbose_failures
# suppose you have already downloaded the dmlab package we provided
# or you can build the package with `./bazel-bin/python/pip_package/build_pip_package ~/`
pip install deepmind_lab-1.0-py3-none-any.whl

Follow the instructions here to download the additional brady_konkle_oliva2008 data for DeepMind Lab. Our code looks for this data at /raid/brady_konkle_oliva2008 by default. To use a different path, move the downloaded files to a directory of your choice and set $DMLAB_DATASET_PATH to [your dir]/brady_konkle_oliva2008.
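For example, the variable can also be set from Python before launching (the /data/dmlab location below is purely illustrative; substitute your own directory):

```python
import os

# hypothetical directory; replace with wherever you moved the dataset
data_root = "/data/dmlab"
os.environ["DMLAB_DATASET_PATH"] = os.path.join(data_root, "brady_konkle_oliva2008")
```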

Data preparation

Model checkpoints

Download DB1's model checkpoint db1_870task_checkpoint from this site.

mkdir model_checkpoints
mv db1_870task_checkpoint model_checkpoints

Minimal expert demonstration dataset for RL evaluation.

Currently, we only provide a minimal RL dataset of expert demonstrations, which serves as prompts during evaluation.

tar xzvf minimal_expert_data.tar.gz
# you will get a folder named `rl_minimal_exp_data`

Runnable scripts

Evaluation

Simulated decision making tasks

Fill in $RL_CACHE_DIR in the following script and pass your checkpoint directory via --load-dir [your model_checkpoint]. To load other checkpoint directories, change $TAG_NAME; for details, see DeepSpeed Model Checkpoint.

cd [DB1's directory]
export PYTHONPATH=.  # you may also use the absolute path of DB1's directory
sh scripts/evaluate_rl_1.3B.sh [choose a port for deepspeed, e.g. 29002]

The performance on all environments will then be recorded in the log at rl_eval_results/db1_870task_checkpoint/results.output.

Coming soon

  • Minimal data for running TSP problems
  • Text generation and image captioning scripts
  • Finetuning results
  • Pretrained models with modern tricks such as DeepNorm
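For reference, DeepNorm stabilizes very deep Transformers by scaling the residual branch before LayerNorm, i.e. $x_{out} = \mathrm{LayerNorm}(\alpha \cdot x + \mathrm{sublayer}(x))$, together with down-scaled initialization. A minimal sketch of the scale computation for the encoder-only setting (function name is ours):

```python
def deepnorm_scales(num_layers):
    """Residual scale (alpha) and init scale (beta) for an encoder-only
    DeepNorm Transformer with `num_layers` layers:
        x_out = LayerNorm(alpha * x + sublayer(x))
    with sublayer weights initialized with gain beta."""
    alpha = (2 * num_layers) ** 0.25
    beta = (8 * num_layers) ** -0.25
    return alpha, beta

# with 8 layers, alpha = 16 ** 0.25 = 2.0
alpha, beta = deepnorm_scales(8)
```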

Implementation details

Framework Overview

We adapt our training procedure and preprocessing logic for NLP and vision tasks from Megatron-LM. To improve data-loading efficiency, we add data caching and lazy loading for large-scale datasets. For the implementation of the Transformer, its relative positional encoding, and memory caching, we follow Transformer-XL. In addition, extra techniques and tricks were applied to stabilize the training procedure.
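The memory-caching idea borrowed from Transformer-XL can be sketched as follows (a NumPy simplification with illustrative names, not DB1's actual code; in the real model the cached states are also detached from the gradient graph):

```python
import numpy as np

def attend_context(hidden, memory, mem_len):
    """Transformer-XL-style segment recurrence (sketch): hidden states
    cached from the previous segment are prepended to the current
    segment's states, extending the effective attention context
    beyond the current segment length."""
    # hidden: (cur_len, d); memory: (prev_mem_len, d) or None
    context = hidden if memory is None else np.concatenate([memory, hidden], axis=0)
    # cache the last `mem_len` states of the context for the next segment
    new_memory = context[-mem_len:]
    return context, new_memory

# illustrative shapes: 6 cached states + 4 new states -> context of 10
context, new_memory = attend_context(np.zeros((4, 8)), np.zeros((6, 8)), mem_len=5)
```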

Training Overview

We use DeepSpeed to speed up training and scale our models. However, since DB1 is designed for tasks across multiple modalities, we found it difficult to apply modern techniques such as tensor/pipeline model parallelism, so we use only distributed data parallelism during pretraining.
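Distributed data parallelism keeps a full model replica on every GPU and synchronizes only gradients; conceptually (a NumPy sketch of the idea, not the DeepSpeed API):

```python
import numpy as np

def ddp_step(params, grads_per_worker, lr):
    """Distributed data parallel (sketch): every worker holds the full
    parameters and computes gradients on its own data shard; the update
    applies the all-reduced (mean) gradient, so all replicas stay identical."""
    mean_grad = np.mean(np.stack(grads_per_worker), axis=0)
    return params - lr * mean_grad

# two workers with gradients 1.0 and 3.0 yield a mean gradient of 2.0
new_params = ddp_step(np.array([10.0]), [np.array([1.0]), np.array([3.0])], lr=0.1)
```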

Currently we provide the checkpoint of our pretrained 1.2B model. We have tested it on DGX A100 for pretraining and evaluation, and on RTX 3090 for evaluation.

Contact

If you have any questions about this repo, feel free to leave an issue.

Join Us

Interested in our project? Or passionate about:

  1. Multi-Agent Learning and Game AI
  2. Operation Research and Optimization
  3. Robotics and Control
  4. Visual and Graphic Intelligence
  5. Data Mining and so on

Welcome! Why not take a look at https://digitalbrain.cn/talents?

With leading scientists, engineers, and field experts, we aim to provide Better Decisions for a Better World!

Digital Brain Laboratory

Digital Brain Laboratory, Shanghai, is co-founded by Mr. Ruigang Li, founding partner and chairman of CMC Capital, and Prof. Jun Wang, a world-renowned scientist in the field of decision intelligence.

Recruitment

Recruitment for Students & Internships