This is a JAX implementation of Decision Diffuser. The code is built upon another diffusion-based offline RL algorithm, EDP, which is also included in this repo.
Create a Python environment with conda:
conda env create -f environment.yml
conda activate diffuser
pip install -e .
Apart from this, you'll have to set up your MuJoCo environment and key as well.
Run Decision Diffuser on D4RL hopper:
python train.py --config configs/diffuser_inv_hopper/diffuser_inv_hopper_mdexpert.py
Run EDP on D4RL hopper:
python train.py --config configs/dql_hopper/dql_hopper_mdexpert.py
This codebase can also log to the W&B online visualization platform. To do so, first set your W&B API key as an environment variable.
Alternatively, you can simply run:
wandb login
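For the environment-variable route, a minimal sketch could look like the following. It assumes the standard `WANDB_API_KEY` variable read by the wandb client; the key value shown is a placeholder, not a real key, and this README does not confirm whether the training script reads any other W&B settings.

```shell
# Assumption: WANDB_API_KEY is the variable the wandb client reads.
# Replace the placeholder below with your own key from your W&B account settings.
export WANDB_API_KEY="<your-api-key>"

# Sanity check: confirm the variable is non-empty before launching training.
if [ -n "${WANDB_API_KEY}" ]; then
    echo "WANDB_API_KEY is set"
fi
```

Exporting the variable in the same shell session that launches train.py makes it visible to the training process; adding the export line to your shell profile would make it persist across sessions.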
This code repo is mainly built upon EDP. We also refer to the official PyTorch implementation of Decision Diffuser. The vectorized RL environment is borrowed from Tianshou.