Implementations of different off-policy reinforcement learning algorithms.
- Module methods.py contains TensorFlow implementations of the neural network architectures used in value-based deep reinforcement learning.
- Module agents.py contains a general Agent class and wrappers around it, each representing a corresponding deep RL algorithm.
- Module utils.py contains a Replay Buffer implementation and a wrapper around the OpenAI gym Atari 2600 environment, which is necessary for reproducing the original DeepMind results.
- Jupyter notebook train_agents.ipynb contains examples of how to use the framework to train deep RL agents on various environments.
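
To illustrate the role of the Replay Buffer mentioned for utils.py, here is a minimal sketch of a uniform experience replay buffer. It is not the code from utils.py: the class name, method names, and the `seed` parameter are assumptions chosen for illustration, and it relies only on the Python standard library.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size buffer of (s, a, r, s', done) transitions."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Store a single transition."""
        self._storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a uniform random batch without replacement.

        Returns five tuples: states, actions, rewards, next_states, dones.
        """
        batch = self._rng.sample(list(self._storage), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self._storage)
```

An off-policy agent would typically call `add` after every environment step and call `sample` once the buffer holds at least one batch of transitions.
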

Algorithms and their reference papers:

- Deep Q-Network: Volodymyr Mnih et al. "Human-level control through deep reinforcement learning." Nature (2015).
- Dueling Deep Q-Network: Ziyu Wang et al. "Dueling network architectures for deep reinforcement learning." ICML (2016).
- Categorical Deep Q-Network: Marc G. Bellemare, Will Dabney, and Rémi Munos. "A distributional perspective on reinforcement learning." ICML (2017).
- Quantile Regression Deep Q-Network: Will Dabney, Mark Rowland, Marc G. Bellemare, and Rémi Munos. "Distributional Reinforcement Learning with Quantile Regression." AAAI (2018).
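
As a rough illustration of two ideas from the papers listed above, the sketch below computes the one-step Q-learning target used by Deep Q-Network and the mean-subtracted aggregation of value and advantage streams used by the Dueling architecture. The function names and the default discount factor are assumptions made for this example. The code uses plain Python lists rather than TensorFlow and does not reproduce methods.py or agents.py.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q_target(s', a').

    next_q_values holds the target network's estimates for every action
    in the next state. No bootstrapping is done on terminal transitions.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def dueling_q_values(value, advantages):
    """Combine the two dueling streams into Q-values.

    Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')
    Subtracting the mean advantage keeps V and A identifiable.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [value + adv - mean_advantage for adv in advantages]


# Example call with made-up numbers (a state with three actions).
q_values = dueling_q_values(1.0, [1.0, 2.0, 3.0])
target = td_target(reward=1.0, next_q_values=q_values, done=False, gamma=0.5)
```

The categorical and quantile regression variants replace the scalar target with a target over a return distribution, so they need a projection step or a quantile loss that this sketch does not cover.
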
Note. Images of different neural network architectures are based on the images from the Dueling architectures paper. The original images were copied and adapted to reflect features of particular architectures and learning algorithms.