DQN Adventure: from Zero to State of the Art

This is an easy-to-follow, step-by-step Deep Q-Learning tutorial with clean, readable code.

The deep reinforcement learning community has made several independent improvements to the DQN algorithm. This tutorial presents the latest extensions to DQN in the following order:

Playing Atari with Deep Reinforcement Learning [arxiv] [code]
Deep Reinforcement Learning with Double Q-learning [arxiv] [code]
Dueling Network Architectures for Deep Reinforcement Learning [arxiv] [code]
Prioritized Experience Replay [arxiv] [code]
Noisy Networks for Exploration [arxiv] [code]
A Distributional Perspective on Reinforcement Learning [arxiv] [code]
Rainbow: Combining Improvements in Deep Reinforcement Learning [arxiv] [code]
Distributional Reinforcement Learning with Quantile Regression [arxiv] [code]
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [arxiv] [code]
Neural Episodic Control [arxiv] [code]
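
To make the step from the first paper to the second concrete, here is a minimal sketch of how the learning target differs between DQN and Double DQN. It is plain Python on lists rather than the notebooks' code, and the function names and the Q-values are hypothetical, chosen only for illustration.

```python
def dqn_target(reward, done, gamma, next_q_target):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, done, gamma, next_q_online, next_q_target):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-value estimates for the next state (two actions).
next_q_online = [1.0, 2.0]
next_q_target = [3.0, 0.5]

y_dqn = dqn_target(1.0, False, 0.9, next_q_target)
y_double = double_dqn_target(1.0, False, 0.9, next_q_online, next_q_target)
```

In this made-up example the two networks disagree about the best action, so the Double DQN target is lower than the DQN target. That gap is the overestimation that Double Q-learning is meant to reduce.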

Environments

First, I recommend using small test problems so that experiments run quickly. Then you can move on to environments with large observation spaces.

CartPole - a classic RL environment that can be solved on a single CPU
Atari Pong - the easiest Atari environment; it takes only ~1 million frames to converge, compared with other Atari games that take more than 40 million
Other Atari games - change the hyperparameters: target network update frequency = 10K, replay buffer size = 1M
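
As a rough sketch of where those two hyperparameters enter the training loop, the snippet below defines a fixed-capacity replay buffer and a check for when to sync the target network. The names `AtariConfig`, `ReplayBuffer`, and `should_update_target` are hypothetical, and counting the update frequency in training steps is an assumption, since the text above does not state the unit.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class AtariConfig:
    # Values quoted in the text above; the field names are hypothetical.
    target_update_frequency: int = 10_000
    replay_buffer_size: int = 1_000_000


class ReplayBuffer:
    """Fixed-capacity buffer; the oldest transition is dropped once it is full."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, then regroup by field.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)


def should_update_target(step, config):
    """True on steps where the target network would be synced (assumed unit: steps)."""
    return step > 0 and step % config.target_update_frequency == 0
```

A training loop would call `push` after every environment step, `sample` once the buffer holds at least one batch, and copy the online weights into the target network whenever `should_update_target` returns true.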

If you get stuck…

Remember that you are not stuck unless you have spent more than a week on a single algorithm. It is perfectly normal not to have all the required background in mathematics and CS. For example, you will need the fundamentals of measure theory and statistics, especially the Wasserstein metric and quantile regression; statistical inference, in particular importance sampling; and data structures such as the segment tree and the k-dimensional tree.
Go through the paper carefully. Try to see what problem the authors are solving. Get the high-level idea of the approach first, then read the code (skipping the proofs), and only afterwards go over the mathematical details and proofs.
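
If the segment tree and importance sampling are the unfamiliar parts, the sketch below shows how they can fit together in proportional prioritized sampling. It is an illustrative standalone version, not the repository's implementation: the class and function names are hypothetical, the capacity is assumed to be a power of two, and normalizing the weights by the largest weight in the batch is one common choice among several.

```python
import random


class SumTree:
    """Array-backed segment tree whose internal nodes store the sum of their children."""

    def __init__(self, capacity):
        # Assumption: capacity is a power of two, which keeps the indexing simple.
        assert capacity > 0 and capacity & (capacity - 1) == 0
        self.capacity = capacity
        # Root is at index 1; leaves occupy indices capacity .. 2 * capacity - 1.
        self.tree = [0.0] * (2 * capacity)

    def update(self, index, priority):
        """Set the priority of one leaf and refresh the sums on the path to the root."""
        pos = index + self.capacity
        self.tree[pos] = priority
        pos //= 2
        while pos >= 1:
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
            pos //= 2

    def total(self):
        return self.tree[1]

    def find_prefix_sum_index(self, value):
        """Return the leaf whose priority interval contains value (0 <= value < total)."""
        pos = 1
        while pos < self.capacity:
            left = 2 * pos
            if value < self.tree[left]:
                pos = left
            else:
                value -= self.tree[left]
                pos = left + 1
        return pos - self.capacity


def sample_with_weights(tree, num_items, batch_size, beta):
    """Sample indices in proportion to priority and return importance-sampling weights."""
    indices, weights = [], []
    for _ in range(batch_size):
        value = random.random() * tree.total()
        i = tree.find_prefix_sum_index(value)
        prob = tree.tree[i + tree.capacity] / tree.total()
        # The weight (N * P(i)) ** (-beta) corrects for the non-uniform sampling.
        weights.append((num_items * prob) ** (-beta))
        indices.append(i)
    max_weight = max(weights)
    return indices, [w / max_weight for w in weights]
```

Both `update` and `find_prefix_sum_index` touch one node per tree level, so each costs O(log n). That is why the tree is preferred over rescanning all priorities for every sample.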

Best RL courses

David Silver's course link
Berkeley deep RL link
Practical RL link


limaries30/RL-Adventure
