Releases: prabhatnagarajan/repro_dqn
V1 - Deterministic DQN
This release contains a high-quality deterministic implementation of Deep Q-learning. This implementation was used in the following works:
- The Impact of Nondeterminism on Reproducibility in Deep Reinforcement Learning, a workshop paper at ICML 2018
- Deterministic Implementations for Reproducibility in Deep Reinforcement Learning, an arXiv preprint, and
- Nondeterminism as a Reproducibility Challenge for Deep Reinforcement Learning, a master's thesis from UT.
The Impact of Nondeterminism on Reproducibility in Deep Reinforcement Learning and Nondeterminism as a Reproducibility Challenge for Deep Reinforcement Learning
For these works, we used the following hyperparameters and experimental conditions.
Hyperparameters
The default hyperparameters for the deterministic implementation are specified in constants.py. The exploration and initialization seeds used were 125, 127, 129, 131, and 133. The remaining hyperparameters, which are explained in the DQN paper, are listed below (an illustrative configuration sketch follows the list):
- Network architecture - the architecture from the NIPS DQN paper
- Timesteps - 10 million
- Minibatch size - 32
- Agent history length - 4
- Replay Memory Size - 1 million
- Target network update frequency - 10000
- Discount Factor - 0.99
- Action Repeat - 4
- Update Frequency - 2
- Learning rate - 0.0001
- Initial Exploration - 1.0
- Final Exploration - 0.1
- Final Exploration Frame - 1 million
- Replay Start Size - 50000
- No-op max - 30
- Death ends episode (during training, end episode with loss of life) - False
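For orientation, here is a minimal sketch of how these values might be laid out in a constants.py-style module. The names below are illustrative assumptions rather than the identifiers actually used in the repository; consult constants.py in the release for the authoritative values.

```python
# Illustrative hyperparameter module in the spirit of constants.py.
# All names are hypothetical; the repository's constants.py is authoritative.

# Seeds controlling exploration and network initialization (one run per seed)
SEEDS = [125, 127, 129, 131, 133]

# Training schedule
NUM_TIMESTEPS = 10000000          # total environment steps
MINIBATCH_SIZE = 32
AGENT_HISTORY_LENGTH = 4          # frames stacked to form a state
REPLAY_MEMORY_SIZE = 1000000
TARGET_NETWORK_UPDATE_FREQ = 10000
DISCOUNT_FACTOR = 0.99
ACTION_REPEAT = 4                 # each selected action is repeated for this many frames
UPDATE_FREQ = 2                   # one gradient update every N agent steps
LEARNING_RATE = 0.0001

# Epsilon-greedy exploration, annealed linearly
INITIAL_EXPLORATION = 1.0
FINAL_EXPLORATION = 0.1
FINAL_EXPLORATION_FRAME = 1000000
REPLAY_START_SIZE = 50000         # uniform-random actions collected before learning starts

# Episode handling
NO_OP_MAX = 30                    # random no-ops at episode start
DEATH_ENDS_EPISODE = False        # do not end training episodes on loss of life


def epsilon(t):
    """Linearly annealed exploration rate at timestep t (illustrative)."""
    if t >= FINAL_EXPLORATION_FRAME:
        return FINAL_EXPLORATION
    frac = float(t) / FINAL_EXPLORATION_FRAME
    return INITIAL_EXPLORATION + frac * (FINAL_EXPLORATION - INITIAL_EXPLORATION)
```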
Experimental Conditions
All of our experiments were performed under the same hardware and software conditions on AWS.
- AMI - Deep Learning Base AMI (Ubuntu) Version 4.0
- Instance Type - GPU Compute, p2.xlarge
- Software - Installed using the install script. All other software comes with the AMI.
Unfortunately, AWS continuously updates its software, and there is no guarantee that the AMI will always be available. As such, it may not be possible to replicate our results exactly. However, the deterministic implementation should function correctly under most AMIs available on AWS.
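As a rough guide to what controlling nondeterminism involves, the sketch below shows the kind of seeding and backend settings typically required for a deterministic DQN run. It assumes a PyTorch-based agent and uses an invented helper name (seed_everything); the repository's actual mechanisms, and their interaction with GPU-level nondeterminism, may differ.

```python
import random
import numpy as np
import torch

def seed_everything(seed):
    """Illustrative seeding routine; the repository's own setup may differ.

    Seeds the standard sources of randomness so that exploration, replay
    sampling, and network initialization are repeatable across runs.
    """
    random.seed(seed)                 # Python-level RNG (e.g., epsilon-greedy draws)
    np.random.seed(seed)              # NumPy RNG (e.g., minibatch sampling)
    torch.manual_seed(seed)           # CPU weight initialization
    torch.cuda.manual_seed_all(seed)  # GPU RNGs
    # Force cuDNN to use deterministic algorithms and disable autotuning,
    # which otherwise selects convolution kernels nondeterministically.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

# One training run per seed used in the papers: 125, 127, 129, 131, 133
seed_everything(125)
```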