# DQN_OpenAI_keras

This is the DQN implementation written by myself using OpenAI gym and keras.

This project has been officially abandoned. But! I have built another, more powerful project containing DQN and episodic control; please go there and have a look: Model-Free-Episodic-Control

## Description

training.py is the main script

agents.py stores the DQN agent

Parameters.py stores training parameters

image_preprocessing contains image preprocessing functions

Anti_flickering.py shows that merging only two frames (odd and even) is not enough to eliminate flickering.

action_difference.py is a script showing the difference between the action sets of OpenAI gym and the original DQN paper; it is irrelevant to the main functions.

indexing_test.py is a script that illustrates the indexing problem in the network.

## Exploration and Discoveries

### Building the DQN network

1. fixed Q targets

I build two models, one for Q and one for Q hat, and set Q hat to be untrainable. In addition, I apply a disconnected_grad to Q hat as https://github.com/sherjilozair/dqn does, although I think that is unnecessary. A minimal sketch of this setup is shown below.
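For illustration, here is a hedged Keras sketch of the fixed-Q-targets setup, not the actual code in agents.py; the layer sizes, optimizer, and `state_dim`/`num_actions` values are placeholder assumptions:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

state_dim, num_actions = 4, 6  # illustrative dimensions only

def build_q_network():
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(state_dim,)))
    model.add(Dense(num_actions, activation='linear'))
    model.compile(optimizer='adam', loss='mse')
    return model

q = build_q_network()        # trainable online network Q
q_hat = build_q_network()    # frozen target network Q hat
q_hat.trainable = False      # Q hat is never trained directly

def sync_target():
    # Copy Q's weights into Q hat every C training steps.
    q_hat.set_weights(q.get_weights())

sync_target()
# Inside the training loop, regression targets are computed with
# q_hat.predict(...), so they stay fixed between synchronizations.
```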

2. indexing

This problem is interesting. Given Q_S, a matrix of shape batch_size * num_actions, and A, a matrix of shape batch_size * 1, we want Q_S[i, A[i]] for each row i.

In numpy we can do:

```python
batch_size = Q_S.shape[0]
Q_S[range(batch_size), A.reshape(batch_size)]
```

But in Theano, range(Q_S.shape[0]) raises an error at compile time, because Q_S.shape[0] is a symbolic expression rather than a Python integer.

Two possible solutions:

  1. use theano.scan
  2. use a mask for indexing, as I did (see the sketch after this section)

Details are in indexing_test.py
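For illustration, here is a hedged sketch of the mask idea, not a copy of indexing_test.py; the number of actions and the example values are placeholders:

```python
import numpy as np
import theano
import theano.tensor as T

Q_S = T.matrix('Q_S')   # batch_size x num_actions, batch size is symbolic
A = T.ivector('A')      # batch_size action indices
num_actions = 6         # a known constant, unlike the symbolic batch size

# Build a one-hot mask of shape batch_size x num_actions and sum along rows;
# this avoids calling Python's range() on the symbolic batch size.
mask = T.eq(T.arange(num_actions).dimshuffle('x', 0), A.dimshuffle(0, 'x'))
q_selected = T.sum(Q_S * mask, axis=1)

f = theano.function([Q_S, A], q_selected)
q_values = np.arange(12, dtype=theano.config.floatX).reshape(2, 6)
print(f(q_values, np.array([1, 4], dtype='int32')))   # -> [ 1. 10.]
```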

### Anti_flickering

As Anti_flickering.py shows, I do not think combining only two frames is enough to solve the flickering problem.
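For reference, the usual two-frame trick from the Nature DQN paper takes the elementwise max over the current and previous raw frames before downsampling; a minimal sketch follows (frame shape and dtype are illustrative), and it is exactly this kind of two-frame merge that Anti_flickering.py argues is not always sufficient:

```python
import numpy as np

def merge_two_frames(prev_frame, curr_frame):
    """Elementwise max of two consecutive Atari frames (H x W x 3, uint8)."""
    return np.maximum(prev_frame, curr_frame)

prev_frame = np.random.randint(0, 256, size=(210, 160, 3), dtype=np.uint8)
curr_frame = np.random.randint(0, 256, size=(210, 160, 3), dtype=np.uint8)
merged = merge_two_frames(prev_frame, curr_frame)
```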

### Frame Skipping

After chatting with jietang and Greg Brockman, I found out that OpenAI gym has already implemented frame skipping in the _step() function: https://github.com/openai/gym/blob/master/gym/envs/atari/atari_env.py
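For clarity, this is roughly what agent-side frame skipping would look like if the environment did not already do it; it assumes the classic 4-tuple step() API of that gym era and is unnecessary when gym skips frames internally:

```python
def skipped_step(env, action, skip=4):
    # Repeat the chosen action for `skip` frames and accumulate the reward.
    total_reward, done, info, obs = 0.0, False, {}, None
    for _ in range(skip):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done, info
```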

Another finding is the action-set difference described in action_difference.py.

## Notes

Sometimes we need to know the current number of lives in Atari games, so I sent a pull request to OpenAI gym: openai/gym#163
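As a rough sketch (the exact interface depends on the gym version), the remaining lives can typically be read either from the info dict returned by step() or from the underlying ALE object on the Atari environment:

```python
import gym

env = gym.make('Breakout-v0')
env.reset()
obs, reward, done, info = env.step(env.action_space.sample())

if 'ale.lives' in info:
    # Some gym versions report lives in the info dict.
    lives = info['ale.lives']
else:
    # Others expose the ALE object directly on the unwrapped Atari env.
    lives = env.unwrapped.ale.lives()
print(lives)
```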

## Incomplete Functions

experience replay

Prioritized Experience Replay

double DQN

dueling DQN

## References

Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.

## Thanks

Some repositories gave me a lot of inspiration; they are listed at https://github.com/stars/ShibiHe/
