Deep Reinforcement Learning

Here I am implementing various RL algorithms in Python 2.7, using Keras for the neural networks and OpenAI Gym to test them. I list the methods below, which roughly divide into two categories.

I took or adapted code from various online sources, which I list non-exhaustively below (and in the code itself).

Value-based methods

Policy-based methods

Policy gradient: REINFORCE, with and without a baseline
Actor critic (A2C)
Deep Deterministic Policy Gradient (DDPG)
Proximal policy optimization (PPO)
Soft Actor-Critic (soft AC)
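
As a pointer to what the first item in this list involves, here is a minimal sketch of the return computation behind REINFORCE with a baseline. It is not taken from this repository: the function names `discounted_returns` and `advantages`, the example discount factor, and the choice of the mean return as the baseline are all assumptions made for illustration (a learned value function is the more common baseline).

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns):
    """Subtract a constant baseline (here, the mean return; an assumption)."""
    baseline = sum(returns) / float(len(returns))
    return [g - baseline for g in returns]


# Hypothetical three-step episode with a reward of 1 at every step.
rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)
adv = advantages(returns)
```

In the policy-gradient update, the log-probability of each chosen action is weighted by its advantage instead of the raw return, which typically reduces the variance of the gradient estimate.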

Multi-agent

Multi-agent deep deterministic policy gradient (MADDPG)
Actor-Attention-Critic (AAC)
Value Decomposition Networks (VDN)
QMIX

Others

Go-Explore
Curiosity driven learning (CDL)
Rainbow (RB)

Resources

Papers

Blogs

Textbooks

Sutton and Barto, Reinforcement Learning: An Introduction

Khev/RL-practice-keras
