This project contains the code for an extended version of the Deep Q-Learning algorithm, which I wrote during my Deep Reinforcement Learning Nanodegree @ Udacity. The code is inspired by the vanilla DQN implementation provided by Udacity.
Deep Q-Learning for Multilayer Perceptron
+ Fixed Q-Targets
+ Experience Replay
+ Gradient Clipping
+ Double Deep Q-Learning
+ Dueling Networks
For more information on the implemented features, refer to Extended_Deep_Q_Learning_for_Multilayer_Perceptron.ipynb. The notebook summarizes all essential concepts used in the code and contains three examples in which the algorithm solves OpenAI Gym environments. The sketch below illustrates how the dueling head and the Double DQN update fit together.
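The following is a minimal sketch, not the repository's exact code, of the listed extensions: a dueling MLP head, a Double DQN target, gradient clipping, and a soft update of the fixed Q-target network. Layer names, sizes, and the choice of norm-based clipping are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DuelingMLP(nn.Module):
    """Dueling architecture: shared fc1, then separate advantage and value streams."""
    def __init__(self, state_size, action_size, fc1_nodes=256, fc2_adv=256, fc2_val=128):
        super().__init__()
        self.fc1 = nn.Linear(state_size, fc1_nodes)
        self.fc2_adv = nn.Linear(fc1_nodes, fc2_adv)   # advantage stream
        self.fc2_val = nn.Linear(fc1_nodes, fc2_val)   # state-value stream
        self.adv = nn.Linear(fc2_adv, action_size)
        self.val = nn.Linear(fc2_val, 1)

    def forward(self, state):
        x = F.relu(self.fc1(state))
        adv = self.adv(F.relu(self.fc2_adv(x)))
        val = self.val(F.relu(self.fc2_val(x)))
        # Combine streams; subtracting the mean advantage keeps Q-values identifiable.
        return val + adv - adv.mean(dim=1, keepdim=True)

def double_dqn_step(local_net, target_net, optimizer, batch, gamma=0.99, tau=1e-3):
    # batch tensors assumed shapes: states [B, S], actions [B, 1] (long),
    # rewards [B, 1], next_states [B, S], dones [B, 1] (float 0/1).
    states, actions, rewards, next_states, dones = batch

    # Double DQN: the local net selects the greedy action, the target net evaluates it.
    best_actions = local_net(next_states).argmax(dim=1, keepdim=True)
    q_next = target_net(next_states).gather(1, best_actions).detach()
    q_targets = rewards + gamma * q_next * (1 - dones)
    q_expected = local_net(states).gather(1, actions)

    loss = F.mse_loss(q_expected, q_targets)
    optimizer.zero_grad()
    loss.backward()
    nn.utils.clip_grad_norm_(local_net.parameters(), 1.0)  # gradient clipping (norm-based here)
    optimizer.step()

    # Soft update of the fixed Q-target network.
    for t_param, l_param in zip(target_net.parameters(), local_net.parameters()):
        t_param.data.copy_(tau * l_param.data + (1.0 - tau) * t_param.data)
```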
- Create (and activate) a new environment with Python 3.6.
conda create --name env_name python=3.6
source activate env_name
- Install OpenAI Gym
git clone https://github.com/openai/gym.git
cd gym
pip install -e .
pip install -e '.[box2d]'
pip install -e '.[classic_control]'
sudo apt-get install ffmpeg
- Install source code dependencies
conda install -c rpi matplotlib
conda install -c pytorch pytorch
conda install -c anaconda numpy
You can run the project via Extended_Deep_Q_Learning_for_Multilayer_Perceptron.ipynb or by running main.py from the console.
Open a console and run: python main.py -c "your_config_file".json
Optional arguments:
-h, --help
- show the help message
-c, --config
- config file name; the file must be available as .json in ./configs
Example: python main.py -c "Lunar_Lander_v2".json
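A minimal sketch of how the -c/--config option could be parsed and the JSON config loaded from ./configs; the actual argument handling in main.py may differ.

```python
import argparse
import json
import os

parser = argparse.ArgumentParser(description="Extended Deep Q-Learning")
parser.add_argument("-c", "--config", required=True,
                    help="Config file name - file must be available as .json in ./configs")
args = parser.parse_args()

# Load the JSON config, e.g. "Lunar_Lander_v2.json" from ./configs.
with open(os.path.join("configs", args.config)) as f:
    config = json.load(f)

print(config["general"]["env_name"])  # e.g. "LunarLander-v2"
```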
"general" :
"env_name" : "LunarLander-v2", # The gym environment name you want to run
"monitor_dir" : ["monitor"], # monitor file direction
"checkpoint_path": ["checkpoints"], # checkpoint file direction
"seed": 0, # random seed for numpy, gym and pytorch
"state_size" : 8, # number of states
"action_size" : 4, # number of actions
"average_score_for_solving" : 200.0 # border value for solving the task
"train" :
"nb_episodes": 2000, # max number of episodes
"episode_length": 1000, # max length of one episode
"batch_size" : 256, # memory batch size
"epsilon_high": 1.0, # epsilon start point
"epsilon_low": 0.01, # min epsilon value
"epsilon_decay": 0.995, # epsilon decay
"run_training" : true # do you want to train? Otherwise run a test session
"agent" :
"learning_rate": 0.0005, # model learning rate
"gamma" : 0.99, # reward weight
"tau" : 0.001, # soft update factor
"update_rate" : 4 # interval in which a learning step is done
"buffer" :
"size" : 100000 # experience replay buffer size
"model" :
"fc1_nodes" : 256, # number of fc1 output nodes
"fc2_adv" : 256, # number of fc2_adv output nodes
"fc2_val" : 128 # number of fc2_val output nodes