
Extended Deep Q-Learning for Multilayer Perceptron

This project contains the code for an extended version of the Deep Q-Learning algorithm that I wrote during my Deep Reinforcement Learning Nanodegree at Udacity. The code is inspired by the vanilla DQN implementation provided by Udacity.

Deep Q-Learning for Multilayer Perceptron
+ Fixed Q-Targets
+ Experience Replay
+ Gradient Clipping
+ Double Deep Q-Learning
+ Dueling Networks

For more information on the implemented features, refer to Extended_Deep_Q_Learning_for_Multilayer_Perceptron.ipynb. The notebook includes a summary of all essential concepts used in the code. It also contains three examples in which the algorithm is used to solve OpenAI Gym environments.
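To illustrate how three of the listed extensions fit together, here is a minimal sketch of one learning step that combines a Double-DQN target, gradient clipping, and a soft update of the fixed target network. The variable names (qnetwork_local, qnetwork_target, tau, etc.) are illustrative and do not necessarily match the identifiers used in this repository; see the notebook and source files for the actual implementation.

import torch
import torch.nn.functional as F

def learning_step(qnetwork_local, qnetwork_target, optimizer, batch, gamma, tau):
    # batch holds tensors of shape (batch_size, ...):
    # states, actions (long, (B,1)), rewards (B,1), next_states, dones (float, (B,1))
    states, actions, rewards, next_states, dones = batch

    # Double DQN: the online network picks the best next action,
    # the target network evaluates it.
    best_actions = qnetwork_local(next_states).detach().argmax(dim=1, keepdim=True)
    q_targets_next = qnetwork_target(next_states).detach().gather(1, best_actions)
    q_targets = rewards + gamma * q_targets_next * (1 - dones)

    # Q-values the online network currently assigns to the taken actions.
    q_expected = qnetwork_local(states).gather(1, actions)

    loss = F.mse_loss(q_expected, q_targets)
    optimizer.zero_grad()
    loss.backward()
    # Gradient clipping keeps the size of each update step bounded.
    torch.nn.utils.clip_grad_norm_(qnetwork_local.parameters(), 1.0)
    optimizer.step()

    # Soft update of the fixed Q-target network:
    # target <- tau * local + (1 - tau) * target
    for target_param, local_param in zip(qnetwork_target.parameters(),
                                         qnetwork_local.parameters()):
        target_param.data.copy_(tau * local_param.data + (1.0 - tau) * target_param.data)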

Examples

Lunar_Lander_v2 (animation of the trained agent)

CartPole_v1 (animation of the trained agent)

Dependencies

  1. Create (and activate) a new environment with Python 3.6.

conda create --name env_name python=3.6
source activate env_name

  2. Install OpenAI Gym

git clone https://github.com/openai/gym.git
cd gym
pip install -e .
pip install -e '.[box2d]'
pip install -e '.[classic_control]'
sudo apt-get install ffmpeg

  3. Install source code dependencies

conda install -c rpi matplotlib
conda install -c pytorch pytorch
conda install -c anaconda numpy

Instructions

You can run the project via Extended_Deep_Q_Learning_for_Multilayer_Perceptron.ipynb or by running main.py from the console.

Open the console and run: python main.py -c "your_config_file".json

Optional arguments:

-h, --help

- show the help message

-c, --config

- config file name - the file must be available as .json in ./configs

Example: python main.py -c "Lunar_Lander_v2".json
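For orientation, the following is a minimal sketch of how such an entry point could parse the -c/--config flag and load the JSON file from ./configs. The argument handling mirrors the options listed above, but it is only an assumption; the actual main.py may differ in the details.

import argparse
import json
import os

# Hypothetical loader mirroring the documented CLI; not the repository's actual main.py.
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--config", default="Lunar_Lander_v2.json",
                    help="Config file name - file must be available as .json in ./configs")
args = parser.parse_args()

# Load the configuration dictionary from ./configs/<config file>
with open(os.path.join("configs", args.config), "r") as f:
    config = json.load(f)

print(config["general"]["env_name"])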

Config File Description

"general" :

"env_name" : "LunarLander-v2", # The gym environment name you want to run
"monitor_dir" : ["monitor"], # monitor file direction
"checkpoint_path": ["checkpoints"], # checkpoint file direction
"seed": 0, # random seed for numpy, gym and pytorch
"state_size" : 8, # number of states
"action_size" : 4, # number of actions
"average_score_for_solving" : 200.0 # border value for solving the task

"train" :

"nb_episodes": 2000, # max number of episodes
"episode_length": 1000, # max length of one episode
"batch_size" : 256, # memory batch size
"epsilon_high": 1.0, # epsilon start point
"epsilon_low": 0.01, # min epsilon value
"epsilon_decay": 0.995, # epsilon decay
"run_training" : true # do you want to train? Otherwise run a test session

"agent" :

"learning_rate": 0.0005, # model learning rate
"gamma" : 0.99, # reward weight
"tau" : 0.001, # soft update factor
"update_rate" : 4 # interval in which a learning step is done

"buffer" :

"size" : 100000 # experience replay buffer size

"model" :

"fc1_nodes" : 256, # number of fc1 output nodes
"fc2_adv" : 256, # number of fc2_adv output nodes
"fc2_val" : 128 # number of fc2_val output nodes
