The purpose of this repository is to implement the Deep Deterministic Policy Gradient (DDPG) algorithm in a distributed fashion, as proposed here.
I will start by evaluating the performance of DDPG on simple tasks, and then compare it with the performance obtained when the training process is distributed among several "workers".
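As a point of reference for what the algorithm does, DDPG maintains target networks that slowly track the online networks via a soft update, θ' ← τθ + (1−τ)θ'. The sketch below illustrates only this update rule with a hypothetical helper on plain Python lists; it is not code from this repository:

```python
# Illustrative sketch of DDPG's soft target-network update
# (hypothetical helper, not taken from this repository).
# After each gradient step, the target weights theta' track the
# online weights theta: theta' <- tau*theta + (1 - tau)*theta'.

def soft_update(target_weights, online_weights, tau=0.001):
    """Return the updated target weights as a new list."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_weights, online_weights)]

online = [1.0, 2.0]
target = [0.0, 0.0]
target = soft_update(target, online, tau=0.5)
# target is now [0.5, 1.0]
```

A small τ (e.g. 0.001 in the original DDPG paper) keeps the target networks nearly fixed between updates, which stabilizes the critic's TD targets.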
I evaluated the performance of the standard DDPG approach on the MountainCarContinuous task. The figure below shows the training curves until the problem is considered solved.
The results above were obtained with a single worker. To replicate them, run the following commands in two separate consoles:
```bash
# Parameter server
python ddpg.py --job_name="ps" --task_index=0

# First worker
python ddpg.py --job_name="worker" --task_index=0
```
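The two commands above assign each process a role through the `--job_name` and `--task_index` flags. A minimal sketch of how such flags could be parsed with `argparse` is shown below; the actual parsing inside `ddpg.py` may differ:

```python
import argparse

# Minimal sketch of how ddpg.py might interpret its role flags;
# the actual argument handling in the script may differ.
parser = argparse.ArgumentParser()
parser.add_argument("--job_name", choices=["ps", "worker"],
                    help="'ps' starts a parameter server; 'worker' trains")
parser.add_argument("--task_index", type=int, default=0,
                    help="index of this task within its job")

# e.g. the first worker from the commands above:
args = parser.parse_args(["--job_name=worker", "--task_index=0"])
```

With several workers, each one would be launched with the same `--job_name="worker"` but a different `--task_index`.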
To visualize the training process using TensorBoard:
```bash
tensorboard --logdir=results/tboard_ddpg/
```