This repository contains a Deep Deterministic Policy Gradients (DDPG) agent running in the Unity ML-Agents Tennis environment (https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Examples.md#tennis). It can be used both to train the agent and to evaluate the result of the training.
It is an extension of my previous project, Learning Continuous Control in Deep Reinforcement Learning, to try out "AlphaGo Zero"-style competitive training, in which two agents know nothing about the rules at the beginning and gradually gain experience and rewards by playing against each other.
The DDPG agent is implemented in Python 3 using PyTorch.
The provided model weights were trained on an AWS EC2 p2.xlarge GPU instance in about 1 hour 15 minutes.
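In this competitive setting, each agent runs its own DDPG learner and only ever sees its own observations and rewards, so all knowledge of the game emerges from mutual play. Below is a minimal sketch of what one training episode looks like, assuming DDPG-style agents exposing `act()` and `step()` methods and the `unityagents` environment API used in Tennis.ipynb; the class and variable names here are illustrative, not necessarily the notebook's.

```python
import numpy as np

def run_episode(env, brain_name, agents, max_t=1000):
    """One self-play episode: both agents act and learn from their own experience.

    `agents` is a list of two DDPG-style agents with act(state) and
    step(state, action, reward, next_state, done) methods (assumed interface).
    """
    env_info = env.reset(train_mode=True)[brain_name]
    states = env_info.vector_observations               # one row per agent
    scores = np.zeros(len(agents))
    for _ in range(max_t):
        actions = np.vstack([agent.act(s) for agent, s in zip(agents, states)])
        env_info = env.step(actions)[brain_name]         # send both agents' actions at once
        next_states = env_info.vector_observations
        rewards = env_info.rewards
        dones = env_info.local_done
        for i, agent in enumerate(agents):
            agent.step(states[i], actions[i], rewards[i], next_states[i], dones[i])
        scores += rewards
        states = next_states
        if np.any(dones):
            break
    return np.max(scores)                                # episode score = max over both agents
```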
The 3D environment contains two tennis agents that can move forward, move backward, or jump.
The goal is to bounce the ball over the net to the other side without letting it drop or go out of bounds.
The environment is considered solved when the average of the per-episode maximum score over the two agents reaches +0.5 over the last 100 episodes.
- A reward of +0.1 is provided each time an agent successfully bounces the ball to the other side.
- A reward of -0.1 is provided each time an agent lets the ball drop or hits it out of bounds.
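Given those rewards, the score of an episode is the maximum of the two agents' undiscounted returns, and the solve check is a running 100-episode average of that score. A minimal sketch of the check, assuming `episode_scores` is the list of per-episode max scores collected during training (the name is illustrative):

```python
from collections import deque
import numpy as np

scores_window = deque(maxlen=100)          # scores of the last 100 episodes
for episode, score in enumerate(episode_scores, start=1):
    scores_window.append(score)
    if len(scores_window) == 100 and np.mean(scores_window) >= 0.5:
        print(f"Solved in {episode} episodes, "
              f"100-episode average score: {np.mean(scores_window):.3f}")
        break
```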
Vector action space: (continuous) size of 2, corresponding to moving forward/backward and jumping.
The observation space is composed of 8 variables corresponding to the position and velocity of the ball and racket.
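If you want to verify these sizes yourself after downloading the environment (see the setup steps below), you can query them through the `unityagents` package installed with the project dependencies. The file name below assumes the macOS build and should be adjusted for your platform; the printed observation width may be larger than 8 if the environment stacks several consecutive frames.

```python
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Tennis.app")   # adjust path/name for your OS
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

env_info = env.reset(train_mode=True)[brain_name]
print("Number of agents:  ", len(env_info.agents))
print("Action size:       ", brain.vector_action_space_size)
print("Observation shape: ", env_info.vector_observations.shape)
env.close()
```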
- Clone this Git repository: https://github.com/kinwo/deeprl-tennis-competition
- Install Unity ML-Agents: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md
- Download the Unity ML environment from one of the links below based on your OS:
  - Linux: click here
  - Mac OSX: click here
  - Windows (32-bit): click here
  - Windows (64-bit): click here

  Then unzip the file and place it in the project folder.
- Create Conda Environment
Install conda from https://conda.io, then create a new Conda environment with Python 3.6:
conda create --name deeprl python=3.6
source activate deeprl
- Install Dependencies
cd python
pip install .
- Start Jupyter Notebook
jupyter notebook
To start training, simply open Tennis.ipynb in Jupyter Notebook and follow the instructions there.
Trained model weights are included so you can quickly run the agent and see the result in the Unity ML environment. Simply skip the training step and run the last step of Tennis.ipynb.
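For reference, evaluation boils down to loading the saved actor weights and acting greedily (without exploration noise). The sketch below assumes an `Agent` class in `ddpg_agent.py`, a checkpoint file named `checkpoint_actor.pth`, and an `act(state, add_noise=...)` signature; if the notebook uses different names, follow it instead.

```python
import numpy as np
import torch
from unityagents import UnityEnvironment
from ddpg_agent import Agent                     # assumed module/class name

env = UnityEnvironment(file_name="Tennis.app")   # adjust for your OS
brain_name = env.brain_names[0]
env_info = env.reset(train_mode=False)[brain_name]

state_size = env_info.vector_observations.shape[1]
agent = Agent(state_size=state_size, action_size=2, random_seed=0)  # assumed constructor
agent.actor_local.load_state_dict(torch.load("checkpoint_actor.pth"))

states = env_info.vector_observations
while True:
    actions = np.vstack([agent.act(s, add_noise=False) for s in states])
    env_info = env.step(actions)[brain_name]
    states = env_info.vector_observations
    if np.any(env_info.local_done):              # episode ends when the ball drops
        break
env.close()
```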