This project uses the Reacher environment.
In this environment, a double-jointed arm can move to target locations. A reward of +0.1 is provided for each step that the agent's hand is in the goal location. Thus, the goal of your agent is to maintain its position at the target location for as many time steps as possible.
The observation space consists of 33 variables corresponding to the position, rotation, velocity, and angular velocity of the arm. Each action is a vector of four numbers, corresponding to the torques applicable to the two joints. Every entry in the action vector is a number between -1 and 1.
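The action-space constraints above can be sketched in a few lines. This is a minimal, hypothetical illustration (not part of the project code): it samples an unbounded action proposal and clips each of the four entries into the valid `[-1, 1]` range, as an agent would before passing actions to the environment.

```python
import numpy as np

# The environment described above expects an action vector of 4 torques,
# each constrained to the range [-1, 1].
ACTION_SIZE = 4

def random_action(rng=np.random):
    """Sample an unbounded action proposal and clip it into [-1, 1]."""
    action = rng.standard_normal(ACTION_SIZE)  # unbounded Gaussian proposal
    return np.clip(action, -1.0, 1.0)          # enforce the valid action range

action = random_action()
print(action.shape)  # (4,)
```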
- Install Anaconda (see the official Anaconda installation instructions).
- Create (and activate) a new environment with Python 3.6.
  - Linux or Mac:

    ```bash
    conda create --name drl python=3.6
    source activate drl
    ```

  - Windows:

    ```bash
    conda create --name drl python=3.6
    activate drl
    ```
- Follow the instructions in the OpenAI Gym repository to perform a minimal install of OpenAI Gym.
- Clone the repository (if you haven't already!). Then, install several dependencies.

  ```bash
  git clone https://github.com/meiermark/rl-navigation.git
  cd rl-navigation/ml-agents
  pip install .
  ```
- Create an IPython kernel for the `drl` environment.

  ```bash
  python -m ipykernel install --user --name drl --display-name "drl"
  ```
- Before running code in the notebook, change the kernel to match the `drl` environment by using the `Kernel` drop-down menu.
- The first version contains a single agent. The corresponding documentation and code can be found in `single_agent`.
- The second version contains 20 agents. The corresponding documentation and code can be found in `multiple_agents`.

Please follow the README.md in `single_agent` and `multiple_agents`.
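To make the reward scheme concrete, here is a self-contained sketch of how an episodic score accumulates under the +0.1-per-step rule described at the top. `StubReacher` is a hypothetical stand-in for the real Unity Reacher environment, and the 1000-step episode length is an assumption made only for this illustration.

```python
# Hypothetical stub illustrating the reward scheme: the agent earns +0.1
# for every time step its hand is inside the goal region.
class StubReacher:
    def __init__(self, steps_in_goal, episode_length=1000):
        self.steps_in_goal = steps_in_goal      # steps the hand stays in goal
        self.episode_length = episode_length    # assumed episode length

    def run_episode(self):
        score = 0.0
        for t in range(self.episode_length):
            in_goal = t < self.steps_in_goal    # pretend the hand is in goal early on
            score += 0.1 if in_goal else 0.0    # +0.1 reward per in-goal step
        return score

env = StubReacher(steps_in_goal=300)
print(env.run_episode())  # ~30.0 (300 in-goal steps * 0.1 reward each)
```

An agent that keeps its hand in the goal for every step of the episode would collect the maximum score (here, 1000 steps × 0.1 = 100 under the assumed episode length).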