For this project, you will train an agent to navigate (and collect bananas!) in a large, square world.
- The state space has 37 dimensions.
- It contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction.
The action space has four discrete actions:
- 0: move forward
- 1: move backward
- 2: turn left
- 3: turn right
Rewards:
- +1 for collecting a yellow banana.
- -1 for collecting a blue banana.
Goal:
- To collect as many yellow bananas as possible while avoiding blue bananas.
- The environment is considered solved when the agent gets an average score of +13 over 100 consecutive episodes.
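For reference, the sketch below shows how a single episode in this environment is typically driven through the unityagents API with random actions; the executable name Banana.app is a placeholder for whichever build you download, and the snippet is illustrative rather than part of this repository.

```python
from unityagents import UnityEnvironment
import numpy as np

# Point file_name at the Banana executable you downloaded (placeholder name).
env = UnityEnvironment(file_name="Banana.app")
brain_name = env.brain_names[0]

env_info = env.reset(train_mode=False)[brain_name]
state = env_info.vector_observations[0]      # 37-dimensional state vector
score = 0

while True:
    action = np.random.randint(4)            # pick one of the 4 actions at random
    env_info = env.step(action)[brain_name]  # send the action to the environment
    next_state = env_info.vector_observations[0]
    reward = env_info.rewards[0]             # +1 yellow banana, -1 blue banana
    done = env_info.local_done[0]            # True when the episode ends
    score += reward
    state = next_state
    if done:
        break

print("Score:", score)
env.close()
```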
The repository contains the following files:
- network.py Contains a simple deep neural network.
- dueling_network.py Contains a network that implements the Dueling Network architecture (Wang et al., 2016); a sketch of the idea follows this list.
- dqn_agent.py Contains a Q-Network (DQN) agent.
- ddqn_agent.py Contains a double Q-Network (DDQN) agent; its target computation is sketched after this list.
- ddqn_prioritized_agent.py Contains a double Q-Network agent with prioritized experience replay.
- prioritized_replay_buffer.py Contains the prioritized experience replay buffer implementation.
- sum_tree.py Contains a more efficient priority-based sampling structure; the implementation references the one from Jaromir's blog post. A generic sketch is included after this list.
- Navigation.ipynb Contains the agent training code for the Unity Banana environment.
- Report.md Contains the description of the implementation details.
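As background for dueling_network.py, here is a minimal PyTorch sketch of the dueling idea: a shared trunk feeds separate value and advantage streams that are recombined as Q = V + (A - mean(A)). The layer sizes and names are illustrative assumptions, not necessarily what the repository uses.

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Illustrative dueling Q-network: shared trunk, then value and advantage heads."""

    def __init__(self, state_size=37, action_size=4, hidden=64):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_size, hidden), nn.ReLU())
        self.value = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                   nn.Linear(hidden, 1))
        self.advantage = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                       nn.Linear(hidden, action_size))

    def forward(self, state):
        x = self.feature(state)
        v = self.value(x)          # state value V(s), shape (batch, 1)
        a = self.advantage(x)      # advantages A(s, a), shape (batch, action_size)
        # Subtracting the mean advantage keeps the V/A decomposition identifiable.
        return v + a - a.mean(dim=1, keepdim=True)
```

The difference between dqn_agent.py and ddqn_agent.py is how the bootstrap target is built. The sketch below shows the double-DQN target, in which the online network selects the greedy next action and the target network evaluates it; the function and tensor names are assumptions for illustration.

```python
import torch

def ddqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double-DQN target: online net picks the action, target net evaluates it.

    rewards and dones are float tensors of shape (batch, 1).
    """
    with torch.no_grad():
        # Greedy action under the online network.
        best_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # Value of that action under the target network.
        next_q = target_net(next_states).gather(1, best_actions)
    # Zero out the bootstrap term for terminal transitions.
    return rewards + gamma * next_q * (1 - dones)
```

For sum_tree.py, the underlying idea is a binary tree whose leaves hold transition priorities and whose internal nodes hold the sum of their children, so sampling proportional to priority takes O(log N). Below is a generic sketch of that structure, not a copy of the repository's implementation.

```python
import numpy as np

class SumTree:
    """Array-backed binary tree; leaves hold priorities, parents hold sums."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = np.zeros(2 * capacity - 1)   # internal nodes followed by leaves
        self.write = 0                            # index of the next leaf to overwrite

    def add(self, priority):
        idx = self.write + self.capacity - 1      # position of the leaf in the array
        self.update(idx, priority)
        self.write = (self.write + 1) % self.capacity

    def update(self, idx, priority):
        change = priority - self.tree[idx]
        self.tree[idx] = priority
        while idx != 0:                           # propagate the change up to the root
            idx = (idx - 1) // 2
            self.tree[idx] += change

    def sample(self, value):
        """Walk down from the root; returns the leaf index for a value in [0, total)."""
        idx = 0
        while idx < self.capacity - 1:            # stop once we reach a leaf
            left = 2 * idx + 1
            if value <= self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx

    @property
    def total(self):
        return self.tree[0]                       # sum of all priorities
```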
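Prioritized replay then draws a uniform value in [0, total) and maps it to a leaf via sample(), so transitions with larger priorities are chosen more often while updates to any single priority stay logarithmic in the buffer size.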
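Report.md walks through how these pieces fit together; the sketches above are only meant as orientation before reading the source files themselves.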
- Install Anaconda (https://conda.io/docs/user-guide/install/index.html)
- Install dependencies by running:
pip install -r requirements.txt
- Download the environment from one of the links below. You need only select the environment that matches your operating system:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
- Place the file in the root folder, and unzip (or decompress) the file.
- Follow the steps in Navigation.ipynb to get started with training.
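Once the notebook is open, training usually follows the standard episodic loop sketched below. The Agent class name, its act/step methods, and the epsilon schedule are assumptions about this repository's code, so defer to Navigation.ipynb for the exact interface and hyperparameters.

```python
import numpy as np
from collections import deque
from unityagents import UnityEnvironment
from dqn_agent import Agent                  # class name assumed; check dqn_agent.py

env = UnityEnvironment(file_name="Banana.app")          # placeholder executable name
brain_name = env.brain_names[0]
agent = Agent(state_size=37, action_size=4, seed=0)     # constructor signature assumed

scores_window = deque(maxlen=100)            # last 100 episode scores
eps = 1.0                                    # epsilon-greedy exploration rate

for episode in range(1, 2001):
    env_info = env.reset(train_mode=True)[brain_name]
    state = env_info.vector_observations[0]
    score = 0
    while True:
        action = agent.act(state, eps)                       # epsilon-greedy action
        env_info = env.step(int(action))[brain_name]
        next_state = env_info.vector_observations[0]
        reward = env_info.rewards[0]
        done = env_info.local_done[0]
        agent.step(state, action, reward, next_state, done)  # store transition and learn
        state, score = next_state, score + reward
        if done:
            break
    scores_window.append(score)
    eps = max(0.01, 0.995 * eps)             # decay exploration over episodes
    if np.mean(scores_window) >= 13.0:       # solved: average score of +13 over 100 episodes
        print(f"Solved in {episode} episodes")
        break

env.close()
```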