Deep Reinforcement Learning Nanodegree Program

This repository contains most of my project submissions and exercise answers for the Deep Reinforcement Learning Nanodegree Program.

temporal_difference/ contains implementations of the SARSA, SARSAMAX (Q-learning) and Expected SARSA algorithms, used to solve Sutton's cliff-walking environment.
taxi/ solves the 'Taxi-v3' environment. The agent obtains a best average reward of 8.83, putting it 6th on the leaderboard.
lunar_lander/ solves the 'LunarLander-v2' environment in 600 episodes (6th on the leaderboard).
navigation/ is my submission for the Banana Unity ML environment (Udacity's modified version). The agent solves the environment in ~450 episodes.
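
For readers unfamiliar with the three algorithms in temporal_difference/, the sketch below shows how their update targets differ. It is an illustration only, not the code from this repository: the function names (`epsilon_greedy_probs`, `td_target`, `td_update`), the tabular `Q` array layout and the epsilon-greedy policy are all assumptions made for the example.

```python
import numpy as np


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy for a single state."""
    n_actions = len(q_row)
    probs = np.full(n_actions, epsilon / n_actions)
    probs[int(np.argmax(q_row))] += 1.0 - epsilon
    return probs


def td_target(method, Q, reward, next_state, next_action, gamma, epsilon):
    """Return the TD target for one transition (hypothetical helper).

    The three methods differ only in how they bootstrap from the next state.
    """
    q_next = Q[next_state]
    if method == "sarsa":
        # On-policy: value of the action actually taken next.
        bootstrap = q_next[next_action]
    elif method == "sarsamax":
        # Off-policy (Q-learning): value of the greedy action.
        bootstrap = np.max(q_next)
    elif method == "expected_sarsa":
        # Expectation of the next value under the epsilon-greedy policy.
        bootstrap = np.dot(epsilon_greedy_probs(q_next, epsilon), q_next)
    else:
        raise ValueError(f"unknown method: {method}")
    return reward + gamma * bootstrap


def td_update(Q, state, action, target, alpha):
    """Move Q[state, action] a fraction alpha of the way towards the target."""
    Q[state, action] += alpha * (target - Q[state, action])
```

All three share the same update step, so swapping algorithms only changes the `method` argument passed to `td_target`.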
