This is the code for "How to Win Slot Machines - Intro to Deep Learning #13" by Siraj Raval on YouTube

llSourcell/how_to_win_slot_machines

how_to_win_slot_machines

Coding Challenge - Due Date - Thursday, April 13th at 12 PM PST

The coding challenge for this video is to use multiple slot machines instead of one, so that state is taken into account: the agent has to learn which arm is best for each machine. See this article for more info on this. Bonus points are given for applying the code to a real-world use case. Doing this exercise will teach you more about how policy and value functions are related in reinforcement learning.
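One possible starting point for the challenge is sketched below in plain NumPy. It is not the notebook's code: the threshold table MACHINES, the function name train_contextual_bandit, the reward rule (+1 or -1 from a normal draw), and the hyperparameters are all assumptions made for illustration. The only change from a single-machine bandit is that the weights become a table with one row per machine, and the current machine (the state) selects which row is used.

```python
import numpy as np

# Hypothetical setup: one row per slot machine (the state), one column per arm.
# A lower threshold means the arm pays out more often.
MACHINES = np.array([
    [0.2, 0.0, -0.2, -5.0],   # best arm: index 3
    [0.1, -5.0, 1.0, 0.25],   # best arm: index 1
    [-5.0, 5.0, 5.0, 5.0],    # best arm: index 0
])


def train_contextual_bandit(machines, episodes=10000, lr=0.001, epsilon=0.1, seed=0):
    """Learn one weight per (machine, arm) pair with an epsilon-greedy policy."""
    rng = np.random.default_rng(seed)
    num_machines, num_arms = machines.shape
    weights = np.ones((num_machines, num_arms))

    for _ in range(episodes):
        # The state is simply which machine we are standing in front of.
        state = int(rng.integers(num_machines))

        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            action = int(rng.integers(num_arms))
        else:
            action = int(np.argmax(weights[state]))

        # Assumed reward rule: +1 if a standard normal draw beats the threshold.
        reward = 1.0 if rng.standard_normal() > machines[state, action] else -1.0

        # Gradient descent on the assumed loss -log(w[state, action]) * reward.
        weights[state, action] += lr * reward / weights[state, action]

    return weights
```

Calling train_contextual_bandit(MACHINES) returns the weight table; np.argmax(weights, axis=1) then gives the agent's preferred arm for each machine, which can be compared against the lowest threshold in each row.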

Overview

This is the code for this video on YouTube by Siraj Raval, part of the Udacity Intro to Deep Learning nanodegree. It implements a technique called policy gradients to solve the multi-armed bandit problem. The code uses a single slot machine with 4 arms and an epsilon-greedy policy to help select the best actions. The agent is a simple single-layer neural network built in TensorFlow.
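To make the description above concrete, here is a minimal sketch of a 4-armed bandit agent. It uses plain NumPy instead of TensorFlow so it runs without extra setup, and it should be read as an illustration rather than the notebook's implementation. The thresholds in ARM_THRESHOLDS, the names pull_arm and train_bandit, the loss form -log(w[a]) * reward, and the hyperparameters are all assumptions.

```python
import numpy as np

# Hypothetical arm thresholds: a lower value means the arm pays out more often.
ARM_THRESHOLDS = np.array([0.2, 0.0, -0.2, -5.0])


def pull_arm(threshold, rng):
    """Return +1 if a standard normal draw exceeds the threshold, else -1."""
    return 1.0 if rng.standard_normal() > threshold else -1.0


def train_bandit(thresholds, episodes=2000, lr=0.001, epsilon=0.1, seed=0):
    """Train one weight per arm (a single layer with no hidden units)."""
    rng = np.random.default_rng(seed)
    num_arms = len(thresholds)
    weights = np.ones(num_arms)
    pulls = np.zeros(num_arms, dtype=int)

    for _ in range(episodes):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            action = int(rng.integers(num_arms))
        else:
            action = int(np.argmax(weights))

        reward = pull_arm(thresholds[action], rng)
        pulls[action] += 1

        # Assumed policy-gradient loss: -log(w[action]) * reward.
        # Its gradient w.r.t. w[action] is -reward / w[action], so a
        # gradient-descent step adds lr * reward / w[action].
        weights[action] += lr * reward / weights[action]

    return weights, pulls
```

After weights, pulls = train_bandit(ARM_THRESHOLDS), np.argmax(weights) is the arm the agent currently prefers. With a small learning rate and few episodes the agent can still settle on a suboptimal arm, so it is worth trying several seeds and episode counts.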

Dependencies

Usage

Run jupyter notebook in a terminal from the main directory of this repository, and the code will open in your browser.

Install Jupyter here if you haven't already.

Credits

The credits for this code go to awjuliani. I've merely added a wrapper to get people started.
