The skeleton of this code is from Udacity. Their version uses Taxi-v2, but this version uses v3, since v2 is deprecated. (But I also did version 2 here.)
The environment is from here.
To run the simple demo on Linux or Mac with Docker installed, make taxi.sh executable and run it:
```bash
git clone https://github.com/andyharless/openai-gym-taxi-v3-udacity.git
cd openai-gym-taxi-v3-udacity
chmod u+x taxi.sh
./taxi.sh
```
It should produce a score (best average reward over 100 consecutive episodes) of 9.26. (The output.txt file shows a sample output.)
This version uses a variation on standard Q-learning. The policy is epsilon-greedy, but when a non-greedy action is taken, it is sampled not from a uniform distribution over actions but from a distribution that reflects two things:
- a preference for actions with higher Q values (i.e. "greedy but flexible")
- a preference for novel actions (those chosen less often recently in the current state)
The latter is tracked via a "path memory" table (the same shape as the Q table), which counts how often each action has been taken in each state. At the end of each episode, the accumulated path-memory counts decay geometrically, so choices from earlier episodes carry progressively less weight.
The sampling distribution for stochastic actions is the softmax of a linear combination of the Q values (with a positive coefficient) and the path memory values (with a negative coefficient).
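A minimal sketch of this action-selection rule is below. It assumes NumPy Q and path-memory tables indexed as [state, action]; the function and parameter names (choose_action, q_coef, novelty_coef, decay) are illustrative, not taken from the repo's actual code.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def choose_action(Q, path_memory, state, eps, q_coef=1.0, novelty_coef=1.0, rng=None):
    """Epsilon-greedy, except the exploratory action is sampled from a softmax
    over (q_coef * Q - novelty_coef * path_memory) instead of uniformly."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() > eps:
        return int(np.argmax(Q[state]))      # greedy action
    # Linear combination: higher Q pulls probability up ("greedy but flexible"),
    # higher path-memory count pushes it down (novelty preference)
    logits = q_coef * Q[state] - novelty_coef * path_memory[state]
    probs = softmax(logits)
    return int(rng.choice(len(probs), p=probs))

# Within an episode, after taking an action:
#   path_memory[state, action] += 1
# At the end of each episode, decay the accumulated counts geometrically:
#   path_memory *= decay    # e.g. decay = 0.9 (illustrative value)
```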
As of 2020-09-13, this solution is 1st on the Leaderboard for the v3 Taxi environment at OpenAI Gym (but I cheated by using a good seed).