Universal Value Function Approximators

This repository contains an implementation of Universal Value Function Approximators [1].

The code is tested on a four-room gridworld, as in the paper. Only the supervised learning setting and the two-stage architecture are implemented.
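
As a rough illustration of the two-stage idea, the sketch below factorizes a table of values (one row per state, one column per goal) into low-rank state and goal embeddings, then fits a regressor from input features to those embeddings. It is not the code in this repository: the toy value table, the use of truncated SVD for stage one, and the linear least-squares fit on one-hot features for stage two are all assumptions chosen to keep the example short, and the names `factorize` and `fit_linear` are hypothetical.

```python
import numpy as np


def factorize(values, rank):
    """Stage one: approximate values by phi @ psi.T using a truncated SVD."""
    u, s, vt = np.linalg.svd(values, full_matrices=False)
    root = np.sqrt(s[:rank])
    phi = u[:, :rank] * root      # one embedding row per state
    psi = vt[:rank, :].T * root   # one embedding row per goal
    return phi, psi


def fit_linear(features, targets):
    """Stage two stand-in: least-squares map from features to embeddings."""
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights


# Toy value table (assumption): a 1D chain of 10 states where the value of
# state s for goal g is gamma ** |s - g|.
n_states = 10
idx = np.arange(n_states)
values = 0.9 ** np.abs(idx[:, None] - idx[None, :])

# Stage one: target embeddings for every state and every goal.
phi, psi = factorize(values, rank=3)

# Stage two: regress from one-hot state and goal features to the targets.
one_hot = np.eye(n_states)
state_weights = fit_linear(one_hot, phi)
goal_weights = fit_linear(one_hot, psi)

# Predicted values are dot products of the two predicted embeddings.
predicted = (one_hot @ state_weights) @ (one_hot @ goal_weights).T
error = np.linalg.norm(values - predicted)
```

In practice the second stage would use function approximators that generalize across states and goals (for example small neural networks) rather than a one-hot linear fit, which simply memorizes the embeddings.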

Open the notebook (ufva.ipynb) for an explanation and results.

To generate the dataset for learning the value function in a supervised way, run:

python lib/data.py [output]
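
For readers who want a feel for what such a dataset contains, here is a minimal, self-contained sketch. It is not the contents of lib/data.py: the 9x9 layout, the choice of value `gamma ** distance` (the optimal discounted value if the only reward is 1 on reaching the goal), and the function names are assumptions made for illustration.

```python
from collections import deque

# Hypothetical four-room layout: '#' is a wall, '.' is a free cell.
LAYOUT = [
    "#########",
    "#...#...#",
    "#.......#",
    "#...#...#",
    "##.###.##",
    "#...#...#",
    "#.......#",
    "#...#...#",
    "#########",
]


def free_cells(layout):
    """Return all (row, col) positions that are not walls."""
    return [
        (r, c)
        for r, row in enumerate(layout)
        for c, ch in enumerate(row)
        if ch == "."
    ]


def distances_from(goal, layout):
    """Breadth-first search: shortest path length from every cell to goal."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            # The outer wall keeps every neighbour index inside the grid.
            if layout[nxt[0]][nxt[1]] == "." and nxt not in dist:
                dist[nxt] = dist[(r, c)] + 1
                queue.append(nxt)
    return dist


def build_dataset(layout, gamma=0.9):
    """Return (state, goal, value) triples for every pair of free cells."""
    cells = free_cells(layout)
    rows = []
    for goal in cells:
        dist = distances_from(goal, layout)
        for state in cells:
            rows.append((state, goal, gamma ** dist[state]))
    return rows


dataset = build_dataset(LAYOUT)
```

Each triple is one supervised training example: the inputs are a state and a goal, and the regression target is the value of that state under that goal. Writing the result to disk (for instance as a NumPy array) is left out of the sketch.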
