rl-mdatos

This repository contains my final project for the Data Mining course (Minería de Datos in Spanish, hence mdatos), taught in the Master's Degree in Systems and Control Engineering at UNED (Universidad Nacional de Educación a Distancia) and UCM (Universidad Complutense de Madrid), in Spain.

It implements several tabular Reinforcement Learning algorithms and applies them to OpenAI Gym environments. The following table shows which algorithms are implemented for each environment:

Environment	Sarsa	Q-Learning	n-step Sarsa	Dyna-Q
NChain-v0	✔️	✔️	✔️	✔️
FrozenLake-v0	✔️	✔️	✔️	✔️
CartPole-v0	✔️	✔️	✔️	✖️
MountainCar-v0	✔️	✔️	✔️	✖️
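
As a rough illustration of what "tabular" means here, the sketch below shows the standard one-step update rules for Q-Learning and Sarsa on a table indexed by (state, action). It is not taken from this repo: the function names, the defaultdict-based table and the default values of alpha and gamma are assumptions made for the example.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, n_actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q[(state, a)] for a in range(n_actions)]
    return values.index(max(values))


def q_learning_update(q, s, a, r, s_next, done, n_actions, alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = 0.0 if done else max(q[(s_next, b)] for b in range(n_actions))
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def sarsa_update(q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually chosen next."""
    next_value = 0.0 if done else q[(s_next, a_next)]
    q[(s, a)] += alpha * (r + gamma * next_value - q[(s, a)])


# Hypothetical table: unseen (state, action) pairs default to 0.0.
q_table = defaultdict(float)
q_table[(1, 0)] = 1.0
q_table[(1, 1)] = 2.0
q_learning_update(q_table, s=0, a=0, r=1.0, s_next=1, done=False, n_actions=2)
```

The two rules differ only in the bootstrap target: Q-Learning uses the greedy value of the next state, while Sarsa uses the value of the action that was actually selected.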

The goal of this repo is purely educational:

For more elaborate and complex RL algorithms, see cleanrl.
For an intuitive, easy-to-use library widely used in research, see stable-baselines3 and rl-baselines3-zoo.

A Jupyter Notebook, written in Spanish, that uses this repo to explain basic RL concepts can be found here.

The bibliography I used is probably the most common entry point if you want to learn Reinforcement Learning.

How to use this repo

In order to train and evaluate the agents in this repo, follow these steps:

Create and activate a virtual environment:

$ cd rl-mdatos
$ virtualenv .venv
$ source .venv/bin/activate

Install the required packages:

(.venv) $ pip install -r requirements.txt

Install this repo in editable mode:

(.venv) $ pip install -e .

Go to the directory of the desired environment (desired_env below stands for that directory). Each environment has scripts to train, run and/or record a specific algorithm:

(.venv) $ cd rl_mdatos/envs/desired_env

To train a Q-Learning agent in CartPole-v0:

(.venv) $ python cp_q_learning.py --train

To run the trained agent:

(.venv) $ python cp_q_learning.py --run

To record the run (this only works for CartPole-v0 and MountainCar-v0):

(.venv) $ python cp_q_learning.py --run --record
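
The three flags used above could be exposed with argparse along the following lines. This is only a sketch of one possible command-line interface, not the actual code of cp_q_learning.py: the helper names, the help texts and the rule that --record requires --run are assumptions based on the commands shown.

```python
import argparse


def build_parser():
    """Build a parser with the three flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Train or run a tabular RL agent.")
    parser.add_argument("--train", action="store_true", help="train the agent")
    parser.add_argument("--run", action="store_true", help="run a trained agent")
    parser.add_argument("--record", action="store_true", help="record the run")
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and check flag combinations."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: recording only makes sense while running an agent.
    if args.record and not args.run:
        parser.error("--record requires --run")
    return args


# Equivalent to calling the script with: --run --record
args = parse_args(["--run", "--record"])
print(args.train, args.run, args.record)
```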

Three types of files are stored in rl-mdatos/data:

logs: data generated during training, which can be visualized with TensorBoard (tensorboard --logdir data/...).
trained_agents: files with the final parameters of the trained agents, which are loaded at execution time.
videos: videos of the recorded episodes.

Output

After successfully training the agents, you should get results similar to the following.

NChain-v0

INFO:root:Running Q-Learning agent
INFO:root:Episode 1
INFO:root:Total reward: 9960
INFO:root:Mean reward: 9.96
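
The two figures in this log are consistent with a mean taken over the steps of the episode: 9960 / 1000 = 9.96. The sketch below shows how a summary in this format could be produced with Python's logging module, whose default basicConfig format yields the INFO:root: prefix. The summarize_episode helper and the reward values passed to it are hypothetical and not taken from this repo.

```python
import logging

# The default basicConfig format is "LEVEL:logger_name:message".
logging.basicConfig(level=logging.INFO)


def summarize_episode(episode, rewards):
    """Log the total and the per-step mean of a list of rewards."""
    total = sum(rewards)
    mean = total / len(rewards)
    logging.info("Episode %d", episode)
    logging.info("Total reward: %d", total)
    logging.info("Mean reward: %.2f", mean)
    return total, mean


# Made-up per-step rewards, only to exercise the helper.
total, mean = summarize_episode(1, [10, 10, 2, 0])
```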

FrozenLake-v0

CartPole-v0

MountainCar-v0

Bibliography

[1] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.

[2] David Silver. Lectures on Reinforcement Learning. URL: https://www.davidsilver.uk/teaching/. 2015.

[3] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach, Third International Edition. Pearson Education, London, 2010.
