Accelerated-TD

My implementation of the Accelerated Gradient Temporal Difference Learning algorithm (ATD) in Python.

Introduction

Agents

PlainATDAgent updates the matrix directly, while SVDATDAgent and DiagonalizedSVDATDAgent update its singular value decomposition instead, which is expected to have lower computational complexity. The difference between the two SVD agents is the update method: SVDATDAgent uses the one described in Brand 2006, while DiagonalizedSVDATDAgent uses the one described in Gahring 2015, which diagonalizes the decomposition so that the pseudo-inverse of the matrix is easier to compute, though I still can't completely figure out how it works.
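To illustrate why a diagonal middle factor makes the pseudo-inverse cheap, here is a minimal NumPy sketch. It is not the code from atd.py: the helper name `pinv_from_svd` and the `rel_tol` cutoff are assumptions made for this example. Given A = U diag(s) Vᵀ, the pseudo-inverse is V diag(1/s) Uᵀ, so only the retained singular values need to be inverted.

```python
import numpy as np


def pinv_from_svd(U, s, Vt, rel_tol=1e-10):
    """Pseudo-inverse of A = U @ diag(s) @ Vt (hypothetical helper).

    Singular values below rel_tol * max(s) are treated as zero and dropped.
    """
    keep = s > rel_tol * s.max()
    U_k, s_k, Vt_k = U[:, keep], s[keep], Vt[keep, :]
    # Dividing the columns of V by s_k is the same as V @ diag(1 / s_k).
    return (Vt_k.T / s_k) @ U_k.T


# A rank-3 matrix of shape 5x5, so a plain inverse does not exist.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = pinv_from_svd(U, s, Vt)
```

Because the middle factor is diagonal, inverting it costs one division per retained singular value rather than a full matrix inversion.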

I also implemented a conventional Gradient Temporal Difference agent called TDAgent. I tested them in several environments as introduced below.

Backend Support

I provide backend support for PyTorch (CPU) so that you can skip the conversion between numpy.ndarray and torch.Tensor. To select a backend, add this code before importing the atd module:

import os
os.environ["ATD_BACKEND"] = "NumPy"  # or "PyTorch"

To test it yourself, just clone the repository and run python algorithm_test/<random_walk or boyans_chain>.py. :)

Requirements

Python>=3.9
NumPy>=1.19
Torch>=1.10 if you want to use PyTorch as backend
Matplotlib>=3.3.3 if you want to run my test script
Tqdm if you want to run my test script

Tests

Random Walk

This environment is from Sutton's book.

The code file is this and the result is here:

Boyan's Chain

The environment was proposed in Boyan 1999.

The code file is this and the result is here:

Usage

To import my implementation of the algorithm into your project, follow these steps if you aren't familiar with the process.

Clone the repository and copy the atd.py to where you want. If you downloaded a .zip file from GitHub, remember to unzip it.

Add this line to the top of your Python script:

from atd import TDAgent, SVDATDAgent, DiagonalizedSVDATDAgent, PlainATDAgent  # or any agent you want

If atd.py is not in the same directory as your main Python file, use the following snippet instead of the import line above. It appends the directory to sys.path, the module search path, so that the Python interpreter can find the module. Alternatively, you can use importlib, which is available in newer Python versions.
```
import sys

sys.path.append("<The directory where you placed atd.py>")
from atd import TDAgent, SVDATDAgent, DiagonalizedSVDATDAgent, PlainATDAgent  # or any agent you want
```
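The importlib alternative mentioned above can be sketched as follows. This is a minimal example using only the standard library; the helper name `load_module_from_path` is made up for illustration and is not part of this repository.

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_path(module_name, file_path):
    """Load a Python source file as a module without modifying sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, Path(file_path))
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {module_name!r} from {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module  # register so later imports reuse it
    spec.loader.exec_module(module)    # actually run the file's code
    return module


# Usage (the path is a placeholder, as in the snippet above):
# atd = load_module_from_path("atd", "<The directory where you placed atd.py>/atd.py")
# agent_class = atd.TDAgent
```

Compared with appending to sys.path, this loads exactly one file and does not change how other imports in your project are resolved.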

Initialize an agent like this and you are ready to use it!

agent = TDAgent(lr=0.01, lambd=0, observation_space_n=4, action_space_n=2)
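For readers unfamiliar with the lr and lambd arguments, the sketch below shows one step of textbook linear TD(λ) with accumulating eligibility traces. It assumes that TDAgent follows this standard scheme, which this README does not confirm. The function `td_lambda_step` and the `gamma` discount argument are hypothetical names for this example and are not the API of atd.py.

```python
import numpy as np


def td_lambda_step(w, z, x, reward, x_next, lr=0.01, lambd=0.0, gamma=1.0):
    """One linear TD(lambda) update (textbook form, not the atd.py code).

    w: weight vector, z: eligibility trace,
    x / x_next: feature vectors of the current and next observation.
    """
    z = gamma * lambd * z + x                        # decay trace, add current features
    delta = reward + gamma * (w @ x_next) - (w @ x)  # TD error
    w = w + lr * delta * z                           # move weights along the trace
    return w, z


# Example with 4 features, mirroring observation_space_n=4 above.
w = np.zeros(4)
z = np.zeros(4)
x = np.array([1.0, 0.0, 0.0, 0.0])
x_next = np.array([0.0, 1.0, 0.0, 0.0])
w, z = td_lambda_step(w, z, x, reward=1.0, x_next=x_next, lr=0.1, lambd=0.0)
```

With lambd=0 the trace equals the current feature vector, so only the weights of the active features change; larger values of lambd spread each TD error over recently visited features.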

Reference: Gahring 2016

Pull requests are welcome, and I look forward to your issues.
