My implementation of the Accelerated Gradient Temporal Difference Learning algorithm (ATD) in Python.
PlainATDAgent updates the matrix directly, while SVDATDAgent and DiagonalizedSVDATDAgent each update its singular value decomposition instead, which is expected to have lower complexity. The two SVD agents differ in how they update the decomposition: SVDATDAgent uses the method from Brand 2006, while DiagonalizedSVDATDAgent uses the method from Gahring 2015, which diagonalizes the decomposition so that the pseudo-inverse of the matrix is easier to calculate, though I still can't completely figure out how it works.
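To illustrate why keeping the decomposition around makes the pseudo-inverse cheap, here is a minimal NumPy sketch. It is not code from atd.py: the helper name `pinv_from_svd` and the `rcond` cutoff are assumptions made for this example, and it recomputes a full SVD rather than updating one incrementally as the agents do.

```python
import numpy as np

def pinv_from_svd(u, s, vt, rcond=1e-10):
    """Pseudo-inverse of A = u @ diag(s) @ vt, built from its SVD factors."""
    # Invert only the singular values that are safely above zero.
    cutoff = rcond * s.max()
    s_inv = np.zeros_like(s)
    mask = s > cutoff
    s_inv[mask] = 1.0 / s[mask]
    # A^+ = V diag(1/s) U^T; scaling the columns of V avoids forming diag(1/s).
    return (vt.T * s_inv) @ u.T

# Hypothetical rank-3 matrix standing in for the matrix the agents maintain.
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 6))

u, s, vt = np.linalg.svd(a, full_matrices=False)
a_pinv = pinv_from_svd(u, s, vt)
```

Once the factors are available, the only inversion left is an element-wise reciprocal of the singular values.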
I also implemented a conventional Gradient Temporal Difference agent called TDAgent. I tested them in several environments, as introduced below.
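For readers new to the baseline, the following is a rough sketch of conventional TD(λ) with accumulating eligibility traces on a 5-state random walk. It is not the repository's TDAgent: the function name `td_lambda_random_walk`, the one-hot (tabular) features, and the 5-state layout are all assumptions chosen to keep the example short.

```python
import numpy as np

def td_lambda_random_walk(episodes=2000, lr=0.05, lambd=0.0, gamma=1.0, seed=0):
    """Tabular TD(lambda) on a 5-state random walk; returns the learned values."""
    rng = np.random.default_rng(seed)
    n_states = 5
    w = np.full(n_states, 0.5)      # one weight per state (one-hot features)
    for _ in range(episodes):
        z = np.zeros(n_states)      # eligibility trace, reset every episode
        s = n_states // 2           # each episode starts in the middle state
        while True:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            done = s_next < 0 or s_next >= n_states
            # Reward 1 only when leaving through the right end.
            reward = 1.0 if s_next >= n_states else 0.0
            v_next = 0.0 if done else w[s_next]
            delta = reward + gamma * v_next - w[s]   # TD error
            z *= gamma * lambd
            z[s] += 1.0                              # accumulating trace
            w += lr * delta * z
            if done:
                break
            s = s_next
    return w

values = td_lambda_random_walk()
```

With `gamma=1`, the true values of this walk are 1/6, 2/6, ..., 5/6 from left to right, so the learned values should increase toward the right end.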
I provide backend support for PyTorch (CPU) so that you can skip converting between numpy.ndarray and torch.Tensor. To select a backend, add this code before importing the atd module:
import os
os.environ["ATD_BACKEND"] = "NumPy" # or "PyTorch"
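If your script should work whether or not PyTorch is installed, one option is to pick the backend at runtime. This is only a sketch: the helper `choose_backend` is hypothetical and not part of atd.py; only the `ATD_BACKEND` variable and its two values come from the snippet above.

```python
import importlib.util
import os

def choose_backend(preferred="PyTorch"):
    """Return "PyTorch" only if torch can be imported, otherwise "NumPy"."""
    if preferred == "PyTorch" and importlib.util.find_spec("torch") is not None:
        return "PyTorch"
    return "NumPy"

# This still has to run before the atd module is imported.
os.environ["ATD_BACKEND"] = choose_backend()
```

`importlib.util.find_spec` checks for the package without importing it, so the check itself stays cheap.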
To test it yourself, just clone the repository and run python algorithm_test/<random_walk or boyans_chain>.py. :)
- Python>=3.9
- NumPy>=1.19
- Torch>=1.10 if you want to use PyTorch as backend
- Matplotlib>=3.3.3 if you want to run my test script
- Tqdm if you want to run my test script
This environment is from Sutton's book.
The code file is this and the result is here:
The environment was proposed in Boyan 1999.
The code file is this and the result is here:
To import my implementation of the algorithm into your project, follow these steps if you aren't familiar with the process.
- Clone the repository and copy atd.py to where you want. If you downloaded a .zip file from GitHub, remember to unzip it.
- Add this code to the top of your Python script:
from atd import TDAgent, SVDATDAgent, DiagonalizedSVDATDAgent, PlainATDAgent # or any agent you want
- If atd.py is not in the same directory as your main Python file, use this code snippet instead of the one in Step 2. It appends the directory to the module search path so that the Python interpreter can find the module. Alternatively, you can use importlib, which is provided by later Python versions.
import sys
sys.path.append("<The directory where you placed atd.py>")
from atd import TDAgent, SVDATDAgent, DiagonalizedSVDATDAgent, PlainATDAgent # or any agent you want
- Initialize an agent like this and you are ready to use it!
agent = TDAgent(lr=0.01, lambd=0, observation_space_n=4, action_space_n=2)
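As an alternative to editing sys.path in the third step, a module can be loaded straight from its file path with importlib. The sketch below uses a temporary stand-in file so that it runs on its own; `load_module_from_path` and `demo_module` are hypothetical names made up for this example.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

def load_module_from_path(name, path):
    """Load a Python source file as a module without touching sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module       # register it so later imports reuse it
    spec.loader.exec_module(module)  # actually run the file's code
    return module

# Demo with a stand-in file; in practice, point this at your copy of atd.py.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_module.py"
    demo_path.write_text("ANSWER = 42\n")
    demo = load_module_from_path("demo_module", demo_path)
```

For this project, the call would look like `atd = load_module_from_path("atd", "<The directory where you placed atd.py>/atd.py")`, after which the agents are available as attributes such as `atd.TDAgent`.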
Reference: Gahring 2016
Please feel free to open a Pull Request, and I look forward to your Issues.