Gandharv Patil - 260727335
Rehan Seraj - 260752605
This script reproduces Examples 8.2 and 8.3 of Sutton and Barto, with the environments built using pycolab.
We modified the pycolab module so that the total time step count can be passed to the environment as an argument.
To run the code, clone the GitHub repo and run DynaQ.ipynb.
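
For orientation, the tabular Dyna-Q update that the notebook name refers to can be sketched roughly as below. This is a minimal illustration and not the code in DynaQ.ipynb: the function name dyna_q_step, its parameters, and the dictionary-based Q table and model are all assumptions made for this sketch, and it assumes a deterministic environment whose terminal states are never updated (so their values stay at zero).

```python
import random
from collections import defaultdict


def dyna_q_step(Q, model, s, a, r, s_next, actions,
                alpha=0.1, gamma=0.95, n_planning=5, rng=random):
    """One real-experience Q-learning update followed by n_planning
    simulated updates drawn from a learned deterministic model.

    Q      : mapping (state, action) -> value, e.g. defaultdict(float)
    model  : dict mapping (state, action) -> (reward, next_state)
    All names here are hypothetical and chosen for illustration only.
    """

    def q_update(state, action, reward, next_state):
        # One-step Q-learning backup toward reward + discounted best next value.
        best_next = max(Q[(next_state, b)] for b in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

    # Direct RL: learn from the transition actually observed.
    q_update(s, a, r, s_next)

    # Model learning: remember what this state-action pair led to.
    model[(s, a)] = (r, s_next)

    # Planning: replay randomly chosen remembered transitions.
    for _ in range(n_planning):
        ps, pa = rng.choice(list(model.keys()))
        pr, ps_next = model[(ps, pa)]
        q_update(ps, pa, pr, ps_next)


# Usage: a single transition from state 0 to state 1 with reward 1.
Q = defaultdict(float)
model = {}
dyna_q_step(Q, model, 0, "right", 1.0, 1, ["left", "right"],
            alpha=0.5, gamma=0.9, n_planning=1)
```

With n_planning set to 0 the step reduces to plain one-step Q-learning; larger values reuse each stored transition more often between real environment steps.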