In this paper we redefine MAXQ and the taxi environment and implement them in R. We then apply Q-learning to the same problem. Our conclusion is that MAXQ performs as well as Q-learning on this problem. Our aim is to illustrate the advantages of hierarchical reinforcement learning methods.
The paper can be found above as Exploration_and_Implementation_of_the_MAXQ_andQlearning_algorithms_for_the_taxi_problem.pdf or accessed through ResearchGate.
The code structure can be viewed here.