Project for the Reinforcement Learning course by A. Lazaric at MVA
The objective of the project is to provide a thorough comparison of best-arm identification algorithms in the fixed-budget and fixed-confidence settings. Besides reviewing the current literature, the student is expected to produce Matlab code that makes it easy to implement and compare additional algorithms. An example of code that could serve as a basis for this project is available at http://mloss.org/software/view/415.
The code is written in Matlab, and its structure follows that of the maBandits toolbox, available at http://mloss.org/software/view/415.