This repository contains the write-up, including all the code, as a Jupyter Notebook, of a recent project that explored auctions using Multi-Agent Reinforcement Learning. The project is co-authored by Edward Plumb, a PhD student in the Mathematics department at the LSE and a game theory expert, whom I fully credit with the mathematical intuition underlying the project.
The document is long and extensive. Broadly, it compares two types of auction, the ascending (English) auction and the descending (Dutch) auction, and two classes of reinforcement learning algorithm. We use on-policy policy gradient methods, both with neural networks for function approximation and in vanilla form. We then also look at an off-policy Q-learning approach. We find that policy gradients seem to perform better; some of our hypotheses about why this might be the case are discussed more extensively in the document.
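
To make the distinction between the two algorithm classes concrete, here is a minimal sketch of one update step for each, in tabular form. It is not taken from the notebook: the function names, the list-of-lists tables `q` and `theta`, and the default values of `alpha` and `gamma` are all assumptions chosen for illustration, and the notebook's own implementation may differ.

```python
import math


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Off-policy tabular Q-learning step (illustrative, hypothetical signature).

    The target bootstraps from the greedy value of next_state, regardless of
    which action the agent actually takes there.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(theta, state, action, ret, alpha=0.1):
    """On-policy vanilla policy gradient (REINFORCE) step for a softmax policy.

    For a tabular softmax policy, the gradient of log pi(action | state) with
    respect to theta[state][a] is (1 if a == action else 0) - pi(a | state).
    `ret` is the return observed after taking `action` under the current policy.
    """
    probs = softmax(theta[state])
    for a in range(len(probs)):
        grad_log = (1.0 if a == action else 0.0) - probs[a]
        theta[state][a] += alpha * ret * grad_log


# Toy usage: two states, two actions, all values starting at zero.
q = [[0.0, 0.0], [0.0, 0.0]]
q_learning_update(q, state=0, action=1, reward=1.0, next_state=1)

theta = [[0.0, 0.0]]
reinforce_update(theta, state=0, action=1, ret=1.0)
```

The Q-learning step moves a value estimate toward a bootstrapped target, while the policy gradient step directly raises the preference for an action in proportion to the return it produced. A neural-network variant would replace the tables with a function approximator.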