This repository implements the Proximal Policy Option-Critic algorithm. It is built on OpenAI Baselines. To run the code, first install Baselines directly from its repository (commit id d8cce2309f3765bf55c46e4ffe4722406f412275). After that, replace the relevant files in the ppo1 folder on your machine with the ones in this repository.
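The file replacement step can be scripted. The following is a minimal sketch, not part of this repository: the helper name overlay_files, the choice to copy every *.py file, and the .orig backup convention are all assumptions made for illustration. Adjust the pattern if only a subset of files should be replaced.

```python
import shutil
from pathlib import Path


def overlay_files(src_dir, dst_dir, pattern="*.py"):
    """Copy files matching `pattern` from src_dir into dst_dir.

    Hypothetical helper: any file that would be overwritten is first
    saved as <name>.orig, unless such a backup already exists.
    Returns the sorted list of file names that were copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    if not dst.is_dir():
        raise FileNotFoundError(f"target folder not found: {dst}")
    copied = []
    for path in sorted(src.glob(pattern)):
        target = dst / path.name
        backup = dst / (path.name + ".orig")
        # Keep the original Baselines file so the change can be undone.
        if target.exists() and not backup.exists():
            shutil.copy2(target, backup)
        shutil.copy2(path, target)
        copied.append(path.name)
    return copied
```

For example, overlay_files(".", "/path/to/baselines/baselines/ppo1") would copy the Python files from the current directory into the ppo1 folder, where the target path is a placeholder for wherever Baselines is installed on your machine.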
To train a model and save the results:
python run_mujoco.py --saves --wsaves --opt 2 --env Walker2d-v1 --seed 777 --app savename --dc 0.1
where the most important parameter is "dc", the deliberation cost.
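Since the deliberation cost is the parameter most worth varying, it can be convenient to generate one command per value. The sketch below only uses the flags shown in the command above. The helper name build_command, the list of dc values, and the save names are hypothetical and chosen purely for illustration.

```python
import shlex


def build_command(dc, seed=777, env="Walker2d-v1", opt=2, app="savename"):
    """Return the training command as an argument list for a given dc."""
    return [
        "python", "run_mujoco.py",
        "--saves", "--wsaves",
        "--opt", str(opt),
        "--env", env,
        "--seed", str(seed),
        "--app", app,
        "--dc", str(dc),
    ]


if __name__ == "__main__":
    # Illustrative deliberation costs, not values recommended by the authors.
    for dc in [0.01, 0.1, 0.5]:
        cmd = build_command(dc, app=f"sweep_dc{dc}")
        # Print each command; pass `cmd` to subprocess.run to launch it.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Giving each run its own save name keeps the results of different deliberation costs from overwriting one another.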
It is possible to view some of our results in this video.