Overview

Reinforcement learning agents for playing the Flappy Bird game.

PPO (Proximal Policy Optimization)

  • Deep reinforcement learning using a PPO model written in TensorFlow.
  • The Flappy Bird Gym environment was mastered after 9 hours of training on an 8-core CPU and a GPU. Training and evaluation code is available in dppo.ipynb.
  • The trained agent scored 264.0 in every one of 200 consecutive trials.
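
For readers unfamiliar with PPO, the core idea is a clipped surrogate objective that limits how far each update can move the policy away from the one that collected the data. The following is a minimal NumPy sketch of that loss, not the TensorFlow code from dppo.ipynb; the function name `ppo_clip_loss`, its argument names, and the default `clip_eps=0.2` are illustrative assumptions.

```python
import numpy as np


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate loss (to be minimized), as a hypothetical helper.

    new_logp:   log-probabilities of the taken actions under the current policy
    old_logp:   log-probabilities of the same actions under the data-collecting policy
    advantages: advantage estimates for those actions
    clip_eps:   clipping range (0.2 is an assumed default, not the notebook's value)
    """
    # Probability ratio between the current and the old policy.
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    # Clipping removes the incentive to push the ratio outside [1 - eps, 1 + eps].
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective, then negate to obtain a loss.
    return -np.mean(np.minimum(unclipped, clipped))


if __name__ == "__main__":
    old = np.array([0.0, 0.0])
    new = np.log(np.array([2.0, 2.0]))  # the policy ratio is 2 for both samples
    adv = np.array([1.0, -1.0])
    print(ppo_clip_loss(new, old, adv))
```

In the usage example, the sample with a positive advantage is clipped at a ratio of 1.2, while the sample with a negative advantage keeps its full unclipped penalty, which is the asymmetry that makes the objective conservative.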

Requirements