heerad/gymtime

gymtime

Deep RL algorithms for OpenAI gym's environments

See Heerad's submissions here

Implemented:

  • Actor-critic with per-step updates using eligibility traces
  • Deep Q-learning (DQN) with experience replay to improve sample efficiency
  • DDPG for continuous action spaces
  • UCB exploration based on Hoeffding's inequality as an alternative to epsilon-greedy exploration for DQN
  • Double Q-learning to reduce the maximization bias that arises when Q-learning takes a max over noisy, approximated value estimates
  • Prioritized experience replay for DQN
  • Slowly-updating target network (used in computing TD error) for stability
  • Norm clipping for stability
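
A few of the items above can be illustrated with a minimal sketch. This is not the repository's code: every function name and parameter here (`ucb_action`, `double_q_target`, `soft_update`, `clip_by_norm`, `c`, `tau`) is hypothetical, plain Python lists stand in for network outputs and parameters, and the UCB bonus assumes simple per-action visit counts with a UCB1-style Hoeffding term.

```python
import math


def ucb_action(q_values, counts, t, c=1.0):
    """Pick the action with the highest Q-value plus exploration bonus.

    Assumed bonus: c * sqrt(2 * ln(t) / n), the UCB1 form derived from
    Hoeffding's inequality, where n is the action's visit count and t is
    the total number of steps taken so far.
    """
    # Try every action once before relying on the bonus term.
    for action, n in enumerate(counts):
        if n == 0:
            return action
    scores = [
        q + c * math.sqrt(2.0 * math.log(t) / n)
        for q, n in zip(q_values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def double_q_target(reward, done, gamma, next_q_online, next_q_target):
    """Double Q-learning target: the online estimates choose the next
    action, the target estimates evaluate it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


def soft_update(target_params, online_params, tau=0.01):
    """Move each target parameter a fraction tau toward the online one."""
    return [
        (1.0 - tau) * tp + tau * op
        for tp, op in zip(target_params, online_params)
    ]


def clip_by_norm(grad, max_norm):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]
```

In `double_q_target`, separating action selection from action evaluation is what counters the overestimation: a plain Q-learning target would use `max(next_q_target)` for both. With a deep network, `soft_update` and `clip_by_norm` would be applied to the framework's parameter and gradient tensors rather than to lists.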

TODO:

  • Atari environments via convnets
  • PPO
