Parallelized Cross Entropy Method

maxhuettenrauch/cem

 
 

Cross Entropy Method

Cross Entropy Method (CEM) is a gradient-free optimization algorithm that fits a sampling distribution over parameters by repeatedly sampling candidates and refitting to the best-performing (elite) ones. The only learning signal is a single scalar per candidate: total episode reward.

Usage

$ python cem.py cartpole --num_process 6 --epochs 8 --batch_size 4096

$ python cem.py pendulum --num_process 6 --epochs 15 --batch_size 4096

Pseudocode

for epoch in range(num_epochs):
  sample a population from the sampling distribution
  test the population in the environment
  select the elites (judged by total episode reward)
  refit the sampling distribution to the elites
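The loop above can be sketched on a toy problem without any environment. This is a minimal illustration, not the code in cem.py: the `score` function is a hypothetical stand-in for total episode reward, the sampling distribution is assumed to be a one-dimensional Gaussian, and `elite_frac` is an assumed parameter name.

```python
import random
import statistics


def score(x):
    # Hypothetical stand-in for "total episode reward": highest at x = 3.
    return -(x - 3.0) ** 2


def cem(num_epochs=20, batch_size=200, elite_frac=0.1, seed=0):
    rng = random.Random(seed)
    mean, std = 0.0, 5.0  # initial sampling distribution (assumed Gaussian)
    num_elites = max(2, int(batch_size * elite_frac))
    for _ in range(num_epochs):
        # Sample a population from the current distribution.
        population = [rng.gauss(mean, std) for _ in range(batch_size)]
        # Test the population and keep the highest-scoring members.
        elites = sorted(population, key=score, reverse=True)[:num_elites]
        # Refit the sampling distribution to the elites.
        mean = statistics.fmean(elites)
        std = statistics.stdev(elites) + 1e-3  # small floor keeps std above zero
    return mean, std


best_mean, best_std = cem()
```

With a real environment, `score` would instead run one episode with the sampled parameters and return the summed reward.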

Parallelization

CEM is easily parallelizable because each population member is evaluated independently. This code base runs large batches across multiple processes using Python's multiprocessing, which reduces wall-clock time.
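One way to spread evaluations across processes is `multiprocessing.Pool`. The sketch below is an assumption about how such a split might look, not a copy of this repository's implementation; `run_episode` and `evaluate_population` are hypothetical names, and the reward function is a toy stand-in for an environment rollout.

```python
from multiprocessing import Pool


def run_episode(theta):
    # Hypothetical stand-in for one environment rollout; returns a scalar reward.
    return -(theta - 3.0) ** 2


def evaluate_population(population, num_processes=1):
    """Return one reward per population member, optionally across processes."""
    if num_processes == 1:
        # Serial path: evaluate every member in the current process.
        return [run_episode(theta) for theta in population]
    # Parallel path: the pool's worker processes share the population.
    with Pool(processes=num_processes) as pool:
        return pool.map(run_episode, population)


rewards = evaluate_population([0.0, 3.0, 5.0])
```

When calling with `num_processes` greater than 1 from a script, place the call under `if __name__ == "__main__":` so that platforms which start workers by spawning a fresh interpreter do not re-run it on import.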

The total number of episodes run in an experiment is given by:

num_episodes = num_epochs * num_processes * batch_size
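As a worked example, the formula can be applied to the two commands in the Usage section. The helper name `total_episodes` is hypothetical, and the calculation assumes `--num_process` corresponds to `num_processes` in the formula.

```python
def total_episodes(num_epochs, num_processes, batch_size):
    # Every process runs one batch of episodes in each epoch.
    return num_epochs * num_processes * batch_size


# Values taken from the cartpole and pendulum commands in Usage.
cartpole_episodes = total_episodes(num_epochs=8, num_processes=6, batch_size=4096)
pendulum_episodes = total_episodes(num_epochs=15, num_processes=6, batch_size=4096)
```

This gives 196,608 episodes for the cartpole run and 368,640 for the pendulum run.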

Setup

$ git clone https://github.com/ADGEfficiency/cem

$ cd cem

$ pip install -r requirements.txt

The two dependencies of this project are OpenAI Gym and matplotlib.

Results

Results for the OpenAI Gym environments:

CartPole-v0

Pendulum-v0
