Random search command line driver to optimize a parameter surface determined by a command

vietjtnguyen/pyrandomsearch

pyrandomsearch

Performs random search on a parameter space where the objective function is a program that prints a number to stdout. The initial points are read from the input specified by --evaluated-points (default is stdin).

At each step the set of existing points is sorted to find the "best" point (see --optimization-type). Exploratory points (see --num-proposals) are then generated by picking an offset vector on a scaled N-dimensional sphere (see --dimensionality and --radii) and applying it to the best point. The command is run once for each exploratory point to evaluate it. Each newly evaluated point is added back to the set of existing points, and the optimization repeats until convergence (see --stale-threshold and --stale-count).

The value for a point is the first line in the command's stdout that can be converted to a float. Because the strings inf and -inf are parsable as floats, they can be used to mark impossible parameter points.
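The loop described above can be sketched in a few lines of Python. This is only an illustration, not the program's actual source: the names random_offset and random_search are hypothetical, the direction is sampled by normalizing Gaussian draws (one common way to get a uniform direction), and the staleness rule (a round is stale when the best value improves by no more than stale_threshold) is an assumption about what --stale-threshold and --stale-count mean.

```python
import math
import random


def random_offset(radii, rng):
    """Pick a uniform direction on the unit N-sphere, then scale each axis by its radius."""
    direction = [rng.gauss(0.0, 1.0) for _ in radii]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    return [r * d / norm for r, d in zip(radii, direction)]


def random_search(objective, start, radii, num_proposals=4,
                  stale_threshold=0.0, stale_count=2, minimize=True, seed=0):
    """Hypothetical sketch of the search loop; returns (best_value, best_point)."""
    rng = random.Random(seed)
    sign = 1.0 if minimize else -1.0
    points = [(objective(start), list(start))]

    def best():
        # Smallest value when minimizing, largest when maximizing.
        return min(points, key=lambda p: sign * p[0])

    stale_rounds = 0
    while stale_rounds < stale_count:
        old_value, old_point = best()
        for _ in range(num_proposals):
            offset = random_offset(radii, rng)
            candidate = [x + dx for x, dx in zip(old_point, offset)]
            points.append((objective(candidate), candidate))
        # Assumed staleness rule: the round must beat the threshold to reset the counter.
        gain = sign * (old_value - best()[0])
        stale_rounds = 0 if gain > stale_threshold else stale_rounds + 1
    return best()
```

Calling random_search(lambda p: (p[0] - 4) ** 2 + (p[1] - 4) ** 2, [0.0, 0.0], [0.5, 0.5]) walks from the origin toward (4, 4) in steps of length 0.5 and stops once two consecutive rounds fail to improve.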

Example

Say you have a simulation and analysis test suite that runs a series of simulations for a given configuration and returns a single numeric score for that configuration:

$ ./evaluate_configuration_in_sim --alpha=0.3 --beta=1.0 -C 1000.0 342.4 10
1034.53

You want to find a good configuration, but the configuration/parameter space is five-dimensional (five numbers to tune). A grid search works nicely for getting a feel for the configuration space, but it becomes expensive when each run of your simulation and analysis suite is expensive. Random search is a crude but simple method for exploring a configuration/parameter space.

pyrandomsearch is a "driver": it determines new configurations to try and invokes your command with each one. It then reads the score/cost from the command's output and repeats until progress slows (gets "stale").
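As a rough illustration of what such a driver does for a single configuration, the sketch below fills the {} placeholders of a command template, runs the command, and takes the first stdout line that parses as a float. The function names are hypothetical, and the use of str.format and a shell are assumptions for the sake of the example, not a description of the program's internals.

```python
import subprocess


def first_float(text):
    """Return the first line of text that float() accepts, or None if there is none."""
    for line in text.splitlines():
        try:
            return float(line.strip())
        except ValueError:
            continue
    return None


def evaluate(command_template, point):
    """Fill each {} with one component of point and run the command (assumed: via the shell)."""
    command = command_template.format(*point)
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return first_float(result.stdout)
```

Because float() accepts "inf" and "-inf", a command that prints one of those is read as an infinitely bad (or good) score rather than as a parse failure.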

Here's an example invocation:

$ echo "1034.53 0.3 1.0 1000.0 342.4 10" \
  | pyrandomsearch \
    --radii=0.1,0.5,500,100,4 \
    --optimization-type=max \
    --num-proposals=10 \
    --print-date-and-time \
    './evaluate_configuration_in_sim --alpha={} --beta={} -C {} {} {}'

A point is written as its score (cost, evaluation) followed by all of its components. In the example above, 1034.53 is the score for the point (0.3, 1.0, 1000.0, 342.4, 10). The program's output follows the same format.
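If you want to post-process point files yourself, the format is easy to read and write. The helpers below are a minimal sketch with hypothetical names; skipping blank lines and lines that start with "##" (the prefix the program uses for its status messages) is an assumption about how you would want to treat them.

```python
def parse_point(line):
    """Split 'score c1 c2 ...' into (score, [c1, c2, ...]); None for blank or ## lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("##"):
        return None
    fields = [float(token) for token in stripped.split()]
    return fields[0], fields[1:]


def format_point(score, components):
    """Write a point back out as whitespace-separated floats, score first."""
    return " ".join(repr(float(value)) for value in [score, *components])
```

For example, parse_point("1034.53 0.3 1.0 1000.0 342.4 10") gives the score 1034.53 and the five components as floats.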

For a quick example that runs anywhere, we can use Python to evaluate an offset two-dimensional quadratic surface with a global minimum at (4, 4): python3 -c "print(({}-4)**2+({}-4)**2)". You can also give pyrandomsearch empty input to start the optimization at the origin.

$ echo "" \
  | pyrandomsearch \
    --rng-seed=0 \
    --dimensionality=2 \
    --radii=0.5 \
    --optimization-type=min \
    --num-proposals=4 \
    --stale-count=2 \
    'python3 -c "print(({}-4)**2+({}-4)**2)"'
## WARN: No existing points, seeding with origin
## Existing points:
## New points:
33.33016693325995 0.27953760569721353 -0.41455847235470794
33.84771087498297 -0.43901558517616096 0.23930172580328984
36.523094960394396 -0.4987459069840192 -0.03539096306528034
35.31706172458439 0.10538510265751438 -0.48876781823056353
...
0.11336010432191714 4.301968141413402 4.148913887509703
0.20925802238514438 4.117004666628483 3.5577693697036556
0.24940966182576765 3.8327014026304047 4.470553760099666
0.42210897999430264 3.3673455505322423 4.1478422387646745
## Best point: 0.028137782991007325 3.8348607359667337 3.9705584228418638

If you have existing evaluated points in a file, you can load that file with --input:

$ echo "2 3 3" > points.txt
$ pyrandomsearch \
  --input=points.txt \
  --rng-seed=0 \
  --dimensionality=2 \
  --radii=0.5 \
  --optimization-type=min \
  --num-proposals=4 \
  --stale-count=2 \
  'python3 -c "print(({}-4)**2+({}-4)**2)"'
## Existing points:
2.0 3.0 3.0
## New points:
2.5200417333149883 3.2795376056972136 2.5854415276452922
2.649427718745742 2.560984414823839 3.2393017258032897
3.3182737400985984 2.501254093015981 2.9646090369347196
3.016765431146098 3.1053851026575146 2.5112321817694365
...
1.8470073760058219 2.830020070854084 3.3085129492162997
1.6228828613128918 2.8645426891004027 3.4224019075191237
1.141081146510152 3.749537277804338 2.9615638819356818
1.873808109892183 2.827588030939391 3.2934178853837235
## Best point: 1.0653193465094637 3.3223805124466272 3.221442888031091

You can append newly evaluated points to the input file by using the --append option:

$ echo "2 3 3" > points.txt
$ pyrandomsearch \
  --input=points.txt \
  --append \
  --rng-seed=0 \
  --dimensionality=2 \
  --radii=0.5 \
  --optimization-type=min \
  --num-proposals=4 \
  --stale-count=2 \
  'python3 -c "print(({}-4)**2+({}-4)**2)"'
## Existing points:
2.0 3.0 3.0
## New points:
2.5200417333149883 3.2795376056972136 2.5854415276452922
2.649427718745742 2.560984414823839 3.2393017258032897
3.3182737400985984 2.501254093015981 2.9646090369347196
3.016765431146098 3.1053851026575146 2.5112321817694365
...
1.8470073760058219 2.830020070854084 3.3085129492162997
1.6228828613128918 2.8645426891004027 3.4224019075191237
1.141081146510152 3.749537277804338 2.9615638819356818
1.873808109892183 2.827588030939391 3.2934178853837235
## Best point: 1.0653193465094637 3.3223805124466272 3.221442888031091
$ tail points.txt
3.662986291560146 2.63202730265564 2.661479551564287
1.0653193465094637 3.3223805124466272 3.221442888031091
1.7864612304830279 3.085340872416278 3.0253924324063495
1.6503161905493098 2.716150981909868 3.9547463891129824
1.1367532497363355 3.3029967699834386 3.193191629268253
1.8470073760058219 2.830020070854084 3.3085129492162997
1.6228828613128918 2.8645426891004027 3.4224019075191237
1.141081146510152 3.749537277804338 2.9615638819356818
1.873808109892183 2.827588030939391 3.2934178853837235
## Best point: 1.0653193465094637 3.3223805124466272 3.221442888031091

This lets you use shell scripting to set up an "optimization rate" (i.e. radii) schedule. The following example takes advantage of the fact that --radii evaluates its arguments as expressions, so math expressions can be passed directly.

$ echo "" > points.txt
$ for X in 0 1 2 3 4; do \
    pyrandomsearch \
      --input=points.txt \
      --append \
      --rng-seed=0 \
      --dimensionality=2 \
      --radii="math.pow(10, -$X)" \
      --optimization-type=min \
      --num-proposals=4 \
      --stale-count=2 \
      'python3 -c "print(({}-4)**2+({}-4)**2)"'; \
  done
## WARN: No existing points, seeding with origin
## Existing points:
## New points:
35.160333866519906 0.5590752113944271 -0.8291169447094159
36.19542174996593 -0.8780311703523219 0.4786034516065797
41.54618992078879 -0.9974918139680384 -0.07078192613056068
39.13412344916878 0.21077020531502877 -0.9775356364611271
...
7.799654190847027e-09 3.999971132019926 4.000083464327214
1.2275783593113444e-08 3.99999678064581 3.9998892506462695
3.6832827976079833e-09 4.000051480672121 4.000032140678207
1.8918456610956505e-08 3.99986410973208 3.999978732849378
## Best point: 1.5620963397323232e-09 3.9999639852946767 3.999983720032549
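To see which radii the loop above steps through, the same expression can be evaluated directly in Python. The helper name radii_schedule is hypothetical and only mirrors the math.pow(10, -$X) expression from the shell loop.

```python
import math


def radii_schedule(steps):
    """Radius used on each pass of the shell loop: math.pow(10, -X) for X = 0 .. steps-1."""
    return [math.pow(10, -x) for x in range(steps)]


for x, radius in enumerate(radii_schedule(5)):
    print(x, radius)
```

The radius shrinks by a factor of ten on each pass, from 1 down to 1e-4, so later passes refine the best point found by earlier ones. The first pass is consistent with the output above: the first proposed point lies at distance 1 from the origin.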

Installation

The easiest way to install pyrandomsearch is via pip from the Python Package Index:

pip install pyrandomsearch

The program can also be installed from source, which sets up the pyrandomsearch entry point:

python setup.py install
pyrandomsearch --help

Alternatively, pyrandomsearch.py is standalone and can be run directly:

./pyrandomsearch/pyrandomsearch.py --help
