
# Deep Learning Benchmark Script Driver

This repo holds the benchmark tasks that are executed regularly, along with the driver script. The driver is designed to be simple and flexible: it is a thin wrapper around your benchmark script that provides basic utilities for managing tasks, profiling GPU and CPU memory usage, and extracting metrics from your logs.
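For instance, GPU memory profiling of this kind can be done by polling `nvidia-smi`. The following is a minimal illustrative sketch, not the driver's actual implementation (it assumes `nvidia-smi` is on the PATH):

```python
import subprocess

def gpu_memory_used_mb():
    """Return the current memory usage (in MB) of each visible GPU.

    Illustrative sketch only: polls nvidia-smi, which must be on the PATH.
    The driver's real profiling code may work differently.
    """
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"]
    )
    return [float(line) for line in out.decode().splitlines()]
```

Sampling a function like this at a regular interval while the task runs is enough to produce the mean/max/std GPU memory statistics seen in the result file below.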

To use the benchmark driver, look at `task_config_template.cfg` to see which benchmark tasks are supported, then run `python benchmark_driver.py --task-name [TASK_NAME] --num-gpus [NUM_GPUS]`. The driver triggers the task and writes the benchmark result to a JSON file called `dlbenchmark_result.json`, e.g.

```json
{
    "resnet50_cifar10_hybrid.total_training_time": 797.3191379999998,
    "resnet50_cifar10_hybrid.gpu_memory_usage_max": 749.0,
    "resnet50_cifar10_hybrid.cpu_memory_usage": 788,
    "resnet50_cifar10_hybrid.validation_acc": 0.560998,
    "resnet50_cifar10_hybrid.speed": 1268.8090892,
    "resnet50_cifar10_hybrid.gpu_memory_usage_std": 66.27262632335068,
    "resnet50_cifar10_hybrid.training_acc": 0.751256,
    "resnet50_cifar10_hybrid.gpu_memory_usage_mean": 602.6772235225278
}
```
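Since the result file is plain JSON, it is easy to consume downstream. For example (a minimal sketch, assuming the result file above):

```python
import json

# Read the metrics map the driver wrote out.
with open("dlbenchmark_result.json") as f:
    results = json.load(f)

print(results["resnet50_cifar10_hybrid.speed"])  # 1268.8090892
```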

There are also optional flags, such as `--framework` and `--metrics-suffix`, which decorate the metric names in the result file.

To add a new task, put your benchmark script (or its directory) under this repo and add a section to `task_config_template.cfg` like the following example:

```ini
[resnet50_cifar10_imperative]
patterns = ['Speed: (\d+\.\d+|\d+) samples/sec', 'training: accuracy=(\d+\.\d+|\d+)', 'validation: accuracy=(\d+\.\d+|\d+)', 'time cost: (\d+\.\d+|\d+)']
metrics = ['speed', 'training_acc', 'validation_acc', 'total_training_time']
compute_method = ['average', 'last', 'last', 'total']
command_to_execute = python image_classification/image_classification.py --model resnet50_v1 --dataset cifar10 --gpus 8 --epochs 20 --log-interval 50
num_gpus = 8
```

`patterns`, `metrics`, and `compute_method` must be listed in corresponding order so that the metrics map knows which pattern, metric name, and reduction belong together. Users define the log extraction rules using Python regular expressions.
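To make the ordering concrete, here is an illustrative sketch of how the three lists pair up by position (the driver's internal representation may differ):

```python
# Illustrative pairing of the three config lists from the example above.
patterns = [r'Speed: (\d+\.\d+|\d+) samples/sec',
            r'training: accuracy=(\d+\.\d+|\d+)']
metrics = ['speed', 'training_acc']
compute_method = ['average', 'last']

# Each regex is matched with its metric name and its reduction by position.
for pattern, metric, method in zip(patterns, metrics, compute_method):
    print(metric, '<-', method, 'of matches for', pattern)
```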

Typical logging output looks like the following:

```
INFO:root:Epoch[0] Batch [49]	Speed: 829.012392 samples/sec	accuracy=0.172578
INFO:root:Epoch[0] Batch [99]	Speed: 844.157227 samples/sec	accuracy=0.206563
INFO:root:Epoch[0] Batch [149]	Speed: 835.582445 samples/sec	accuracy=0.230781
INFO:root:[Epoch 0] training: accuracy=0.248027
INFO:root:[Epoch 0] time cost: 69.030488
INFO:root:[Epoch 0] validation: accuracy=0.296484
INFO:root:Epoch[1] Batch [49]	Speed: 797.687061 samples/sec	accuracy=0.322344
INFO:root:Epoch[1] Batch [99]	Speed: 821.444257 samples/sec	accuracy=0.329883
INFO:root:Epoch[1] Batch [149]	Speed: 810.339386 samples/sec	accuracy=0.342969
INFO:root:[Epoch 1] training: accuracy=0.351983
INFO:root:[Epoch 1] time cost: 61.266612
INFO:root:[Epoch 1] validation: accuracy=0.393930
```

Here you can see that the pattern `Speed: (\d+\.\d+|\d+) samples/sec` matches `Speed: 829.012392 samples/sec`, `Speed: 844.157227 samples/sec`, `Speed: 835.582445 samples/sec`, and so on. The driver extracts the captured groups into a list and maps a number extractor over it, yielding `metric=[829.012392, 844.157227, 835.582445, ...]`. The `compute_method` for this metric is `average`; the average of the six Speed values in the log above is about 823.04, so the driver puts the pair `speed: 823.04` in the final result file.
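The following sketch reproduces that extraction logic for the `speed` metric (a minimal illustration against a log saved as a hypothetical `benchmark.log`, not the driver's actual code):

```python
import re
import statistics

with open("benchmark.log") as f:
    log_text = f.read()

# Extract every captured Speed value, convert to float, then reduce the
# list with the configured compute_method ('average' in this case).
values = [float(v) for v in
          re.findall(r'Speed: (\d+\.\d+|\d+) samples/sec', log_text)]
speed = statistics.mean(values)  # -> ~823.04 for the log above
```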

The driver redirects the logging into a logfile and removes it once the metrics have been successfully extracted.
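Sketched in outline (the logfile name is hypothetical and the driver's actual flow may differ in detail), the redirect-extract-cleanup cycle looks like this:

```python
import os
import re
import subprocess

logfile = "task.log"  # hypothetical logfile name
command = ("python image_classification/image_classification.py "
           "--model resnet50_v1 --dataset cifar10 --gpus 8 --epochs 20")

# Redirect the benchmark script's stdout/stderr into the logfile.
with open(logfile, "w") as f:
    subprocess.run(command, shell=True, stdout=f, stderr=subprocess.STDOUT)

# Extract the metrics from the logfile ...
with open(logfile) as f:
    speeds = [float(v) for v in
              re.findall(r'Speed: (\d+\.\d+|\d+) samples/sec', f.read())]

# ... and remove it only after extraction succeeds.
os.remove(logfile)
```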
