TLDR: We propose a compute-efficient approach to NAS benchmark construction for large-scale datasets using training proxies. Using the searched proxy configuration, we build a surrogate NAS benchmark for the ImageNet2012 dataset on the MnasNet search space (Please see the specification of the search space in Appendix B of the paper and here). We also offer inference throughput surrogates for 6 hardware accelerators: Cloud TPUv2 and TPUv3, A100 and RTX3090 GPUs, Xilinx Ultrascale+ ZCU102 and Versal AI Core VCK190 FPGAs, and latency surrogates for the FPGA platforms. The benchmark allows evaluation without model and dataset proxies and can be utilized for benchmarking discrete NAS optimizers.
The XGB surrogates are available on figshare here. They will be downloaded automatically if you run the example.py file using python3 example.py.
To install the requirements, please clone the repository and run the following command inside the cloned directory:
pip3 install -r requirements.txt
Then run the example using
python3 example.py
The example file will download the XGB surrogates to the anb_models_0_9 directory in the project directory. The surrogates allow evaluation of accuracy, throughput on the 6 accelerators, and latency on the FPGAs. A search space sample can be specified manually using
from configurationspaces.searchspaces import EfficientNetSS as ss

# Create search space instance
search_space = ss()

# Specify sample instance
test_sample_man = search_space.manual_sample(
    [
        [1, 6, 6, 6, 6, 6, 6],  # Expansion factor for the 7 blocks
        [3, 3, 5, 3, 5, 5, 3],  # Kernel sizes
        [1, 2, 2, 3, 3, 4, 1],  # Number of layers per block
        [True, True, True, True, True, True, True],  # Squeeze-excite state
    ]
)
or sample 4 random architectures using
test_samples_rand = search_space.random_sample(4)
The accuracy surrogate instance can be created using anb.ANBEnsemble('xgb'). The throughput surrogate instance can be created as follows:
ensemble_inst_thr = anb.ANBEnsemble("xgb", device="tpuv2", metric="throughput")
The supported devices and their corresponding metrics are as follows:
supported_metrics = {
    "3090": ["throughput"],
    "a100": ["throughput"],
    "tpuv2": ["throughput"],
    "tpuv3": ["throughput"],
    "zcu102": ["latency", "throughput"],
    "vck190": ["latency", "throughput"],
}
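Before instantiating an accelerator surrogate, it can help to check the device/metric pair against this table. The helper below is hypothetical (not part of the repo); it only validates against the combinations listed above:

```python
# Hypothetical helper (not part of the repo): validate a device/metric
# pair against the supported combinations before creating a surrogate.
supported_metrics = {
    "3090": ["throughput"],
    "a100": ["throughput"],
    "tpuv2": ["throughput"],
    "tpuv3": ["throughput"],
    "zcu102": ["latency", "throughput"],
    "vck190": ["latency", "throughput"],
}

def validate_pair(device: str, metric: str) -> None:
    """Raise ValueError if the device/metric combination is unsupported."""
    if device not in supported_metrics:
        raise ValueError(
            f"Unknown device {device!r}; choose from {sorted(supported_metrics)}"
        )
    if metric not in supported_metrics[device]:
        raise ValueError(
            f"{metric!r} is not supported on {device!r}; "
            f"supported: {supported_metrics[device]}"
        )

validate_pair("zcu102", "latency")  # OK: the FPGAs support latency
```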
A possible result of running example.py is as follows. The result will differ between runs owing to the random sampling of architectures; for the manually specified sample, the results are identical across runs:
Mean Accuracy: [51.687283 65.736916] # [Acc of sample 1, Acc of sample 2]
Std Acc: [0.13972819 0.1365913 ]
Mean Throughput: [1320.735 883.5296] # [Throughput of sample 1, of sample 2] in images/sec
Std Thr: [7.4857635 8.001069 ] # Standard deviation in throughput is measured in images/sec
Since we passed two samples to the .query methods, we get the corresponding results as arrays; the first element of each array corresponds to the result of the first sample.
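Because the returned arrays are aligned with the input order, pairing samples with their predictions is a simple zip. The snippet below is a sketch with placeholder values; the sample names and numbers are stand-ins, not real query calls:

```python
# Sketch (assumed shapes): .query returns one value per input sample,
# in the same order, so zipping pairs each sample with its predictions.
samples = ["arch_0", "arch_1"]        # stand-ins for sampled architectures
mean_acc = [51.687283, 65.736916]     # stand-in for an accuracy query result
mean_thr = [1320.735, 883.5296]       # stand-in for a throughput query result

rows = [
    f"{arch}: accuracy={acc:.2f}%, throughput={thr:.1f} images/sec"
    for arch, acc, thr in zip(samples, mean_acc, mean_thr)
]
print("\n".join(rows))
```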
The dataset used to train the surrogates is provided in JSON format here, similar to that used by NASBench-301. Please see a sample result_x.json file to understand its fields. Each result_x.json file contains the architecture specification, accuracy, train time, and the means and standard deviations of all on-device throughput/latency measurements. We train the surrogates using the mean throughput/latency values. Accuracy is evaluated at a single seed only.
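Reading a result file is plain JSON parsing. The sketch below uses illustrative field names only; consult the sample result_x.json for the actual schema:

```python
import json

# Sketch of reading a result file. The field names below are
# illustrative assumptions, not the real schema -- check the sample
# result_x.json shipped with the dataset for the exact fields.
example = {
    "architecture": {"expansion": [1, 6, 6, 6, 6, 6, 6]},
    "accuracy": 65.74,
    "train_time": 12345.6,
    "zcu102_throughput_mean": 883.5,
    "zcu102_throughput_std": 8.0,
}

raw = json.dumps(example)  # stands in for open("result_x.json").read()
record = json.loads(raw)
print(record["accuracy"], record["zcu102_throughput_mean"])
```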
Although we provide the surrogate models, they can also be trained manually using the provided dataset. Please follow these steps to train the surrogates:
- Download and extract the anb_dataset_jsons.tar.gz archive.
- Take note of the root directory of the extracted dataset.
- Run the following command inside the project directory to fit the accuracy XGB surrogate on random train/val/test splits with a 0.8/0.1/0.1 ratio:
python3 fit_model.py --dataset_root <path/to/extracted/dataset/> --model xgb --model_config_path ./configs/model_configs/gradient_boosting/xgb_configspace.json --data_config_path configs/data_configs/nb_fpga.json --log_dir experiments/ --seed <seed>
An example result of the above command using seed=3 is as follows, but will differ between runs:
train metrics: {'mae': 0.20502309085633813, 'mse': 0.06897010112105674, 'rmse': 0.2626215930213217, 'r2': 0.9937321545409954, 'kendall_tau': 0.9485453372279331, 'kendall_tau_2_dec': 0.9490165549382614, 'kendall_tau_1_dec': 0.9521390453390867, 'spearmanr': 0.9963628656408133}
valid metrics: {'mae': 0.32042279419469843, 'mse': 0.19266443301611308, 'rmse': 0.43893556818297724, 'r2': 0.9827909607086215, 'kendall_tau': 0.9246529281557777, 'kendall_tau_2_dec': 0.9252142975144962, 'kendall_tau_1_dec': 0.9283951048238933, 'spearmanr': 0.991347023504538}
test metrics {'mae': 0.3169779104452133, 'mse': 0.16787281044461427, 'rmse': 0.4097228458905047, 'r2': 0.9839661171659461, 'kendall_tau': 0.9170514701750982, 'kendall_tau_2_dec': 0.9175674110264382, 'kendall_tau_1_dec': 0.9204615872718475, 'spearmanr': 0.98994312462432}
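For intuition, the reported metrics can be computed from scratch given true and predicted values. This is a from-scratch sketch (tau-a variant, no tie handling), not the repo's implementation:

```python
import math
from itertools import combinations

def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, R^2, and Kendall's tau for paired predictions.
    A from-scratch sketch (tau-a, no tie handling), not the repo's code."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot
    # Kendall's tau-a: (concordant - discordant) / total pairs
    conc = disc = 0
    for i, j in combinations(range(n), 2):
        s = (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j])
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    tau = (conc - disc) / (n * (n - 1) / 2)
    return {"mae": mae, "rmse": math.sqrt(mse), "r2": r2, "kendall_tau": tau}

print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```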
To fit the throughput XGB surrogate for the ZCU102 FPGA on random train/val/test splits:
python3 fit_model.py --dataset_root <path/to/extracted/dataset/> --model xgb_accel --device zcu102 --metric throughput --model_config_path ./configs/model_configs/gradient_boosting/xgb_accel_zcu102_throughput_configspace.json --data_config_path configs/data_configs/nb_fpga.json --log_dir experiments/ --seed <seed>
When fitting surrogates for throughput/latency, use models <xgb/lgb/sklearn_forest/svr/svr_nu>_accel combined with --device <zcu102/vck190/tpuv2/tpuv3/a100/3090> and --metric <throughput/latency>. Throughput is supported by all 6 devices, while latency is supported only by the FPGAs. The relevant model config files for each model/device/metric are located here.
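To sweep every supported device/metric pair, the fit commands can be generated programmatically. The config-file naming pattern below generalizes the zcu102 example above and is an assumption; check the config directory for the actual file names:

```python
# Sketch: generate a fit_model.py command for every supported
# device/metric pair, using the flags shown above. The dataset path is
# a placeholder, and the config-file naming pattern is an assumption
# generalized from the zcu102 throughput example.
supported_metrics = {
    "3090": ["throughput"], "a100": ["throughput"],
    "tpuv2": ["throughput"], "tpuv3": ["throughput"],
    "zcu102": ["latency", "throughput"], "vck190": ["latency", "throughput"],
}

commands = []
for device, metrics in supported_metrics.items():
    for metric in metrics:
        cfg = (
            "./configs/model_configs/gradient_boosting/"
            f"xgb_accel_{device}_{metric}_configspace.json"
        )
        commands.append(
            "python3 fit_model.py --dataset_root <path/to/extracted/dataset/> "
            f"--model xgb_accel --device {device} --metric {metric} "
            f"--model_config_path {cfg} "
            "--data_config_path configs/data_configs/nb_fpga.json "
            "--log_dir experiments/ --seed 3"
        )

print(len(commands))  # 8 commands: 6 throughput + 2 latency
```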
The dataset splits utilized in this work were generated using the create_data_splits.py file. The splits are located in the configs directory here. When training on the manual splits rather than random splits, please place the dataset inside the directory structure specified in the splits JSON files.
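For intuition, a seeded 0.8/0.1/0.1 split can be sketched as follows; this is an illustration, not create_data_splits.py itself:

```python
import random

def make_splits(n_items, seed, ratios=(0.8, 0.1, 0.1)):
    """Shuffle indices with a fixed seed and cut them into
    train/val/test splits. An illustrative sketch only."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_train = int(ratios[0] * n_items)
    n_val = int(ratios[1] * n_items)
    return {
        "train": indices[:n_train],
        "val": indices[n_train:n_train + n_val],
        "test": indices[n_train + n_val:],
    }

splits = make_splits(1000, seed=3)
print(len(splits["train"]), len(splits["val"]), len(splits["test"]))  # 800 100 100
```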
The hyperparameters of the surrogates were optimized using SMAC3. Please see do_hpo.py and shell/hpo_all.sh for details, and file an issue if you encounter problems performing HPO.
The searched hyperparameters for various device/metric pairs can be found here.
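For readers unfamiliar with the setup, the shape of the HPO loop can be illustrated with a minimal random search. This stand-in does not use SMAC3, and the config space and objective below are placeholders:

```python
import random

# Not SMAC3: a minimal random-search stand-in showing the shape of
# hyperparameter optimization over an XGB-style config space.
space = {
    "max_depth": list(range(2, 12)),
    "eta": [0.01, 0.05, 0.1, 0.3],
    "n_estimators": [100, 300, 500],
}

def val_loss(cfg):
    # Placeholder objective; in practice this would train a surrogate
    # with cfg and return its validation error.
    return abs(cfg["max_depth"] - 6) + cfg["eta"]

rng = random.Random(0)
best_cfg, best_loss = None, float("inf")
for _ in range(50):
    cfg = {k: rng.choice(v) for k, v in space.items()}
    loss = val_loss(cfg)
    if loss < best_loss:
        best_cfg, best_loss = cfg, loss

print(best_cfg)
```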
Please see anb-eval
Owing to the complex instrumentation of dataset collection, we maintain a separate repository that details the collection pipelines for accuracy, throughput, and latency. Please see ANB-DatasetCollection. Note that collecting throughput/latency requires specialized hardware such as TPUs and FPGAs.
This project was supported by Cloud TPUs from Google's TPU Research Cloud (TRC) program. GPU compute was supported by CloudLabs and the Turing AI Compute Cluster @ HKUST.
This repository builds upon code from the following repositories:
We are grateful to the authors of these repositories for their contributions.