Roadmap for CM, MLPerf and ML/SW/HW DSE: 20221116 #536

Closed · 61 of 96 tasks
gfursin opened this issue Nov 16, 2022 · 1 comment

gfursin (Contributor) commented Nov 16, 2022

Following the successful testing of the CM end-to-end benchmarking and submission workflow for modular MLPerf benchmarks at the Student Cluster Competition at SuperComputing'22, we have prepared a new list of pending tasks for the MLCommons taskforce on education and reproducibility. The goal is to help the community automate their submissions for MLPerf v3.0 and to continue modularizing ML systems and automating their benchmarking, optimization and design space exploration:

Community discussions (see the notes from weekly conf-calls)

  • Make it easier for users to debug CM scripts #581
  • Discuss how to automate iterative/autotuning experiments and collaborative design space exploration using CM meta-framework
    • Discuss how to record all the provenance during experiments (dependencies and their versions)
    • Discuss how to visualize all past MLPerf results as well as results from CM experiments during optimization
    • Discuss how to report a table of tested/working/failed combinations of ML tasks, models, engines, datasets, OS, CPU and other dependencies
    • Discuss how to reproduce best performance/accuracy results from closed/open submissions via CM
    • Discuss how to create a universal performance benchmark with CM and loadgen to plug in ANY model (without accuracy - sync with Guenther)
  • (GF) Recreate/reuse CK mailing list for the taskforce (as suggested by users)
  • (GF) Provide update about CM automation, SCC experience and the next steps (DSE) to MLPerf inference WG and our taskforce
  • (GF) Prepare CM automation presentation for MedPerf WG (20221212)
  • Discuss universal, modular, portable and reproducible benchmarking (interest from MLCommons mobile and general community)
    • Discuss universal benchmarking with mobile MLPerf WG
    • Add DSE and NAS to CM MLPerf workflow

Finish testing our end-to-end CM MLPerf submission workflow (small dataset)

RetinaNet

ResNet50

  • (GF) Ref Python MLPerf with ResNet50 FP32, ONNX and CPU (test and document - check GitHub action; an illustrative CM API call for this configuration is sketched after this list)
  • (AS) Ref Python MLPerf with ResNet50 FP32, TVM and CPU (test and document - build stable TVM)
  • Ref Python MLPerf with ResNet50 FP32, ONNX and CUDA
  • Ref Python MLPerf with ResNet50 FP32, PyTorch and CPU
  • Ref Python MLPerf with ResNet50 FP32, TF and CPU
  • C++ MLPerf with ResNet50 FP32, ONNX and CPU
  • C++ MLPerf with ResNet50 FP32, ONNX and CUDA
  • C++ MLPerf with ResNet50 INT8, ONNX and CUDA
  • C++ MLPerf with ResNet50 FP32, PyTorch and CUDA
  • C++ MLPerf with ResNet50 INT8, PyTorch and CUDA
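
As a rough illustration (not the official command), driving one of these configurations from Python through the CM API could look like the sketch below. It assumes `pip install cmind` and an already-pulled CM repository with the MLPerf automation scripts; the script tags and input keys are placeholders, and the real variation names are defined in the CM script metadata.

```python
# Minimal sketch of invoking a CM script from Python via the cmind API.
# The tags below are illustrative placeholders for the reference Python
# ResNet50 FP32 run with ONNX Runtime on CPU; check the script docs for
# the actual variation names.
import cmind

r = cmind.access({
    'action': 'run',
    'automation': 'script',
    'tags': 'app,mlperf,inference,reference,_resnet50,_onnxruntime,_cpu',
})

if r['return'] > 0:
    # cmind.access reports failures via 'return'/'error' keys, not exceptions
    raise RuntimeError(r.get('error', 'CM script failed'))
```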

Compare the C++ implementation against the best known performance (needs validation):

  • INT8: offline: 40000 images/sec -> CPU
  • INT8: offline: 16000 images/sec -> CPU, 96 cores, PyTorch
  • FP32: offline: 40 images/sec -> CPU, 2-4 cores, ONNX Runtime

BERT

  • Ref Python MLPerf with BERT FP32, ONNX and CPU
  • Ref Python MLPerf with BERT FP32, TensorFlow and CPU
  • Ref Python MLPerf with BERT FP32, PyTorch and CPU
  • Ref Python MLPerf with BERT INT8, ONNX and CPU
  • Ref Python MLPerf with BERT FP32, ONNX and CUDA
  • Ref Python MLPerf with BERT FP32, TensorFlow and CUDA
  • Ref Python MLPerf with BERT FP32, PyTorch and CUDA
  • Ref Python MLPerf with BERT INT8, ONNX and CUDA

All other reference MLPerf implementations

Test and document how to run and tune other MLPerf scenarios (a LoadGen scenario-selection sketch follows the list below)

  • SingleStream
  • MultiStream
  • Server
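
For reference, scenario selection in a Python harness goes through LoadGen's TestSettings. The sketch below assumes the mlperf_loadgen module built from the MLPerf inference repository and omits the SUT/QSL construction and the lg.StartTest call that a full harness needs.

```python
# Minimal sketch of configuring a LoadGen scenario; a complete harness also
# constructs a SUT and QSL and passes these settings to lg.StartTest.
import mlperf_loadgen as lg

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.SingleStream   # or MultiStream, Server, Offline
settings.mode = lg.TestMode.PerformanceOnly        # AccuracyOnly for accuracy runs

# Scenario-specific knob (field name follows LoadGen's TestSettings):
settings.single_stream_expected_latency_ns = 10_000_000  # 10 ms target latency
```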

Add Power measurements to the CM MLPerf workflow

Finish testing our end-to-end MLPerf submission workflow (full dataset)

  • Python MLPerf with RetinaNet FP32, ONNX and CPU (test and document)
  • Python MLPerf with ResNet50 FP32, ONNX and CPU (test and document)
  • Python MLPerf with BERT FP32, ONNX and CPU (test and document)

Design Space Exploration and testing

  • Automate exploration and testing of all design choices of ML systems using CM and MLPerf with the help of the community. Record all dependency versions for the dashboard, including versions of engines, compilers, torchvision, etc. (a minimal provenance-recording sketch follows below)
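
One possible way to capture such provenance next to each experiment, using only the Python standard library; the JSON layout is an illustrative assumption, not a defined CM format.

```python
# Sketch: record OS, Python and installed-package versions alongside an experiment.
import json
import platform
import sys
from importlib.metadata import distributions

provenance = {
    'python': sys.version,
    'os': platform.platform(),
    'machine': platform.machine(),
    # Versions of every installed Python package (onnxruntime, torch, torchvision, ...)
    'packages': {
        dist.metadata['Name']: dist.version
        for dist in distributions()
        if dist.metadata['Name']
    },
}

with open('experiment-provenance.json', 'w') as f:
    json.dump(provenance, f, indent=2, sort_keys=True)
```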

Misc

Documentation

  • Remove outdated CK notes from the MLPerf inference repository
  • Update READMEs for app-mlperf-inference, app-mlperf-inference-cpp, run-mlperf-inference-app (including the tutorial for SCC'22); add an API with all the optimization/DSE dimensions!
  • Add links to the above READMEs to the MLPerf inference repository
  • Add extension projects (including for students)
  • Update the main CM documentation
  • (GF+AS) Create an end-to-end MLPerf inference video tutorial for SCC
  • (GF+AS) Video tutorial about CM

Add non-reference (optimized) implementations

  • NVIDIA MLPerf with RetinaNet FP32, ONNX and CPU (test and document)
  • NVIDIA MLPerf with ResNet50 FP32, ONNX and CPU (test and document)
  • NVIDIA MLPerf with BERT FP32, ONNX and CPU (test and document)
  • TFLite with MobileNets (reproduce open division submissions using CM)
  • NeuralMagic implementation with pruning (arrange a hackathon)
  • Qualcomm AI100 implementation with quantization
  • Intel implementation

Improve testing and documentation of individual CM scripts:

  • Automatically generate README.md files from the script meta and a "docs" directory with manually prepared READMEs?
  • Add tests/matrix.yaml for CMD tests? (a possible expansion sketch follows this list)
  • Add stable Dockerfiles?
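
A possible shape for such a test matrix and its expansion is sketched below; the matrix keys and the cm command template are illustrative assumptions (a real version would read tests/matrix.yaml and use the actual script tags).

```python
# Sketch: expand a hypothetical tests/matrix.yaml-style matrix into individual
# command-line tests and report which combinations pass or fail.
import itertools
import shlex
import subprocess

matrix = {
    'model':   ['resnet50', 'retinanet', 'bert-99'],
    'backend': ['onnxruntime', 'tf', 'pytorch'],
    'device':  ['cpu'],
}

for model, backend, device in itertools.product(*matrix.values()):
    # Hypothetical command template; the real script tags live in the CM repository
    cmd = f'cm run script --tags=app,mlperf,inference,_{model},_{backend},_{device} --quiet'
    result = subprocess.run(shlex.split(cmd))
    status = 'OK' if result.returncode == 0 else 'FAILED'
    print(f'{model}/{backend}/{device}: {status}')
```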

Add support for Android

  • CM script to detect the Android SDK (a detection sketch in Python follows this list)
  • CM script to detect the Android NDK
  • CM script to build and run a simple app on Android (image corner detection)
  • Discuss universal benchmarking with mobile MLPerf WG
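
The detection logic such CM scripts could start from might look like the sketch below. The environment variables are the conventional Android ones (ANDROID_HOME, ANDROID_SDK_ROOT, ANDROID_NDK_HOME); the fallback install paths are assumptions for illustration.

```python
# Sketch: locate the Android SDK/NDK via environment variables, then common paths.
import os
from pathlib import Path

def detect_android_sdk():
    for var in ('ANDROID_HOME', 'ANDROID_SDK_ROOT'):
        value = os.environ.get(var)
        if value and Path(value).is_dir():
            return Path(value)
    # Assumed default install locations (Linux, macOS) for illustration
    for candidate in (Path.home() / 'Android' / 'Sdk',
                      Path.home() / 'Library' / 'Android' / 'sdk'):
        if candidate.is_dir():
            return candidate
    return None

def detect_android_ndk(sdk):
    value = os.environ.get('ANDROID_NDK_HOME') or os.environ.get('ANDROID_NDK_ROOT')
    if value and Path(value).is_dir():
        return Path(value)
    if sdk and (sdk / 'ndk').is_dir():
        # SDK-bundled NDKs are installed under <sdk>/ndk/<version>
        versions = sorted((sdk / 'ndk').iterdir())
        if versions:
            return versions[-1]
    return None

sdk = detect_android_sdk()
print('Android SDK:', sdk or 'not found')
print('Android NDK:', detect_android_ndk(sdk) or 'not found')
```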

Enhancement projects (ideas)

  • Work with the community to reproduce MLPerf inference v2.1 submissions, modularize them using CM, add them to our universal benchmarking workflow with a modular Docker container and automate submission of Pareto-optimal results to MLPerf inference v3.0
  • Add CM support for Android benchmarking (collaborate with MLPerf mobile WG)
  • C++ MLPerf with RetinaNet FP32, PyTorch and CPU (add backend and optimize)
  • C++ MLPerf with RetinaNet FP32, TVM and CPU (add backend and optimize)
  • Ref Python MLPerf with RetinaNet FP32, TVM and CPU (optimize)
  • Create a fun webcam app with object detection using MLPerf RetinaNet and CM (a minimal skeleton is sketched after this list)
  • Discuss projects with VJ and MLPerf research WG
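
A possible skeleton for the webcam idea above, using OpenCV for capture and display; the run_retinanet() stub is a placeholder for whatever RetinaNet inference backend the CM workflow sets up, and opencv-python is an assumed dependency.

```python
# Skeleton for the webcam demo: grab frames, run a (stubbed) detector, draw boxes.
import cv2

def run_retinanet(frame):
    # Placeholder: preprocess the frame, run the MLPerf RetinaNet model and
    # return a list of (x1, y1, x2, y2, label, score) detections.
    return []

cap = cv2.VideoCapture(0)  # default webcam
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for (x1, y1, x2, y2, label, score) in run_retinanet(frame):
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f'{label} {score:.2f}', (x1, y1 - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
        cv2.imshow('MLPerf RetinaNet demo', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```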

Upcoming presentations

  • MLCommons MedPerf WG

ctuning-admin (Member) commented:

Will prepare a new plan based on our resources and bandwidth.
