Roadmap for CM, MLPerf and ML/SW/HW DSE: 20221116 #536

Closed · 61 of 96 tasks
gfursin opened this issue Nov 16, 2022 · 1 comment

gfursin (Contributor) commented Nov 16, 2022

Following the successful testing of the CM end-to-end benchmarking and submission workflow for modular MLPerf benchmarks at the Student Cluster Competition at SuperComputing'22, we have prepared a new list of pending tasks for the MLCommons taskforce on education and reproducibility. The goal is to help the community automate their submissions for MLPerf v3.0 and to continue modularizing ML systems and automating their benchmarking, optimization and design space exploration:

Community discussions (see the notes from weekly conf-calls)

  • Make it easier for users to debug CM scripts #581
  • Discuss how to automate iterative/autotuning experiments and collaborative design space exploration using CM meta-framework
    • Discuss how to record all the provenance during experiments (dependencies and their versions)
    • Discuss how to visualize all past MLPerf results as well as results from CM experiments during optimization
    • Discuss how to report a table of tested/working/failed combinations of ML tasks, models, engines, datasets, OS, CPU and other dependencies
    • Discuss how to reproduce best performance/accuracy results from closed/open submissions via CM
    • Discuss how to create a universal performance benchmark with CM and loadgen to plug in ANY model (without accuracy - sync with Guenther)
  • (GF) Recreate/reuse CK mailing list for the taskforce (as suggested by users)
  • (GF) Provide update about CM automation, SCC experience and the next steps (DSE) to MLPerf inference WG and our taskforce
  • (GF) Prepare CM automation presentation for MedPerf WG (20221212)
  • Discuss universal, modular, portable and reproducible benchmarking (interest from MLCommons mobile and general community)
    • Discuss universal benchmarking with mobile MLPerf WG
    • Add DSE and NAS to CM MLPerf workflow

Finish testing our end-to-end CM MLPerf submission workflow (small dataset)

RetinaNet

ResNet50

  • (GF) Ref Python MLPerf with ResNet50 FP32, ONNX and CPU (test and document - check GitHub action; an illustrative CM API call for this configuration is sketched after this list)
  • (AS) Ref Python MLPerf with ResNet50 FP32, TVM and CPU (test and document - build stable TVM)
  • Ref Python MLPerf with ResNet50 FP32, ONNX and CUDA
  • Ref Python MLPerf with ResNet50 FP32, PyTorch and CPU
  • Ref Python MLPerf with ResNet50 FP32, TF and CPU
  • C++ MLPerf with ResNet50 FP32, ONNX and CPU
  • C++ MLPerf with ResNet50 FP32, ONNX and CUDA
  • C++ MLPerf with ResNet50 INT8, ONNX and CUDA
  • C++ MLPerf with ResNet50 FP32, PyTorch and CUDA
  • C++ MLPerf with ResNet50 INT8, PyTorch and CUDA
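
As a rough illustration (not the official command), driving one of these configurations from Python through the CM API could look like the sketch below. It assumes `pip install cmind` and an already-pulled CM repository with the MLPerf automation scripts; the script tags and input keys are placeholders, and the real variation names are defined in the CM script metadata.

```python
# Minimal sketch of invoking a CM script from Python via the cmind API.
# The tags below are illustrative placeholders for the reference Python
# ResNet50 FP32 run with ONNX Runtime on CPU; check the script docs for
# the actual variation names.
import cmind

r = cmind.access({
    'action': 'run',
    'automation': 'script',
    'tags': 'app,mlperf,inference,reference,_resnet50,_onnxruntime,_cpu',
})

if r['return'] > 0:
    # cmind.access reports failures via 'return'/'error' keys, not exceptions
    raise RuntimeError(r.get('error', 'CM script failed'))
```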

Compare the C++ implementation against the best known performance (needs validation):

  • INT8: offline: 40000 images/sec -> CPU
  • INT8: offline: 16000 images/sec -> CPU, 96 cores, PyTorch
  • FP32: offline: 40 images/sec -> CPU, 2-4 cores, ONNX Runtime

BERT

  • Ref Python MLPerf with BERT FP32, ONNX and CPU
  • Ref Python MLPerf with BERT FP32, TensorFlow and CPU
  • Ref Python MLPerf with BERT FP32, PyTorch and CPU
  • Ref Python MLPerf with BERT INT8, ONNX and CPU
  • Ref Python MLPerf with BERT FP32, ONNX and CUDA
  • Ref Python MLPerf with BERT FP32, TensorFlow and CUDA
  • Ref Python MLPerf with BERT FP32, PyTorch and CUDA
  • Ref Python MLPerf with BERT INT8, ONNX and CUDA

All other reference MLPerf implementations

Test and document how to run and tune other MLPerf scenarios (a LoadGen scenario-selection sketch follows the list below)

  • SingleStream
  • MultiStream
  • Server
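
For reference, scenario selection in a Python harness goes through LoadGen's TestSettings. The sketch below assumes the mlperf_loadgen module built from the MLPerf inference repository and omits the SUT/QSL construction and the lg.StartTest call that a full harness needs.

```python
# Minimal sketch of configuring a LoadGen scenario; a complete harness also
# constructs a SUT and QSL and passes these settings to lg.StartTest.
import mlperf_loadgen as lg

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.SingleStream   # or MultiStream, Server, Offline
settings.mode = lg.TestMode.PerformanceOnly        # AccuracyOnly for accuracy runs

# Scenario-specific knob (field name follows LoadGen's TestSettings):
settings.single_stream_expected_latency_ns = 10_000_000  # 10 ms target latency
```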

Add Power measurements to the CM MLPerf workflow

Finish testing our end-to-end MLPerf submission workflow (full dataset)

  • Python MLPerf with RetinaNet FP32, ONNX and CPU (test and document)
  • Python MLPerf with ResNet50 FP32, ONNX and CPU (test and document)
  • Python MLPerf with BERT FP32, ONNX and CPU (test and document)

Design Space Exploration and testing

  • Automate exploration and testing of all design choices of ML systems using CM and MLPerf with the help of the community. Record all dependency versions for the dashboard, including versions of engines, compilers, torchvision, etc. (a minimal provenance-recording sketch follows below)
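
One possible way to capture such provenance next to each experiment, using only the Python standard library; the JSON layout is an illustrative assumption, not a defined CM format.

```python
# Sketch: record OS, Python and installed-package versions alongside an experiment.
import json
import platform
import sys
from importlib.metadata import distributions

provenance = {
    'python': sys.version,
    'os': platform.platform(),
    'machine': platform.machine(),
    # Versions of every installed Python package (onnxruntime, torch, torchvision, ...)
    'packages': {
        dist.metadata['Name']: dist.version
        for dist in distributions()
        if dist.metadata['Name']
    },
}

with open('experiment-provenance.json', 'w') as f:
    json.dump(provenance, f, indent=2, sort_keys=True)
```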

Misc

Documentation

  • Remove outdated CK notes from the MLPerf inference repository
  • Update READMEs for app-mlperf-inference, app-mlperf-inference-cpp, run-mlperf-inference-app (including the tutorial for SCC'22); add an API with all the optimization/DSE dimensions!
  • Add links to the above READMEs to the MLPerf inference repository
  • Add extension projects (including for students)
  • Update the main CM documentation
  • (GF+AS) Create an end-to-end MLPerf inference video tutorial for SCC
  • (GF+AS) Video tutorial about CM

Add non-reference (optimized) implementations

  • NVIDIA MLPerf with RetinaNet FP32, ONNX and CPU (test and document)
  • NVIDIA MLPerf with ResNet50 FP32, ONNX and CPU (test and document)
  • NVIDIA MLPerf with BERT FP32, ONNX and CPU (test and document)
  • TFLite with MobileNets (reproduce open division submissions using CM)
  • NeuralMagic implementation with pruning (arrange a hackathon)
  • Qualcomm AI100 implementation with quantization
  • Intel implementation

Improve testing and documentation of individual CM scripts:

  • Automatically generate README.md files from the script meta and a "docs" directory with manually prepared READMEs?
  • Add tests/matrix.yaml for CMD tests? (a possible expansion sketch follows this list)
  • Add stable Dockerfiles?
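
A possible shape for such a test matrix and its expansion is sketched below; the matrix keys and the cm command template are illustrative assumptions (a real version would read tests/matrix.yaml and use the actual script tags).

```python
# Sketch: expand a hypothetical tests/matrix.yaml-style matrix into individual
# command-line tests and report which combinations pass or fail.
import itertools
import shlex
import subprocess

matrix = {
    'model':   ['resnet50', 'retinanet', 'bert-99'],
    'backend': ['onnxruntime', 'tf', 'pytorch'],
    'device':  ['cpu'],
}

for model, backend, device in itertools.product(*matrix.values()):
    # Hypothetical command template; the real script tags live in the CM repository
    cmd = f'cm run script --tags=app,mlperf,inference,_{model},_{backend},_{device} --quiet'
    result = subprocess.run(shlex.split(cmd))
    status = 'OK' if result.returncode == 0 else 'FAILED'
    print(f'{model}/{backend}/{device}: {status}')
```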

Add support for Android

  • CM script to detect the Android SDK (a detection sketch in Python follows this list)
  • CM script to detect the Android NDK
  • CM script to build and run a simple app on Android (image corner detection)
  • Discuss universal benchmarking with mobile MLPerf WG
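
The detection logic such CM scripts could start from might look like the sketch below. The environment variables are the conventional Android ones (ANDROID_HOME, ANDROID_SDK_ROOT, ANDROID_NDK_HOME); the fallback install paths are assumptions for illustration.

```python
# Sketch: locate the Android SDK/NDK via environment variables, then common paths.
import os
from pathlib import Path

def detect_android_sdk():
    for var in ('ANDROID_HOME', 'ANDROID_SDK_ROOT'):
        value = os.environ.get(var)
        if value and Path(value).is_dir():
            return Path(value)
    # Assumed default install locations (Linux, macOS) for illustration
    for candidate in (Path.home() / 'Android' / 'Sdk',
                      Path.home() / 'Library' / 'Android' / 'sdk'):
        if candidate.is_dir():
            return candidate
    return None

def detect_android_ndk(sdk):
    value = os.environ.get('ANDROID_NDK_HOME') or os.environ.get('ANDROID_NDK_ROOT')
    if value and Path(value).is_dir():
        return Path(value)
    if sdk and (sdk / 'ndk').is_dir():
        # SDK-bundled NDKs are installed under <sdk>/ndk/<version>
        versions = sorted((sdk / 'ndk').iterdir())
        if versions:
            return versions[-1]
    return None

sdk = detect_android_sdk()
print('Android SDK:', sdk or 'not found')
print('Android NDK:', detect_android_ndk(sdk) or 'not found')
```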

Enhancement projects (ideas)

  • Work with the community to reproduce MLPerf inference v2.1 submissions, modularize them using CM, add them to our universal benchmarking workflow with a modular Docker container and automate submission of Pareto-optimal results to MLPerf inference v3.0
  • Add CM support for Android benchmarking (collaborate with MLPerf mobile WG)
  • C++ MLPerf with RetinaNet FP32, PyTorch and CPU (add backend and optimize)
  • C++ MLPerf with RetinaNet FP32, TVM and CPU (add backend and optimize)
  • Ref Python MLPerf with RetinaNet FP32, TVM and CPU (optimize)
  • Create a fun webcam app with object detection using MLPerf RetinaNet and CM (a minimal skeleton is sketched after this list)
  • Discuss projects with VJ and MLPerf research WG
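
A possible skeleton for the webcam idea above, using OpenCV for capture and display; the run_retinanet() stub is a placeholder for whatever RetinaNet inference backend the CM workflow sets up, and opencv-python is an assumed dependency.

```python
# Skeleton for the webcam demo: grab frames, run a (stubbed) detector, draw boxes.
import cv2

def run_retinanet(frame):
    # Placeholder: preprocess the frame, run the MLPerf RetinaNet model and
    # return a list of (x1, y1, x2, y2, label, score) detections.
    return []

cap = cv2.VideoCapture(0)  # default webcam
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for (x1, y1, x2, y2, label, score) in run_retinanet(frame):
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f'{label} {score:.2f}', (x1, y1 - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
        cv2.imshow('MLPerf RetinaNet demo', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```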

Upcoming presentations

  • MLCommons MedPerf WG

ctuning-admin (Member) commented:

Will prepare a new plan based on our resources and bandwidth.
