
Extension projects to enable collaborative benchmarking, design space exploration and optimization of ML and AI Systems #627

Open
gfursin opened this issue Jan 17, 2023 · 2 comments


@gfursin
Contributor

gfursin commented Jan 17, 2023

No description provided.

@gfursin
Contributor Author

gfursin commented Jan 17, 2023

  • Test that MLPerf inference benchmark works using the CM interface across different implementations, models, frameworks, data sets and platforms. See the CM interface with available variations and flags here.

    • Increase the coverage of the reference implementation of the MLPerf inference benchmark. See the current status here.
    • Add more ML models (Hugging Face, Nvidia benchmark, papers) compatible with the MLPerf inference benchmark as CM scripts, test the benchmark and add models as variations here.
    • Test the TFLite C++ implementation of the MLPerf inference benchmark and extend coverage to other models and platforms. Attempt to reproduce past MLPerf submissions with this new interface (data center and edge)
      • Extend MLPerf CM script to reproduce past open division submissions from MLPerf inference v2.1 (data center and edge) and prepare new submissions using the latest versions of all dependencies
    • Test the C++ implementation of the MLPerf inference benchmark and increase its coverage
    • Test DeepSparse implementation
    • Compare different implementations and attempt to optimize them using different batch sizes, thread counts, etc.
    • Participate in MLPerf inference v3.0 submission
    • Improve tutorials and documentation to make it easier for the community to understand CK2 (CM) concepts, use the CM interface for unified ML Systems benchmarking, and extend CM automations!
  • Some performance/accuracy experiments per model

    • ResNet50
      • Integrate the TFLite C++ code into the generic C++ code and make all scenarios run (currently only SingleStream works)
      • Try the CUDA device for TFLite
      • Add new models that work with the ImageNet dataset to the open division
      • Check performance of quantized models on different backends
      • Compare performance of reference and Nvidia implementation on GPUs
      • Compare performance of reference and Intel implementation on CPUs
      • Does TVM improve the performance of any model/system/scenario?
    • Bert
      • Try different BERT models trained on SQuAD
      • Try quantized models on different backends
      • Compare performance of reference and Nvidia implementation on GPUs
      • Compare performance of reference and Intel implementation on CPUs
      • Does TVM improve the performance of any model/system/scenario?
    • RetinaNet
      • Try different RetinaNet models trained on OpenImages
      • Try quantized models on different backends
      • Try running the NMS part on the CPU and the rest on the GPU
      • Compare performance of reference and Nvidia implementation on GPUs
      • Compare performance of reference and Intel implementation on CPUs
      • Does TVM improve the performance of any model/system/scenario?
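The CPU-offload idea for NMS above can be prototyped independently of any framework. A minimal pure-Python greedy NMS sketch (the [x1, y1, x2, y2] box format and the 0.5 IoU threshold are illustrative assumptions, not the RetinaNet reference settings) that could run on the CPU side while the backbone stays on the GPU:

```python
# Framework-free greedy NMS sketch (illustrative only).
# Boxes use the assumed [x1, y1, x2, y2] corner format.

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept after greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # keep a box only if it overlaps no already-kept box too much
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the two overlapping boxes collapse to one
```

In a real split, the GPU would produce `boxes`/`scores` and only this post-processing step would move to the CPU; measuring both placements per scenario would answer the bullet above.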
  • Test/improve the CM script for the light MLPerf inference benchmark to benchmark and optimize any ONNX model with loadgen but without data sets and accuracy!

  • Test/improve CM interface to automatically prepare and run TinyMLPerf and prepare official tutorial. Use OctoML's submission as a starting point: results, code

    • Try another device such as Arduino Nano 33 BLE sense if supported in MLPerf
    • Participate in TinyMLPerf inference v3.0 submission
  • Test and improve individual CM scripts across different software stacks so they can be reused in any R&D project - this will be useful for our reproducibility initiatives and artifact evaluation at conferences
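The batch-size/thread-count optimization mentioned above is essentially a grid sweep. A minimal sketch, where the workload below is a hypothetical stand-in for a real benchmark invocation (one MLPerf inference run, for example):

```python
import itertools
import time
from concurrent.futures import ThreadPoolExecutor

# Illustrative grid sweep over batch sizes and thread counts.
# run_workload is a stand-in (an assumption) for a real benchmark run.

def run_workload(batch_size, num_threads, n_items=4096):
    """Process n_items in batches of batch_size using num_threads workers."""
    batches = [batch_size] * (n_items // batch_size)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # a sum of squares per batch stands in for model inference
        list(pool.map(lambda b: sum(i * i for i in range(b * 100)), batches))

def sweep(batch_sizes, thread_counts):
    """Time every (batch_size, num_threads) combination; return the fastest."""
    results = {}
    for bs, nt in itertools.product(batch_sizes, thread_counts):
        start = time.perf_counter()
        run_workload(bs, nt)
        results[(bs, nt)] = time.perf_counter() - start
    return min(results, key=results.get), results

best, results = sweep([8, 32, 128], [1, 2, 4])
print("best (batch_size, num_threads):", best)
```

For the actual benchmarks, each implementation would replace `run_workload`, and the sweep output would feed the cross-implementation comparison in the list above.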

@arjunsuresh
Contributor

arjunsuresh commented Feb 1, 2023

Reference for adding CUDA for tflite-cpp
