improving docs (#1067)
gfursin authored Jan 26, 2024
2 parents ad7ddcd + d219cb1 commit 1a05d3c
Showing 34 changed files with 1,032 additions and 378 deletions.
89 changes: 47 additions & 42 deletions README.md

### About

Collective Mind (CM) is a lightweight, non-intrusive and technology-agnostic workflow automation framework
to run and manage AI/ML benchmarks, applications and research projects in a unified and fully automated way
on any platform with any software stack using a common, simple and human-readable interface.

CM is being developed by the [MLCommons Task Force on Automation and Reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md)
based on the feedback from the [research community](https://www.youtube.com/watch?v=7zpeIVwICa4), Google, AMD, Neural Magic, OctoML, Nvidia, Qualcomm, Dell, HPE, Red Hat,
Intel, TTA, One Stop Systems, ACM and [other organizations and individual contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md).

The goal is to help the community gradually convert all ad-hoc README files, papers, reports, Jupyter notebooks and containers
into [portable and reusable automation recipes (CM scripts)](https://github.com/mlcommons/ck/blob/master/cm-mlops/script)
that find and call existing scripts and tools via English-like language based on tags
and glue them together via extensible JSON/YAML meta descriptions and simple Python logic.

For example, the following CM commands prepare and run image classification
with ONNX on any platform with Linux, Windows or macOS, either natively or inside a container:
Collective Mind (CM) is a human-friendly interface
to run a growing number of ad-hoc MLPerf, MLOps, and DevOps scripts
from MLCommons projects and research papers
in a unified way on any operating system with any software and hardware
as [portable, reusable and extensible automation recipes (CM scripts)](https://github.com/mlcommons/ck/tree/master/cm-mlops/script):

```bash
pip install cmind
cm pull repo mlcommons@ck

cm run script "python app image-classification onnx"

cmr "download file _wget" --url=https://cKnowledge.org/ai/data/computer_mouse.jpg --verify=no --env.CM_DOWNLOAD_CHECKSUM=45ae5c940233892c2f860efdf0b66e7e
cmr "python app image-classification onnx" --input=computer_mouse.jpg -j
cm run script "download file _wget" --url=https://cKnowledge.org/ai/data/computer_mouse.jpg --verify=no --env.CM_DOWNLOAD_CHECKSUM=45ae5c940233892c2f860efdf0b66e7e

cm run script "python app image-classification onnx" --input=computer_mouse.jpg

cm docker script "python app image-classification onnx" --input=computer_mouse.jpg
cm docker script "python app image-classification onnx" --input=computer_mouse.jpg -j --docker_it

cmr "get coco dataset _val _2014"
cmr "get ml-model stable-diffusion"
cm show cache
cm run script "get generic-python-lib _package.onnxruntime"
cm run script "get coco dataset _val _2014"
cm run script "get ml-model stable-diffusion"
cm run script "get ml-model huggingface zoo _model-stub.alpindale/Llama-2-13b-ONNX" --model_filename=FP32/LlamaV2_13B_float32.onnx --skip_cache

cmr "get ml-model huggingface zoo _model-stub.alpindale/Llama-2-13b-ONNX" --model_filename=FP32/LlamaV2_13B_float32.onnx
cm show cache
cm show cache "get ml-model stable-diffusion"

cmr "get ml-model huggingface zoo _model-stub.alpindale/Llama-2-13b-ONNX" --model_filename=FP32/LlamaV2_13B_float32.onnx --skip_cache
cm run script "run common mlperf inference" --implementation=nvidia --model=bert-99 --category=datacenter --division=closed
cm find script "run common mlperf inference"

cm pull repo ctuning@cm-reproduce-research-projects
cmr "reproduce paper micro-2023 victima _install_deps"
cmr "reproduce paper micro-2023 victima _run"

...
```

*Note that `cmr` is a shortcut for `cm run script`.*

You can also run all above CM commands via a simple Python API with JSON input/output:
```python
import cmind

output=cmind.access({'action':'run', 'automation':'script',
'tags':'python,app,image-classification,onnx',
'input':'computer_mouse.jpg'})
if output['return']==0: print (output)
```

Such an approach requires a minimal learning curve and few or no changes to existing projects, while helping
to dramatically reduce the time needed to understand how to run and customize numerous AI/ML projects
across diverse and continuously changing models, datasets, software and hardware from different vendors.

It also helps to gradually abstract, automate, unify and reuse all manual, tedious and repetitive MLOps and DevOps tasks
including *downloading artifacts, installing tools, substituting paths, updating environment variables, preparing run-time
environments, generating command lines, processing logs and sharing results*: see the
[catalog of automation recipes shared by MLCommons](docs/list_of_scripts.md).
Collective Mind is a community project being developed by the [MLCommons Task Force on Automation and Reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md)
with great help from [MLCommons (70+ AI organizations)](https://mlcommons.org/),
[research community](https://www.youtube.com/watch?v=7zpeIVwICa4)
and [individual contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md) -
we want to have a common, non-intrusive, technology-agnostic, portable and easily extensible interface
that requires a minimal learning curve to start automating all manual and repetitive tasks including
downloading artifacts, installing tools, resolving dependencies,
running experiments, processing logs, and reproducing results.

That is why we implemented CM as a [small Python library](https://github.com/mlcommons/ck/tree/master/cm)
with minimal dependencies (Python 3.7+, git, wget), simple Python API and human-friendly command line
that simply searches for CM scripts by tags in all pulled Git repositories, automatically generates command lines
for a given script or tool on a given platform, updates all paths and environment variables,
runs a given automation either natively or inside automatically-generated containers
and unifies input and output as a Python dictionary or JSON/YAML file.
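The tag-based lookup and unified dictionary I/O described above can be sketched in plain Python. This is a toy illustration only, not CM's actual implementation: the registry, its entries and their outputs are invented, while real CM discovers scripts in pulled Git repositories.

```python
# Toy sketch of tag-based script lookup with unified dict I/O.
# The registry and its entries are hypothetical; real CM scripts
# live in Git repositories and carry JSON/YAML meta descriptions.

REGISTRY = [
    {"tags": {"python", "app", "image-classification", "onnx"},
     "run": lambda i: {"return": 0, "predicted": "computer mouse"}},
    {"tags": {"download", "file"},
     "run": lambda i: {"return": 0, "path": i.get("url", "")}},
]

def access(query):
    """Find the first script whose tags cover the query tags and run it,
    passing the whole query dict in and returning a plain dict out."""
    wanted = set(query["tags"].split(","))
    for script in REGISTRY:
        if wanted <= script["tags"]:
            return script["run"](query)
    return {"return": 1, "error": f"no script matches tags {sorted(wanted)}"}

out = access({"tags": "python,app,image-classification,onnx",
              "input": "computer_mouse.jpg"})
print(out["return"])  # 0 on success, mirroring CM's return-code convention
```

The point of the sketch is the calling convention: every "script" takes and returns a JSON-compatible dictionary, so any two scripts can be chained without bespoke glue.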

Our goal is to make it easier to prototype, build, run, benchmark, optimize and manage complex AI/ML applications
across diverse and rapidly evolving models, data sets, software and hardware simply by chaining these
unified CM scripts into [portable, human-readable and reusable workflows](https://github.com/mlcommons/ck/blob/master/cm-mlops/script/app-image-classification-onnx-py/_cm.yaml).

Please check this [Getting Started tutorial](docs/getting-started.md)
to understand how CM works and start using it.

Feel free to join [public Discord server](https://discord.gg/JjWNWXKxwT)
if you would like to participate in collaborative developments
or have questions and suggestions.

### Documentation


### Motivation and concepts

* ACM REP'23 keynote about MLCommons CM: [slides](https://doi.org/10.5281/zenodo.8105339)
* ACM TechTalk'21 about automating research projects: [YouTube](https://www.youtube.com/watch?v=7zpeIVwICa4)
* MLPerf inference submitter orientation: [slides](https://doi.org/10.5281/zenodo.8144274)
* ACM REP'23 keynote about CM concepts: [slides](https://doi.org/10.5281/zenodo.8105339)

### Copyright

### License

[Apache 2.0](LICENSE.md)

### Discord server

This project is under heavy development based on user feedback -
feel free to get in touch via [public Discord server](https://discord.gg/JjWNWXKxwT)
if you have questions, suggestions and feature requests.
14 changes: 8 additions & 6 deletions cm-mlops/automation/script/template_list_of_scripts.md
This file is generated automatically - don't edit!
-->

This is an automatically generated list of reusable CM scripts being developed
by the [open taskforce on automation and reproducibility](../../../docs/taskforce.md)
to make MLOps and DevOps tools more interoperable, portable, deterministic and reproducible.
These scripts support the community effort to modularize ML systems and automate their benchmarking, optimization,
design space exploration and deployment across continuously changing software and hardware.
This is an automatically generated list of portable and reusable automation recipes
for MLPerf, MLOps and DevOps ([CM scripts](https://github.com/mlcommons/ck/tree/master/cm-mlops/script))
with a [common CM interface](https://github.com/mlcommons/ck)
being developed by the [MLCommons Task Force on Automation and Reproducibility](../../../docs/taskforce.md)
and [individual contributors](../CONTRIBUTING.md).



# List of CM scripts by categories


{{CM_MAIN}}

# Community
# Community developments

* [Discord server](https://discord.gg/JjWNWXKxwT)
9 changes: 8 additions & 1 deletion cm-mlops/script/app-loadgen-generic-python/README.md
Expand Up @@ -132,6 +132,10 @@ ___
- Environment variables:
- *CM_MLPERF_BACKEND*: `onnxruntime`
- Workflow:
* **`_pytorch`** (default)
- Environment variables:
- *CM_MLPERF_BACKEND*: `pytorch`
- Workflow:

</details>


#### Default variations

`_cpu,_onnxruntime`
`_cpu,_onnxruntime,_pytorch`

#### Script flags mapped to environment
<details>
___
* get,ml-model,retinanet,_onnx,_fp32
* `if (CM_MODEL == retinanet)`
- CM script: [get-ml-model-retinanet](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-ml-model-retinanet)
1. ***Run "preprocess" function from [customize.py](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/app-loadgen-generic-python/customize.py)***
1. Read "prehook_deps" on other CM scripts from [meta](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/app-loadgen-generic-python/_cm.yaml)
1. ***Run native script if exists***
179 changes: 179 additions & 0 deletions cm-mlops/script/benchmark-object-detection-loadgen/README.md
<details>
<summary>Click here to see the table of contents.</summary>

* [About](#about)
* [Summary](#summary)
* [Reuse this script in your project](#reuse-this-script-in-your-project)
* [ Install CM automation language](#install-cm-automation-language)
* [ Check CM script flags](#check-cm-script-flags)
* [ Run this script from command line](#run-this-script-from-command-line)
* [ Run this script from Python](#run-this-script-from-python)
* [ Run this script via GUI](#run-this-script-via-gui)
* [ Run this script via Docker (beta)](#run-this-script-via-docker-(beta))
* [Customization](#customization)
* [ Variations](#variations)
* [ Default environment](#default-environment)
* [Script workflow, dependencies and native scripts](#script-workflow-dependencies-and-native-scripts)
* [Script output](#script-output)
* [New environment keys (filter)](#new-environment-keys-(filter))
* [New environment keys auto-detected from customize](#new-environment-keys-auto-detected-from-customize)
* [Maintainers](#maintainers)

</details>

*Note that this README is automatically generated - don't edit!*

### About


See extra [notes](README-extra.md) from the authors and contributors.

#### Summary

* Category: *Benchmark object detection (loadgen, python, ONNX).*
* CM GitHub repository: *[mlcommons@ck](https://github.com/mlcommons/ck/tree/master/cm-mlops)*
* GitHub directory for this script: *[GitHub](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen)*
* CM meta description for this script: *[_cm.yaml](_cm.yaml)*
* CM "database" tags to find this script: *benchmark,object-detection,loadgen*
* Output cached? *False*
___
### Reuse this script in your project

#### Install CM automation language

* [Installation guide](https://github.com/mlcommons/ck/blob/master/docs/installation.md)
* [CM intro](https://doi.org/10.5281/zenodo.8105339)

#### Pull CM repository with this automation

```cm pull repo mlcommons@ck```


#### Run this script from command line

1. `cm run script --tags=benchmark,object-detection,loadgen[,variations] `

2. `cmr "benchmark object-detection loadgen[ variations]" `

* `variations` can be seen [here](#variations)

#### Run this script from Python

<details>
<summary>Click here to expand this section.</summary>

```python

import cmind

r = cmind.access({'action':'run',
                  'automation':'script',
                  'tags':'benchmark,object-detection,loadgen',
                  'out':'con',
...
(other input keys for this script)
...
})

if r['return']>0:
print (r['error'])

```

</details>


#### Run this script via GUI

```cmr "cm gui" --script="benchmark,object-detection,loadgen"```

Use this [online GUI](https://cKnowledge.org/cm-gui/?tags=benchmark,object-detection,loadgen) to generate CM CMD.

#### Run this script via Docker (beta)

`cm docker script "benchmark object-detection loadgen[ variations]" `

___
### Customization


#### Variations

* *No group (any variation can be selected)*
<details>
<summary>Click here to expand this section.</summary>

* `_cpu`
- Environment variables:
- *USE_CPU*: `True`
- Workflow:
* `_cuda`
- Environment variables:
- *USE_CUDA*: `True`
- Workflow:

</details>

#### Default environment

<details>
<summary>Click here to expand this section.</summary>

These keys can be updated via `--env.KEY=VALUE` or `env` dictionary in `@input.json` or using script flags.


</details>
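A minimal sketch of how such layered environment updates might compose. The precedence shown (script defaults, then the `env` dictionary from `@input.json`, then `--env.KEY=VALUE` flags) is an assumption for illustration; CM's real resolution logic lives in the script automation.

```python
# Toy illustration of layered environment updates (assumed precedence:
# defaults < @input.json "env" dict < --env.KEY=VALUE command-line flags).

def resolve_env(defaults, input_json_env, cli_flags):
    env = dict(defaults)
    env.update(input_json_env)
    # Parse flags shaped like "--env.CM_MLPERF_BACKEND=onnxruntime"
    for flag in cli_flags:
        if flag.startswith("--env."):
            key, _, value = flag[len("--env."):].partition("=")
            env[key] = value
    return env

env = resolve_env(
    defaults={"CM_MLPERF_BACKEND": "onnxruntime"},
    input_json_env={"USE_CPU": "True"},
    cli_flags=["--env.CM_MLPERF_BACKEND=pytorch"],
)
print(env["CM_MLPERF_BACKEND"])  # pytorch: the CLI flag wins in this sketch
```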

___
### Script workflow, dependencies and native scripts

<details>
<summary>Click here to expand this section.</summary>

1. ***Read "deps" on other CM scripts from [meta](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/_cm.yaml)***
* detect,os
- CM script: [detect-os](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/detect-os)
* get,sys-utils-cm
- CM script: [get-sys-utils-cm](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-sys-utils-cm)
* get,target,device
- CM script: [get-target-device](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-target-device)
* get,python3
* CM names: `--adr.['python', 'python3']...`
- CM script: [get-python3](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-python3)
* get,cuda
* `if (USE_CUDA == True)`
* CM names: `--adr.['cuda']...`
- CM script: [get-cuda](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-cuda)
* get,mlperf,inference,loadgen
* CM names: `--adr.['inference-src']...`
- CM script: [get-mlperf-inference-loadgen](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-mlperf-inference-loadgen)
* get,ml-model,object-detection
* CM names: `--adr.['ml-model']...`
- CM script: [get-ml-model-retinanet](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-ml-model-retinanet)
* get,generic-python-lib,_onnxruntime
* `if (USE_CUDA != True)`
- CM script: [get-generic-python-lib](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-generic-python-lib)
* get,generic-python-lib,_onnxruntime_gpu
* `if (USE_CUDA == True)`
- CM script: [get-generic-python-lib](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/get-generic-python-lib)
1. ***Run "preprocess" function from [customize.py](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/customize.py)***
1. Read "prehook_deps" on other CM scripts from [meta](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/_cm.yaml)
1. ***Run native script if exists***
* [run.bat](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/run.bat)
* [run.sh](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/run.sh)
1. Read "posthook_deps" on other CM scripts from [meta](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/_cm.yaml)
1. ***Run "postprocess" function from [customize.py](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/customize.py)***
1. Read "post_deps" on other CM scripts from [meta](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/benchmark-object-detection-loadgen/_cm.yaml)
</details>
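The phase ordering listed above (deps, preprocess, prehook_deps, native run, posthook_deps, postprocess, post_deps) can be sketched as a simple pipeline. This is a simplified toy model with invented bookkeeping; the real phase handling is implemented in the CM script automation.

```python
# Simplified sketch of the phase ordering described above; only the
# phase names come from the list, the state handling is illustrative.

def run_script(phases):
    """Execute known phases in order, threading a shared state dict through."""
    state = {"env": {}, "log": []}
    order = ["deps", "preprocess", "prehook_deps", "run",
             "posthook_deps", "postprocess", "post_deps"]
    for name in order:
        handler = phases.get(name)
        if handler:
            handler(state)          # each phase may update env/state
        state["log"].append(name)   # record the phase even if it was a no-op
    return state

state = run_script({
    "preprocess": lambda s: s["env"].update({"USE_CPU": "True"}),
    "run": lambda s: s["env"].update({"RESULT": "ok"}),
})
print(state["log"][0], state["env"]["RESULT"])  # deps ok
```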

___
### Script output
`cmr "benchmark object-detection loadgen[,variations]" -j`
#### New environment keys (filter)

#### New environment keys auto-detected from customize

___
### Maintainers

* [Open MLCommons taskforce on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md)