This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[OpPerf] Consolidate array manipulation related operators #17487

Merged
merged 21 commits into from
Mar 10, 2020

8 changes: 7 additions & 1 deletion benchmark/opperf/README.md
@@ -50,7 +50,8 @@ Hence, in this utility, we will build the functionality to allow users and devel
Provided you have MXNet installed (any version >= 1.5.1), all you need to do to use the opperf utility is add the path to your cloned MXNet repository to PYTHONPATH.

Note:
To install MXNet, refer [Installing MXNet page](https://mxnet.apache.org/versions/master/install/index.html)
1. Currently, the opperf utility requires a cloned MXNet repo. It isn't supported on the PyPI binary yet. [Work in Progress]
2. To install MXNet, refer to the [Installing MXNet page](https://mxnet.apache.org/versions/master/install/index.html)
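
As an alternative to setting the environment variable in the shell, the same path can be added from inside a Python session. The sketch below is only an illustration: `add_repo_to_path` is a hypothetical helper (not part of opperf), and the path is the README's placeholder, which you would replace with the location of your own clone.

```python
import os
import sys

# Placeholder path from the README; replace with your actual clone location.
MXNET_REPO = "/path/to/incubator-mxnet/"


def add_repo_to_path(repo_path):
    """Append repo_path to sys.path unless it is already present.

    Hypothetical helper for illustration; returns the normalized path.
    """
    normalized = os.path.abspath(os.path.expanduser(repo_path))
    if normalized not in sys.path:
        sys.path.append(normalized)
    return normalized


repo = add_repo_to_path(MXNET_REPO)
print("Added to sys.path:", repo)
```

Unlike `export PYTHONPATH=...`, this only affects the current Python process, so it suits notebooks or one-off scripts rather than command-line runs of `opperf.py`.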

```
export PYTHONPATH=$PYTHONPATH:/path/to/incubator-mxnet/
@@ -72,6 +73,9 @@ python incubator-mxnet/benchmark/opperf/opperf.py --output-format json --output-

3. **dtype** : By default, `float32`. You can override and set the global dtype for all operator benchmarks. Example: `--dtype float64`.

4. **profiler** : `native` or `python`. By default, `native`. You can override and set the global profiler for all operator benchmarks. Example: `--profiler 'python'`.
The native profiler uses MXNet's built-in C++ profiler. The Python profiler uses Python's `time` package. Generally, developers use the native profiler and users use the Python profiler.

## Usecase 2 - Run benchmarks for all the operators in a specific category

For example, to run benchmarks for all NDArray Broadcast Binary Operators (e.g. broadcast_add, broadcast_mod, broadcast_pow), run the following Python script.
@@ -117,6 +121,7 @@ add_res = run_performance_test(nd.add, run_backward=True, dtype='float32', ctx=m
                               inputs=[{"lhs": (1024, 1024),
                                        "rhs": (1024, 1024)}],
                               warmup=10, runs=25)
print(add_res)
```
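
To make the `warmup` and `runs` arguments in the call above more concrete, here is a minimal sketch of a time-based measurement loop. It is not the opperf implementation: `time_callable` and `workload` are hypothetical names, and the workload is a plain-Python stand-in for an operator call. It only assumes that warmup iterations are executed but not timed and that the reported figure is an average over the timed runs.

```python
import time


def time_callable(fn, warmup=10, runs=25):
    """Return the average wall-clock time per call of fn, in milliseconds.

    Hypothetical helper for illustration only.
    """
    # Warmup calls are executed but excluded from the measurement.
    for _ in range(warmup):
        fn()

    # Time all measured runs together, then average.
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def workload():
    # Stand-in for an operator call; any zero-argument callable works.
    return sum(i * i for i in range(10_000))


avg_ms = time_callable(workload, warmup=10, runs=25)
print(f"avg time per run: {avg_ms:.4f} ms")
```

Because MXNet executes operators asynchronously, a real measurement would also have to wait for results to be ready before stopping the clock; the sketch leaves that out. The README's own `run_performance_test` call is what produces the output described next.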

Output for the above benchmark run, on a CPU machine, would look something like below:
@@ -143,6 +148,7 @@ add_res = run_performance_test([nd.add, nd.subtract], run_backward=True, dtype='
                               inputs=[{"lhs": (1024, 1024),
                                        "rhs": (1024, 1024)}],
                               warmup=10, runs=25)
print(add_res)
```

Output for the above benchmark run, on a CPU machine, would look something like below:
107 changes: 7 additions & 100 deletions benchmark/opperf/nd_operations/README.md
@@ -19,103 +19,10 @@

**NOTE:** This list is AUTOGENERATED when you run the opperf.py utility

0. LogisticRegressionOutput
1. broadcast_axes
2. ravel_multi_index
3. multi_sgd_mom_update
4. smooth_l1
5. scatter_nd
6. reshape
7. one_hot
8. linalg_potri
10. multi_sgd_update
12. Convolution_v1
13. repeat
14. Custom
15. softmax_cross_entropy
16. SwapAxis
17. norm
18. Softmax
20. fill_element_0index
21. cast
22. UpSampling
23. BatchNorm_v1
24. CTCLoss
25. LRN
26. cast_storage
27. pick
28. GridGenerator
29. sample_multinomial
30. Activation
31. LinearRegressionOutput
32. Pooling_v1
34. Crop
35. ElementWiseSum
36. diag
37. Reshape
38. Pad
39. linalg_gemm2
40. crop
43. RNN
45. SoftmaxOutput
46. linalg_extractdiag
48. SequenceLast
51. SequenceReverse
53. SVMOutput
54. linalg_trsm
55. where
56. SoftmaxActivation
58. slice
59. linalg_gelqf
60. softmin
61. linalg_gemm
62. BilinearSampler
64. choose_element_0index
65. tile
67. gather_nd
69. SequenceMask
70. reshape_like
71. slice_axis
72. stack
74. khatri_rao
75. multi_mp_sgd_update
76. linalg_sumlogdiag
77. broadcast_to
78. IdentityAttachKLSparseReg
80. SpatialTransformer
81. Concat
82. uniform
83. InstanceNorm
84. expand_dims
85. multi_mp_sgd_mom_update
86. reverse
87. add_n
88. clip
89. ctc_loss
90. shape_array
91. unravel_index
92. linalg_potrf
93. Cast
94. broadcast_like
95. Embedding
96. linalg_makediag
98. linalg_syrk
99. squeeze
101. ROIPooling
103. SliceChannel
104. slice_like
106. linalg_maketrian
108. pad
109. LayerNorm
110. split
111. MAERegressionOutput
112. Correlation
114. batch_take
115. L2Normalization
116. broadcast_axis
117. linalg_trmm
118. linalg_extracttrian
119. normal
120. take
121. MakeLoss
124. concat
0. preloaded_multi_sgd_update
1. multi_mp_sgd_mom_update
2. IdentityAttachKLSparseReg
3. unravel_index
4. mp_lamb_update_phase1
5. mp_lamb_update_phase2
6. scatter_nd