This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
@piiswrong
Description
Automatic OMP operator tuning based upon kernel operation workload.
Determines the "weight" of a unary or binary kernel op, then uses that weight to decide whether OMP should be used, given the number of iterations required and the number of threads available to perform the job.
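A minimal sketch of how such a decision could look, assuming a simple linear cost model. The names `OmpCostModel` and `UseOmp`, and the idea of a fixed per-region OMP overhead, are illustrative assumptions and are not taken from this PR's code:

```cpp
#include <cstddef>

// Hypothetical cost model (illustrative only, not the PR's actual types).
struct OmpCostModel {
  double op_weight_ns;     // assumed cost of one kernel op call, in nanoseconds
  double omp_overhead_ns;  // assumed fixed cost of entering an OMP parallel region
};

// Returns true when spreading `iterations` op calls over `threads` threads is
// estimated to be faster than running them all on one thread.
inline bool UseOmp(const OmpCostModel& model, std::size_t iterations, int threads) {
  if (threads < 2 || iterations < 2) {
    return false;  // nothing to parallelize
  }
  const std::size_t t = static_cast<std::size_t>(threads);
  // Serial estimate: every iteration runs on the calling thread.
  const double serial_ns = model.op_weight_ns * static_cast<double>(iterations);
  // Parallel estimate: the busiest thread handles ceil(iterations / threads)
  // iterations, plus the fixed overhead of the parallel region.
  const std::size_t per_thread = (iterations + t - 1) / t;
  const double parallel_ns =
      model.omp_overhead_ns + model.op_weight_ns * static_cast<double>(per_thread);
  return parallel_ns < serial_ns;
}
```

Under this model, a cheap op over a small array stays serial because the fixed overhead dominates, while the same op over a large array is parallelized.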
Decision accuracy is tested in the gtest OMP_TUNING test suite by comparing the times with OMP, without OMP, and with automatic selection (Auto).
For example:
AWS c4.8xlarge:
Success rate for type float: 0.90278
Success rate for type double: 0.88889
Success rate for type mshadow::half::half_t: 0.83333
Success rate for type unsigned char: 0.86111
Success rate for type int: 0.95833
Success rate for type long: 0.88889
desktop: 12-core (6 real CPU cores + hyperthreading)
Success rate for type float: 0.79167
Success rate for type double: 0.75000
Success rate for type unsigned char: 0.72222
Success rate for type int: 0.94444
Success rate for type long: 1.00000
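One way a success rate like the ones above could be computed is sketched here. It assumes a decision counts as correct when the automatic choice matches whichever of the OMP and non-OMP runs was faster; the `TimingSample` struct and `SuccessRate` function are hypothetical and may not match how the OMP_TUNING suite scores results:

```cpp
#include <cstddef>
#include <vector>

// One measured test case (hypothetical layout, for illustration only).
struct TimingSample {
  double with_omp_ns;     // measured time with OMP forced on
  double without_omp_ns;  // measured time with OMP forced off
  bool auto_chose_omp;    // what the automatic tuner decided
};

// Fraction of samples where the automatic decision picked the faster variant.
inline double SuccessRate(const std::vector<TimingSample>& samples) {
  if (samples.empty()) {
    return 0.0;
  }
  std::size_t correct = 0;
  for (const TimingSample& s : samples) {
    const bool omp_was_faster = s.with_omp_ns < s.without_omp_ns;
    if (s.auto_chose_omp == omp_was_faster) {
      ++correct;
    }
  }
  return static_cast<double>(correct) / static_cast<double>(samples.size());
}
```

A stricter variant could also allow a tolerance band, so that near-ties between the OMP and non-OMP timings are not counted as failures.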
Sample output from the OMP_TUNING tests, including statistical data:
tune_all.txt
Currently autotuned kernel operators (tuning at startup takes a total of < 10ms):
mxnet::op::PopulateFullIdxRspKernel
mxnet::op::mxnet_op::set_to_int<0>
mxnet::op::mshadow_op::smooth_l1_gradient
mxnet::op::mshadow_op::smooth_l1_loss
mxnet::op::mshadow_op::eq
mxnet::op::mshadow_op::ne
mxnet::op::mshadow_op::le
mxnet::op::mshadow_op::lt
mxnet::op::mshadow_op::hypot_grad_right
mxnet::op::mshadow_op::hypot_grad_left
mxnet::op::mshadow_op::hypot
mxnet::op::mshadow_op::arctanh_grad
mxnet::op::mshadow_op::arctan_grad
mxnet::op::mshadow_op::cosh
mxnet::op::mshadow_op::rpower
mxnet::op::mshadow_op::minimum
mxnet::op::mshadow_op::arctan
mxnet::op::mshadow_op::reciprocal_square_root
mxnet::op::mshadow_op::rminus
mxnet::op::mshadow_op::arccosh_grad
mxnet::op::mshadow_op::square_root_grad
mxnet::op::mshadow_op::arctanh
mxnet::op::mshadow_op::floor
mxnet::op::mshadow_op::cosh_grad
mxnet::op::mshadow_op::ceil
mxnet::op::mshadow_op::cos_grad
mxnet::op::mshadow_op::reciprocal_cube_root_grad
mxnet::op::mshadow_op::arcsinh_grad
mxnet::op::mshadow_op::sin
mxnet::op::mshadow_op::arcsin
mxnet::op::mshadow_op::log10_grad
mxnet::op::mshadow_op::log1p_grad
mxnet::op::mshadow_op::mod_grad
mxnet::op::mshadow_op::arccos_grad
mxnet::op::mshadow_op::exp
mxnet::op::mshadow_op::tanh_grad
mxnet::op::mshadow_op::log1p
mxnet::op::mshadow_op::rint
mshadow::op::minus
mxnet::op::mshadow_op::relu_grad
mxnet::op::mshadow_op::identity
mxnet::op::mshadow_op::maximum
mxnet::op::mshadow_op::reciprocal_grad
mshadow::op::div
mxnet::op::mshadow_op::rmod_grad
mxnet::op::mshadow_op::arcsin_grad
mxnet::op::mshadow_op::ge
mxnet::op::mshadow_op::gammaln_grad
mxnet::op::mshadow_op::sigmoid
mxnet::op::mshadow_op::power_rgrad
mxnet::op::mshadow_op::identity_grad
mxnet::op::mshadow_op::tan
mxnet::op::mshadow_op::gamma
mxnet::op::mshadow_op::arcsinh
mshadow::op::identity
mxnet::op::mshadow_op::square_root
mxnet::op::mshadow_op::reciprocal_square_root_grad
mxnet::op::mshadow_op::cos
mxnet::op::mshadow_op::log2
mxnet::op::mshadow_op::tanh
mxnet::op::mshadow_op::arccosh
mxnet::op::mshadow_op::negation
mxnet::op::mshadow_op::log10
mxnet::op::mshadow_op::cube_root_grad
mxnet::op::mshadow_op::expm1
mxnet::op::mshadow_op::arccos
mxnet::op::mshadow_op::rmod
mxnet::op::mshadow_op::softrelu_grad
mxnet::op::mshadow_op::sinh
mxnet::op::mshadow_op::log_grad
mxnet::op::mshadow_op::sin_grad
mxnet::op::mshadow_op::rdiv_grad
mxnet::op::mshadow_op::log
mxnet::op::mshadow_op::softrelu
mxnet::op::mshadow_op::square_grad
mxnet::op::mshadow_op::log2_grad
mxnet::op::mshadow_op::cube_root
mxnet::op::mshadow_op::reciprocal_cube_root
mxnet::op::mshadow_op::sign
mxnet::op::mshadow_op::square
mxnet::op::mshadow_op::sign_grad
mxnet::op::mshadow_op::round
mxnet::op::mshadow_op::trunc
mxnet::op::mshadow_op::mod_rgrad
mxnet::op::mshadow_op::reciprocal
mxnet::op::mshadow_op::fix
mxnet::op::mshadow_op::gamma_grad
mxnet::op::mshadow_op::gammaln
mxnet::op::mshadow_op::degrees
mshadow::op::right
mxnet::op::mshadow_op::sinh_grad
mxnet::op::mshadow_op::degrees_grad
mshadow::op::plus
mxnet::op::mshadow_op::radians
mxnet::op::mshadow_op::sigmoid_grad
mxnet::op::mshadow_op::radians_grad
mxnet::op::mshadow_op::gt
mxnet::op::mshadow_op::mod
mshadow::op::mul
mxnet::op::mshadow_op::rdiv
mxnet::op::mshadow_op::tan_grad
mxnet::op::mshadow_op::div_grad
mxnet::op::mshadow_op::div_rgrad
mxnet::op::mshadow_op::left
mxnet::op::mshadow_op::right
mxnet::op::mshadow_op::power
mxnet::op::mshadow_op::power_grad
mxnet::op::mshadow_op::relu
mxnet::op::mshadow_op::abs
mxnet::op::mshadow_op::rpower_grad
Checklist
Essentials
make lint
Changes
Comments