entire codebase build with mshadow_use_clas=0 #7625

DickJC123 · 2017-08-25T23:36:07Z

The changed file is the result of resolving build issues that arise across the entire codebase after setting MSHADOW_USE_CBLAS=0 in mshadow/make/mshadow.mk. In most cases the CPU versions of the linalg interface were missing at final link time, and this PR supplies stubs that log a fatal message stating that the routine is unimplemented. In addition, I reworked the linalg_gemm<cpu, DType> specialization to be more consistent with the rest of the file: a '#define' that is instantiated for the float and double types. This clearly supplies a full function specialization (avoiding the partial function specialization error reported against an earlier version of the file by @chinakook). Also, I personally find the existing implementation, a routine that is templated on <xpu> yet never refers to xpu, confusing. Again, this PR offers a simple approach consistent with the rest of the file. @piiswrong @yajiedesign @asmushetzel
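For readers unfamiliar with the pattern described above, here is a minimal sketch of a macro that stamps out full specializations of a function template for float and double, each acting as a stub that reports "unimplemented". It is not the code from this PR: the names `linalg_gemm_sketch`, `LINALG_CPU_GEMM_STUB`, and `stub_reports_unimplemented` are hypothetical, the `cpu` tag is a stand-in for the mshadow device type, and a thrown exception stands in for the fatal log message so that the sketch has no external dependencies.

```cpp
#include <cassert>    // for callers that want to assert on the helper below
#include <stdexcept>
#include <string>

struct cpu {};  // hypothetical stand-in for the mshadow device tag

// Primary template: declared but not defined, so only the
// explicit specializations below can be called.
template <typename xpu, typename DType>
void linalg_gemm_sketch(const DType* A, const DType* B, DType* C,
                        int m, int n, int k);

// The macro generates one *full* specialization per concrete type.
// Function templates cannot be partially specialized, which is why
// both template arguments are fixed here.
#define LINALG_CPU_GEMM_STUB(DType)                                     \
  template <>                                                           \
  void linalg_gemm_sketch<cpu, DType>(const DType*, const DType*,       \
                                      DType*, int, int, int) {          \
    /* Assumption: the real stub logs a fatal message instead. */       \
    throw std::runtime_error("linalg_gemm<cpu, " #DType                 \
                             "> not implemented without CBLAS");        \
  }

LINALG_CPU_GEMM_STUB(float)
LINALG_CPU_GEMM_STUB(double)

// Hypothetical helper: returns true if calling the stub reports
// "not implemented" rather than silently doing nothing.
template <typename DType>
bool stub_reports_unimplemented() {
  DType a = 0, b = 0, c = 0;
  try {
    linalg_gemm_sketch<cpu, DType>(&a, &b, &c, 1, 1, 1);
  } catch (const std::runtime_error& e) {
    return std::string(e.what()).find("not implemented") != std::string::npos;
  }
  return false;
}
```

Because every specialization is defined, code that references the CPU routine links successfully, and the failure only shows up at run time if the stub is actually called.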

asmushetzel · 2017-08-26T21:24:53Z

Hi Dick,
much better that way. Thanks a lot.
Since you have looked into this file quite a bit, I wonder whether you are the right person to ask about one other issue: is there any way to make the GPU batch versions of operators such as trmm/potrf/potri more efficient where cuBLAS/cuSOLVER do not supply batch-mode operations? Is there anyone at NVIDIA who can take a look at this? It is important because a very common use case for these operations is batch-mode processing of many small matrices, and the current naive batch processing will not perform well on GPU in that case.
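To make the "naive batch processing" concern concrete, here is a small CPU-only sketch of the pattern: a batch routine that simply loops over the matrices and calls a single-matrix Cholesky factorization on each. It is an illustration, not MXNet's implementation; `potrf_single` and `potrf_batch_naive` are hypothetical names, and the matrices are assumed to be square, row-major, and stored back to back. On a GPU, each loop iteration would presumably become a separate library call, which is where the overhead for many small matrices comes from.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-matrix routine: in-place lower Cholesky
// factorization of a row-major n x n matrix. The strict upper
// triangle is zeroed. Returns false if the matrix is not
// positive definite.
bool potrf_single(double* a, int n) {
  for (int j = 0; j < n; ++j) {
    double d = a[j * n + j];
    for (int k = 0; k < j; ++k) d -= a[j * n + k] * a[j * n + k];
    if (d <= 0.0) return false;
    d = std::sqrt(d);
    a[j * n + j] = d;
    for (int i = j + 1; i < n; ++i) {
      double s = a[i * n + j];
      for (int k = 0; k < j; ++k) s -= a[i * n + k] * a[j * n + k];
      a[i * n + j] = s / d;
    }
    for (int c = j + 1; c < n; ++c) a[j * n + c] = 0.0;
  }
  return true;
}

// Naive batch mode: one independent call per matrix, with no work
// shared or fused across the batch.
bool potrf_batch_naive(std::vector<double>& batch, int n) {
  if (n <= 0) return false;
  const std::size_t stride = static_cast<std::size_t>(n) * n;
  for (std::size_t off = 0; off + stride <= batch.size(); off += stride) {
    if (!potrf_single(batch.data() + off, n)) return false;
  }
  return true;
}
```

A dedicated batched routine would instead process all matrices in one call, which is the kind of support the question above is asking about.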

DickJC123 · 2017-08-26T23:02:22Z

It's our raison d'etre to make frameworks like MXNet get the most out of NVIDIA GPUs. To help juggle priorities though, what are the most popular models that would make use of your proposed GPU-batch trmm/potrf/potri?

* 1x1 convolution acceleration
* GEMM directly without im2col or col2im in 1x1 convolution(stride=1,pad=0). The 1x1 convolution is used very common in modern CNN networks such as Googlenet/Inception/Resnet/Mobilenet etc.
* cpplint
* fix linalg_impl (#7611)
* fix linalg_impl
* fix
* fix
* fix
* set build status to success only after job ends (#7628) Earlier code marks status as success initially. So any new PR shows jenkins status as success if we see the check mark on github. On opening the full build status, we see that builds haven't even started or are running. If something fails, variable changes to failure then. So even without this merge, a red mark on github indicates that build has failed correctly. That behavior is unchanged.
* Fix build status of a test (#7629) installs bc required by sh2ju.sh and changes the regex match to capital alphabet as it clashes with a warning thrown by opencv driver
* entire codebase build with mshadow_use_clas=0 (#7625)
* Update README.md (#7630)
* unit test for csv iter, doc update for libsvmiter (#7623)
* add unit test for csv iter
* fix lint
* add libsvm to mxnet.io doc
* update libsvm doc
* gpu access of ndarray (#7496)
* gpu access of ndarray
* gpu access from C++ api
* gpu access fix
* Update c_api.cc
* Update c_api.cc
* refactor cudnn algo reg to no use string (#7561)
* refactor cudnn algo reg to no use string
* refactor ctx list
* fix
* refactor save_inputs
* Update io.md (#7634)
* fix tests (#7633)
* [build] explicitly install JDK8 (#7574)
* explicitly install openjdk8
* handle earlier version of ubuntu
* install software-properties-common
* update -y
* update commands
* Indents correction
* Add script to build doc files for all versions (#7636)
* Add script to build doc files for all versions
* Fix
* Use add versipn script of each different version
* add fashion mnist and move mnists to s3 (#7635)
* add fashion mnist and move mnists to s3
* refactor
* add doc for dataset (#7644)
* Change apache package URL to https (#7622)
* Pip installer for CoreML Converter: mxnet-to-coreml (#7624)
* Fixing CoreML converter's README: typos/grammar/etc.
* CoreML converter README update: Talk about layers first and then about models.
* Providing examples on converting various standard models; calling out issues with InceptionV3.
* Fixing CoreML converter's README: typos/grammar/etc.
* CoreML converter README update: Talk about layers first and then about models.
* Providing examples on converting various standard models; calling out issues with InceptionV3.
* Pip installer for converter: mxnet-coreml-converter. Runs only on MacOS and python 2.7. Once inside the directory pip_package, user needs to run: python setup.py bdist_wheel twine upload dist/* Once uploaded it'll look like this: https://testpypi.python.org/pypi/mxnet-coreml-converter Also updated the README for converter to reflect this. Note that we are going with a package per tool for the time being. Please leave feedback if you think it is better to adopt the policy of all the tools in one single package. Unit tests continue to pass.
* More informative pypi package documentation.
* Updating MacOS in release notes to 10.11 after testing on it.
* Changing the name to mxnet-to-coreml and version to 0.1.0.
* Added license to setup.py
* Updating readme files with the correct pip package name.
* Parallelize windows unit tests of python 2 and 3 in jenkins (#7646)
* parallelize python windows tests
* reordered for clarity
* Removed asset loaded insecurely and added the asset to be loaded from the origin securely (#7649)
* skip failing test temporarily (#7648)
* lower really high threshold to fix test failure (#7650)
* Doc updates for install and doc generation (#7647)
* fluent (#7584)
* add 1x1 convolution to tests
* indent
* Refactor random linalg contrib namespaces (#7604)
* Refactor namespaces contrib, linalg, random, and sparse for op registration Change examples in documentation Change namespace usage in examples Fix pylint Remove unused import Switch name and alias in linalg and random Change stype comparison from string to int for functions used internally Change documentation to use the right namespace Register ops under ndarray/op.py and symbol/op.py Remove unused import Change .cu op names
* Add __all__ to ndarray and symbol modules
* Revert "Add __all__ to ndarray and symbol modules" This reverts commit 8bc5de7.
* Add __all__ to ndarray and symbol modules
* fix gluon fasionmnist dataset (#7655) fix gluon fasionmnist dataset
* Parallelize Python 2 and 3 unit test cases in Jenkins CI. (#7658)
* Parallelize Python 2 and 3 unit test cases.
* Parallelize python 2 and 3 unit tests cases in jenkins
* Parallelize python 2 and 3 unit tests cases in jenkins
* Change namespace and make logging functionality changes (#7627)
* Change namespace and make logging functionality changes
* Help comment changes
* update mklml and mkl mac support (#7587)
* 1x1 convolution acceleration
* GEMM directly without im2col or col2im in 1x1 convolution(stride=1,pad=0). The 1x1 convolution is used very common in modern CNN networks such as Googlenet/Inception/Resnet/Mobilenet etc.
* cpplint
* Indents correction
* add 1x1 convolution to tests
* indent
* 1x1 convolution acceleration
* GEMM directly without im2col or col2im in 1x1 convolution(stride=1,pad=0). The 1x1 convolution is used very common in modern CNN networks such as Googlenet/Inception/Resnet/Mobilenet etc.
* cpplint
* Indents correction
* add 1x1 convolution to tests
* indent
* cpplint
* indent


entire codebase build with mshadow_use_clas=0

80d90de

piiswrong merged commit 50342a4 into apache:master Aug 26, 2017

mbaijal pushed a commit to mbaijal/incubator-mxnet that referenced this pull request Sep 6, 2017

entire codebase build with mshadow_use_clas=0 (apache#7625)

d30ca59

cjolivier01 pushed a commit to cjolivier01/mxnet that referenced this pull request Sep 11, 2017

entire codebase build with mshadow_use_clas=0 (apache#7625)

b40059f

crazy-cat pushed a commit to crazy-cat/incubator-mxnet that referenced this pull request Oct 26, 2017

entire codebase build with mshadow_use_clas=0 (apache#7625)

6a07598
