Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix the incorrect MKLDNN/MKL logic in cmake #14829

Closed
wants to merge 8 commits into from

Conversation

yinghu5
Copy link
Contributor

@yinghu5 yinghu5 commented Apr 29, 2019

Fix the mxnet cmake build systems to make MKLDNN, MKL as BLAS logic correct.

3 files : incubator-mxnet\cmake\ChooseBLAS.cmake
\incubator-mxnet\cmake\Modules\FindMKL.cmake
and Update incubator-mxnet\3rdparty\mshadow\cmake\mshadow.cmake to set BLAS #374
are updated.

Description

(Brief description on what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • [ x] Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

fix the mxnet cmake build systems to make MKLDNN, MKL logic correct.
unlock MKLDNN and MKL dependency
@yinghu5 yinghu5 requested a review from szha as a code owner April 29, 2019 08:25
@yinghu5
Copy link
Contributor Author

yinghu5 commented Apr 29, 2019

We planned to update the 3 cmakefiles : ChooseBLAS.cmake, FindMKL.cmake and mshadow.cmake. The logic diagram is as the pdf. Red parts are original logic and green parts are new.

cmake_mxnet_fixed.pdf

  1. change use BLAS logic
  2. unlock the MKLDNN and MKL dependency
    MKL_mxnet

and we test both windows and linux,and get the build passed,
cmake_test.docx

thanks

@pengzhao-intel pengzhao-intel changed the title Yinghu5 cmake 1 Fix the incorrect MKLDNN/MKL logic in cmake Apr 29, 2019
@pengzhao-intel
Copy link
Contributor

@larroy @TaoLv @azai91

@pengzhao-intel
Copy link
Contributor

@yinghu5 do we need the build test in CI for these change?

@roywei
Copy link
Member

roywei commented Apr 29, 2019

@mxnet-label-bot add[CMake]

@marcoabreu marcoabreu added the CMake CMake related bugs/issues/improvements label Apr 29, 2019
@azai91
Copy link
Contributor

azai91 commented Apr 29, 2019

agree with pengzhao. we should add test to CI to check if flags link MKL and MKLDNN correctly.

@pengzhao-intel
Copy link
Contributor

@zixuanweeei please help verify the change in local

Add WIN_CPU_MKLDNN  WIN_CPU_MKLDNN_MKL
update for mkl/mkldnn cmake test
custom_steps.compile_windows_cpu_mkldnn(),
    custom_steps.compile_windows_cpu_mkldnn_mkl(),
    custom_steps.compile_windows_cpu_nomkldnn_mkl()
WIN_CPU_NOMKLDNN_MKL
@yinghu5
Copy link
Contributor Author

yinghu5 commented Apr 30, 2019

thank you @pengzhao-intel @azai91 , I try to add CI test first windows. and run into two issues.

  1. add WIN_CPU_MKLDNN. But it failed quickly at cmake configure .
    The C++ compiler
    "C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe"
    is not able to compile a simple test program.

It fails with the following output:
Change Dir: C:/jenkins_slave/workspace/build-cpu-mkldnn/build/CMakeFiles/CMakeTmp
Run Build Command:"jom" "/NOLOGO" "cmTC_14b0a\fast"
jom: parallel job execution disabled for Makefile

	C:\ProgramData\chocolatey\lib\jom\tools\jom.exe -f CMakeFiles\cmTC_14b0a.dir\build.make /nologo -L CMakeFiles\cmTC_14b0a.dir\build
  1. other configure like WIN_CPU_MKLDNN_MKL and WIN_CPU_NOMKLDNN_MKL can pass the steps. but here need MKL are installed in MKL_ROOT=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\mkl.
    Please help make sure the CI windows have MKL_ROOT installed.

@yinghu5 yinghu5 closed this Apr 30, 2019
@yinghu5 yinghu5 reopened this Apr 30, 2019
@pengzhao-intel
Copy link
Contributor

@marcoabreu @KellenSunderland @larroy to help for the CI issue.

@larroy
Copy link
Contributor

larroy commented May 1, 2019

#14028

@larroy
Copy link
Contributor

larroy commented May 1, 2019

Can you please adjust ci/build_windows.py with the changes required for your modifications?

@larroy
Copy link
Contributor

larroy commented May 1, 2019

Is the control flow chart autogenerated if so how? does it reflect our current state or where we want to be at?

@larroy
Copy link
Contributor

larroy commented May 1, 2019

@lebeg

@larroy
Copy link
Contributor

larroy commented May 1, 2019

How can we be of additional help? Is because MKL is not found in win? I guess MKL is not installed in our windows AMI. @Chancebair

@Chancebair
Copy link
Contributor

MKL is not installed via the AMI baking process

@lebeg
Copy link
Contributor

lebeg commented May 2, 2019

But mklml is downloaded via the cmake script, see here. And MKL_ROOT is set in this line.

@@ -43,55 +43,7 @@ endif()
# ---[ Root folders
set(INTEL_ROOT "/opt/intel" CACHE PATH "Folder contains intel libs")

if(USE_MKLDNN)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though now it is convoluted in 1 file there is a configuration difference between full MKL and MKLML. For example, the librt library is present only in full MKL, OpenMP library needs to be configured explicitly for MKLML. Previously, I made an attempt to separate it to 2 different files, maybe you can reuse some code from there #11148

Copy link
Contributor Author

@yinghu5 yinghu5 May 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Chancebair . @lebeg thank you a lot for the review. Yes, we actually need full MKL package install on CI machine. There are mainly two factors , which

  1. MKLDNN will don't depend on the MKLML in the future,
  2. second, but important, MXNET will use more MKL functions other than in MKLML.

So we decide to unlock MKLDNN (MKLML) and MKL dependency, which means,
use MKLDNN and USE_BLAS=mkl completely independent.

  1. MKLDNN is on, it will download MKLML automatically. doesn't use full MKL package.
  2. when USE_BLAS=mkl, only use the full MKL. (MKL_ROOT). won't detect MKLML (from MKLDNN).

Thanks

@pengzhao-intel pengzhao-intel mentioned this pull request May 4, 2019
7 tasks
@TaoLv
Copy link
Member

TaoLv commented May 4, 2019

I remember we have build and test for USE_BLAS=mkl in CI. MKL is installed through apt-get on ubuntu: https://github.com/apache/incubator-mxnet/blob/master/ci/docker/install/ubuntu_mkl.sh#L28
Not sure how is it installed on Windows.

@TaoLv
Copy link
Member

TaoLv commented May 5, 2019

@Chancebair @marcoabreu Can you help to install a full MKL library for window CI? As showed in the changes of this PR, usually full MKL will be installed to C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\mkl on windows. You can get the executable here: http://registrationcenter-download.intel.com/akdlm/irc_nas/tec/15247/w_mkl_2019.3.203.exe

@yinghu5 yinghu5 closed this May 5, 2019
@yinghu5 yinghu5 deleted the yinghu5-cmake-1 branch May 5, 2019 06:32
@yinghu5
Copy link
Contributor Author

yinghu5 commented May 6, 2019

@larroy @lebeg @pengzhao-intel @TaoLv @Chancebair @marcoabreu . sorry i close the PR by accident when rebase my branch , and reopen it into #14877. please help to review there. thanks

@yinghu5
Copy link
Contributor Author

yinghu5 commented May 6, 2019

Is the control flow chart autogenerated if so how? does it reflect our current state or where we want to be at?

No, we draw it manually :)

@yinghu5
Copy link
Contributor Author

yinghu5 commented May 6, 2019

Can you please adjust ci/build_windows.py with the changes required for your modifications?

@larroy yes, we also adjust the ci in new #14877. thanks

@yinghu5
Copy link
Contributor Author

yinghu5 commented May 6, 2019

#14028 cmake and build OpenBLAS

@larroy, in the PR, we only change the MKLDNN and MKL part, and keep the OpenBLAS untouched.

About #14028 agree the latest solution, I will update the install guide of windows

  1. If you don't have the Intel Math Kernel Library (MKL) installed, download and install OpenBLAS. Note that you should also download ```mingw64.dll.zip`` along with openBLAS and add them to PATH.
    make sure the OpenBL to point to use version 0.3.6 instead.

@Chancebair
Copy link
Contributor

@TaoLv I created a ticket with my teams oncall to look into this

@pengzhao-intel
Copy link
Contributor

Please continuous review on #14877

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CMake CMake related bugs/issues/improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants