This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

fixed config.mk and Makefile bugs for installing mkl #15424

Merged
merged 2 commits into from
Jul 19, 2019

Conversation

nuslq
Contributor

@nuslq nuslq commented Jul 1, 2019

Description

UPDATES:
Based on the discussion below with @TaoLv, I have removed the comments for USE_BLAS=mkl from this PR: after testing, we found that MKL BLAS can be statically linked by adding USE_BLAS=mkl to the make command line.

UPDATES:
I tried different ways of building mxnet from source, leaving out one or both of the two changes described below. Only changing the lower-case "use_blas" to upper case in the Makefile makes it possible to build mxnet with "BLAS_MKL". The other change, adding "USE_BLAS = mkl" before lines 137 - 142, has no effect for this purpose (although I am not sure whether it has an effect elsewhere).

One thing to clarify here is that I used "make" instead of "cmake" to build mxnet from source.

(Brief description on what this PR is about)
I have found two bugs in the Makefile and make/config.mk files of the apache/incubator-mxnet package. These bugs might lead to an incorrect installation of MXNet with the MKL library.

Before the changes, I could not find "BLAS_MKL" in the output of mxnet.runtime.Features(). After rebuilding mxnet with the following changes, "BLAS_MKL" appears in that output.

In the make/config.mk file, USE_BLAS will be first set to "atlas" for linux or "apple" for osx by default as follows (lines 119 - 124)

UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Darwin)
USE_BLAS = apple
else
USE_BLAS = atlas
endif

Later, USE_STATIC_MKL is set to "NONE" if users have not set USE_BLAS to "mkl" before the following lines (lines 137 - 142). This would not be corrected even if users set "USE_BLAS=mkl" at the end of the config.mk file or in the command line. (New tests show that the USE_BLAS value in config.mk, and therefore the USE_STATIC_MKL value, can be overridden by adding USE_BLAS=mkl to the make command line.)

# If use MKL only for BLAS, choose static link automatically to allow python wrapper
ifeq ($(USE_BLAS), mkl)
USE_STATIC_MKL = 1
else
USE_STATIC_MKL = NONE
endif

So I changed this block to the following (this change was later removed for the reason described above):

# If use MKL only for BLAS, choose static link automatically to allow python wrapper
# Please Note: You have to set USE_BLAS = mkl here if you want to build mxnet with mkl. Otherwise USE_STATIC_MKL will be set to NONE. 
# USE_BLAS = mkl 
ifeq ($(USE_BLAS), mkl)
USE_STATIC_MKL = 1
else
USE_STATIC_MKL = NONE
endif
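
As a rough check of the command-line behaviour noted above, the sketch below is a hypothetical standalone Makefile (not part of MXNet) that mirrors the default assignment and the USE_STATIC_MKL check quoted from config.mk. It assumes GNU Make; the `all` target and the `$(info ...)` line are made up for illustration.

```make
# Hypothetical standalone Makefile for illustration only; not part of MXNet.
# It mirrors the default and the USE_STATIC_MKL check quoted from config.mk.
USE_BLAS = atlas

ifeq ($(USE_BLAS), mkl)
USE_STATIC_MKL = 1
else
USE_STATIC_MKL = NONE
endif

# Report the values that the rest of a build would see.
$(info USE_BLAS is $(USE_BLAS) and USE_STATIC_MKL is $(USE_STATIC_MKL))

# Empty recipe so that make has a target to process.
all: ;
```

Running plain `make` on this file should report `atlas` and `NONE`, while `make USE_BLAS=mkl` should report `mkl` and `1`: GNU Make gives a command-line variable precedence over an ordinary assignment in the makefile unless the `override` directive is used.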

In the Makefile, lines 247 - 255 are shown below; I changed the lower-case "use_blas" to upper case ("USE_BLAS").

ifeq ($(use_blas), open)
	CFLAGS += -DMXNET_USE_BLAS_OPEN=1
else ifeq ($(use_blas), atlas)
	CFLAGS += -DMXNET_USE_BLAS_ATLAS=1
else ifeq ($(use_blas), mkl)
	CFLAGS += -DMXNET_USE_BLAS_MKL=1
else ifeq ($(use_blas), apple)
	CFLAGS += -DMXNET_USE_BLAS_APPLE=1
endif
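
The effect of the lower-case name can be reproduced in isolation. The following is a minimal sketch, assuming GNU Make, in a hypothetical standalone Makefile; the messages and the `all` target are made up for illustration and do not appear in MXNet.

```make
# Hypothetical standalone Makefile for illustration only; not part of MXNet.
USE_BLAS = mkl

# Variable names in make are case sensitive, so use_blas is a different
# variable. It is never defined here and expands to the empty string.
ifeq ($(use_blas), mkl)
$(info lower-case check matched)
else
$(info lower-case check did not match)
endif

ifeq ($(USE_BLAS), mkl)
$(info upper-case check matched)
endif

# Empty recipe so that make has a target to process.
all: ;
```

With this file, `make` should print that the lower-case check did not match and that the upper-case check matched, which is consistent with none of the `-DMXNET_USE_BLAS_*` flags being added by the block above.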

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • In the make/config.mk file, add # USE_BLAS = mkl and comments for setting correct USE_STATIC_MKL as described in Description section
  • In the Makefile, lines 247 - 255, changed the lower-case "use_blas" to upper case

Comments

  • This change should be backward compatible.
  • Don't see any edge cases

@nuslq nuslq requested a review from szha as a code owner July 1, 2019 18:44
@szha
Member

szha commented Jul 1, 2019

@larroy

Contributor

@larroy larroy left a comment

Thanks for fixing this, LGTM. As I understand, variables in Makefile are case sensitive, this PR makes sense to me.

@larroy
Contributor

larroy commented Jul 1, 2019

One question about your description: where do we define USE_STATIC_MKL? I see it defined ad hoc in several files, but not in the Makefile or config.mk. Could you edit the description to clarify where USE_STATIC_MKL is updated, or add a comment in the Makefile, config.mk, or the documentation? I didn't see it in the patch; maybe I missed something.

$ ag USE_STATIC_MKL |less
make/readthedocs.mk:54:USE_STATIC_MKL = NONE
make/readthedocs.mk:74: USE_STATIC_MKL = 1
make/maven/maven_linux_mkl.mk:124:USE_STATIC_MKL = 1
make/maven/maven_linux_mkl.mk:126:USE_STATIC_MKL = NONE
make/maven/maven_linux_cu90mkl.mk:127:USE_STATIC_MKL = 1
make/maven/maven_linux_cu90mkl.mk:129:USE_STATIC_MKL = NONE
make/maven/maven_linux_cu92mkl.mk:127:USE_STATIC_MKL = 1
make/maven/maven_linux_cu92mkl.mk:129:USE_STATIC_MKL = NONE
make/maven/maven_darwin_mkl.mk:124:USE_STATIC_MKL = 1
make/maven/maven_darwin_mkl.mk:126:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu91.mk:127:USE_STATIC_MKL = 1
make/pip/pip_linux_cu91.mk:129:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu80mkl.mk:111:USE_STATIC_MKL = 1
make/pip/pip_linux_cu80mkl.mk:113:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu80.mk:127:USE_STATIC_MKL = 1
make/pip/pip_linux_cu80.mk:129:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu90.mk:127:USE_STATIC_MKL = 1
make/pip/pip_linux_cu90.mk:129:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu91mkl.mk:111:USE_STATIC_MKL = 1
make/pip/pip_linux_cu91mkl.mk:113:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu101.mk:127:USE_STATIC_MKL = 1
make/pip/pip_linux_cu101.mk:129:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu75mkl.mk:108:USE_STATIC_MKL = 1
make/pip/pip_linux_cu75mkl.mk:110:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu100mkl.mk:111:USE_STATIC_MKL = 1
make/pip/pip_linux_cu100mkl.mk:113:USE_STATIC_MKL = NONE
make/pip/pip_linux_cpu.mk:124:USE_STATIC_MKL = 1
make/pip/pip_linux_cpu.mk:126:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu75.mk:124:USE_STATIC_MKL = 1
make/pip/pip_linux_cu75.mk:126:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu100.mk:127:USE_STATIC_MKL = 1
make/pip/pip_linux_cu100.mk:129:USE_STATIC_MKL = NONE
make/pip/pip_darwin_mkl.mk:108:USE_STATIC_MKL = 1
make/pip/pip_darwin_mkl.mk:110:USE_STATIC_MKL = NONE
make/pip/pip_linux_mkl.mk:108:USE_STATIC_MKL = 1
make/pip/pip_linux_mkl.mk:110:USE_STATIC_MKL = NONE
make/pip/pip_darwin_cpu.mk:124:USE_STATIC_MKL = 1
make/pip/pip_darwin_cpu.mk:126:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu92.mk:127:USE_STATIC_MKL = 1
make/pip/pip_linux_cu92.mk:129:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu92mkl.mk:111:USE_STATIC_MKL = 1
make/pip/pip_linux_cu92mkl.mk:113:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu90mkl.mk:111:USE_STATIC_MKL = 1
make/pip/pip_linux_cu90mkl.mk:113:USE_STATIC_MKL = NONE
make/pip/pip_linux_cu101mkl.mk:111:USE_STATIC_MKL = 1
make/pip/pip_linux_cu101mkl.mk:113:USE_STATIC_MKL = NONE
make/crosscompile.jetson.mk:130:USE_STATIC_MKL = 1
make/crosscompile.jetson.mk:132:USE_STATIC_MKL = NONE
3rdparty/mshadow/make/mshadow.mk:87:ifneq ($(USE_STATIC_MKL), NONE)

@anirudhacharya
Member

@mxnet-label-bot add [pr-awaiting-review]

@TaoLv
Member

TaoLv commented Jul 2, 2019

@nuslq Have you ever tried USE_BLAS=mkl in the make command line? I would expect MKL to be statically linked in this case.

@nuslq
Contributor Author

nuslq commented Jul 2, 2019

@larroy, USE_STATIC_MKL is updated in config.mk between lines 137 - 142, which I had put in the description.

@nuslq
Contributor Author

nuslq commented Jul 2, 2019

@TaoLv, Thanks for your suggestion!

After further experiments, I found that changing the lower-case "use_blas" to upper case in the Makefile makes it possible to build mxnet with "BLAS_MKL". The other change, setting "USE_BLAS=mkl" in different places (e.g. in the command line, at the end of config.mk, or before lines 137 - 142 of config.mk), has no effect for this purpose.

More specifically, without changing the lower-case "use_blas" to upper case in the Makefile, adding USE_BLAS=mkl to the make command line does not build mxnet with "BLAS_MKL".

One thing to clarify here is that I used "make" instead of "cmake" to build mxnet from source.

@TaoLv
Member

TaoLv commented Jul 3, 2019

I think the lower-case BLAS check in the Makefile was added for runtime feature detection. It's not really used for BLAS linkage. Please correct me if I'm wrong, @larroy.
I tried USE_BLAS=mkl in the make command line: I can find the MKL .a files in the link line, and there are no MKL .so files in the ldd output. Can you double check, @nuslq?

@larroy
Contributor

larroy commented Jul 3, 2019

Shouldn't USE_STATIC_MKL appear in the Makefile directly then?

@TaoLv
Member

TaoLv commented Jul 3, 2019

@larroy it's passed to and used in mshadow.mk.

@larroy
Contributor

larroy commented Jul 3, 2019

@TaoLv yes, it seems I introduced this, and it was not correct due to case, apologies. This patch fixes that; I verified on mac and approved the PR. About MKL, I think you are the best one to check that part. Since there's no real change to the former, I already approved and verified. Thanks for the fix!

@nuslq
Contributor Author

nuslq commented Jul 3, 2019

@TaoLv,
I built mxnet using the command line "make -j $(nproc) USE_BLAS=mkl", and got the following from "ldd lib/libmxnet.so | grep mkl":

libmkldnn.so.0 => /home/ubuntu/incubator-mxnet/lib/libmkldnn.so.0
libmklml_intel.so => /home/ubuntu/incubator-mxnet/lib/libmklml_intel.so

@TaoLv
Member

TaoLv commented Jul 4, 2019

Thank you for confirming. @larroy

@nuslq, this is as expected. libmkldnn.so and libmklml_intel.so will be dynamically linked even if USE_STATIC_MKL is true. Actually, they are not the MKL in USE_STATIC_MKL. Please refer to the logic here: https://github.com/dmlc/mshadow/blob/master/make/mshadow.mk#L87

  • USE_BLAS=mkl and USE_STATIC_MKL control how MKL BLAS is linked. They can be used even with USE_MKLDNN=0 (in this case, you will not see mkldnn.so and mklml_intel.so in ldd).
  • mklml_intel.so is introduced along with USE_MKLDNN=1 to improve the performance of mkldnn.so. mklml_intel cannot be statically linked as no .a file is provided.

Hope this explanation can address your questions. So the fix for lower case use_blas looks good. But the change and description for USE_STATIC_MKL does not make sense to me.

@roywei
Member

roywei commented Jul 8, 2019

@mxnet-label-bot add [MKL, installation]

@yugoren

yugoren commented Jul 8, 2019

Thanks for the explanation @TaoLv. Here are some questions inlined.

> @nuslq, this is as expected. libmkldnn.so and libmklml_intel.so will be dynamically linked even USE_STATIC_MKL is true. Actually, they are not the MKL in USE_STATIC_MKL. Please refer to the logic here: https://github.com/dmlc/mshadow/blob/master/make/mshadow.mk#L87

As far as we can trace, the MKL usage flag MSHADOW_USE_MKL is set only in the CMake files, not in the Makefile. Could you confirm that?

>   • USE_BLAS=mkl and USE_STATIC_MKL are for how to link MKL BLAS. They can be used even USE_MKLDNN=0 (in this case, you will not see the mkldnn.so and mklml_intel.so in ldd).
>
>   • mklml_intel.so is introduced along with USE_MKLDNN=1 to improve the performance of mkldnn.so. mklml_intel cannot be statically linked as no .a file is provided.
>
> Hope this explanation can address your questions. So the fix for lower case use_blas looks good. But the change and description for USE_STATIC_MKL does not make sense to me.

I feel there is more to understand about which flags (compiler and linker flags) are strictly necessary for MXNet to compile with MKL BLAS support, using make or cmake. Moreover, I cannot find the compiler definition MSHADOW_USE_MKL anywhere in the mshadow code, only in the MXNet operators, and it is never set by the Makefile (as far as I can trace). This was one of the reasons we wanted to comment on USE_STATIC_MKL, since we couldn't find where the aforementioned compiler definition is set.

Could you or @larroy clarify what we're missing here?

@yugoren

yugoren commented Jul 9, 2019

> @yugoren Please find the definition of MSHADOW_USE_MKL at https://github.com/dmlc/mshadow/blob/master/mshadow/base.h#L88 and https://github.com/dmlc/mshadow/blob/master/make/mshadow.mk#L101

Thanks, that clears up how MKL is enabled by default in mshadow! Also, going back to the initial concern @TaoLv had about the comment we added: even if we pass USE_BLAS=mkl to make, the lines that @nuslq pointed out override the variable:

UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Darwin)
USE_BLAS = apple
else
USE_BLAS = atlas
endif

That's why we felt it would be informative for the user to know exactly where to set the flag. Maybe we can change it to the following

UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Darwin)
USE_BLAS ?= apple
else
USE_BLAS ?= atlas
endif
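
For reference, `?=` is GNU Make's conditional assignment: it sets a variable only if the variable has no definition yet. A minimal sketch, again in a hypothetical standalone Makefile that is not part of MXNet (the `$(info ...)` line and the `all` target are made up for illustration):

```make
# Hypothetical standalone Makefile for illustration only; not part of MXNet.
# ?= assigns only when USE_BLAS is not defined yet, whether by an earlier
# assignment, the environment, or the make command line.
USE_BLAS ?= atlas

$(info USE_BLAS is $(USE_BLAS))

# Empty recipe so that make has a target to process.
all: ;
```

With this file, plain `make` should report `atlas`, and both `make USE_BLAS=mkl` and `USE_BLAS=mkl make` should report `mkl`. With the plain `=` form, the command-line form `make USE_BLAS=mkl` still yields `mkl`, but a value coming from the environment would be replaced by the assignment in the file unless make is run with `-e`.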

@larroy What do you think?

Another question: is MSHADOW_USE_MKL enabled in the mxnet build whenever mshadow/base.h is included (perhaps transitively)? Could you also clarify why the flag is explicitly set to 1 in cmake, while in make it is set only by including a header file? This was one of the reasons it was hard for me to track down how the flag is enabled on the mxnet side, and I was considering adding it to the compiler definitions, probably in a separate PR.

@larroy
Contributor

larroy commented Jul 9, 2019

Indeed, this is messy and we should improve it; it is difficult to reason about and track these flags. I suggest merging this PR, if no one has additional concerns, to credit the contribution, and opening additional PRs to improve the situation by clearly specifying the BLAS-related flags in the compilation files and documenting them.

@TaoLv
Member

TaoLv commented Jul 10, 2019

@yugoren Can you double check? I can statically link MKL BLAS library by adding USE_BLAS=mkl to the make command line.

@nuslq
Contributor Author

nuslq commented Jul 16, 2019

> @yugoren Can you double check? I can statically link MKL BLAS library by adding USE_BLAS=mkl to the make command line.

@TaoLv, Yes, we can statically link MKL BLAS library by adding USE_BLAS=mkl to the make command line. I have removed the comments for USE_BLAS=mkl in this PR.

Member

@TaoLv TaoLv left a comment

@nuslq thank you for the turn around. Approve now.
It would be better if you can also change the description of this PR:

> This would not be corrected even if users set "USE_BLAS=mkl" at the end of the config.mk file or in the command line.

@nuslq
Contributor Author

nuslq commented Jul 17, 2019

> @nuslq thank you for the turn around. Approve now.
> It would be better if you can also change the description of this PR:
>
> > This would not be corrected even if users set "USE_BLAS=mkl" at the end of the config.mk file or in the command line.

@TaoLv, updated the description

@TaoLv
Member

TaoLv commented Jul 19, 2019

@nuslq Thank you for the changes. It's now merged.

@TaoLv TaoLv merged commit 45c360c into apache:master Jul 19, 2019
anirudhacharya pushed a commit to anirudhacharya/mxnet that referenced this pull request Aug 20, 2019
* fixed config.mk and Makefile bugs for installing mkl

* remove comments from the previous changes
8 participants