
[Contrib] Add MKL DNN option #4323

Merged
merged 3 commits into from
Nov 15, 2019

Conversation

icemelon
Member

Thanks for contributing to TVM! Please refer to guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.

list(APPEND RUNTIME_SRCS ${CBLAS_CONTRIB_SRC})
add_definitions(-DUSE_MKL_DNN=1)
message(STATUS "Use MKL DNN library " ${BLAS_LIBRARY_MKLDNN})
endif()
Member

What's the strategy if both USE_BLAS AND USE_MKL_DNN are set?

Member Author

@icemelon icemelon Nov 13, 2019

Updated the logic here. MKL-DNN will only be used when USE_BLAS is not none. When both are set, MKL-DNN will only be used in the sgemm op, since the library has limited support for the BLAS operators. I also find that the MKL-DNN kernel achieves better performance than MKL in TVM.
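For context, both MKL's cblas_sgemm and DNNL's dnnl_sgemm implement the same BLAS single-precision GEMM contract, which is why they are interchangeable behind the dense op. A minimal numpy sketch of that contract (illustrative only, not TVM or library code):

```python
import numpy as np

def sgemm(alpha, A, B, beta, C):
    """Reference semantics of BLAS sgemm: C := alpha * A @ B + beta * C, in float32."""
    return (alpha * (A @ B) + beta * C).astype(np.float32)

A = np.ones((2, 3), dtype=np.float32)
B = np.ones((3, 4), dtype=np.float32)
C = np.zeros((2, 4), dtype=np.float32)
out = sgemm(1.0, A, B, 0.0, C)
print(out[0, 0])  # 3.0
```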

Member

Another minor nit is to change USE_MKL_DNN to either USE_DNNL or USE_MKLDNN. The first one aligns with the renaming trend of the library. The second one follows the coding convention in MKL-DNN before renaming and in other projects like MXNet.

Member Author

fixed

Member

Thanks for the quick turnaround.

@@ -31,6 +31,9 @@ extern "C" {
#else
#include <cblas.h>
#endif
#if USE_MKL_DNN == 1
Contributor

@ZhennanQin ZhennanQin Nov 13, 2019

I think the code logic here needs a small change as well:

#if USE_MKL_BLAS == 1
#include <mkl_cblas.h>
#elif USE_MKL_DNN == 1
#include <dnnl.h>
#else
#include <cblas.h>
#endif

@ZhennanQin As @icemelon9 mentioned here, we need both cblas and dnnl because the latter is used for sgemm only.
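In other words, dnnl.h ends up as an additional include rather than a replacement for cblas.h, since DNNL only backs sgemm while the remaining BLAS calls still go through CBLAS. A sketch of the resulting header selection (illustrative; macro names follow the discussion above and may not match the final diff):

#if USE_MKL_BLAS == 1
#include <mkl_cblas.h>   /* MKL's CBLAS interface */
#else
#include <cblas.h>       /* generic CBLAS, still needed for the non-sgemm ops */
#endif
#if USE_DNNL == 1
#include <dnnl.h>        /* DNNL, used for sgemm only */
#endif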

Contributor

@soiferj soiferj left a comment

LGTM. Do you have perf comparisons for using MKL DNN's sgemm vs MKL's sgemm?

@@ -32,7 +32,7 @@ def _declaration_dense(cfg, data, weight, bias=None, out_dtype=None):
if "cblas" in target.libs:
C = cblas.matmul(data, weight, False, True)
if bias is not None:
-            C = tvm.compute(C.shape, lambda i, j: C[i, j] + bias[j],
+            C = tvm.compute(C.shape, lambda i, j: C[i, j] + bias[j].astype(out_dtype),
Contributor

This change doesn't seem to be related to MKL-DNN?

Member Author

Yes, this is to fix a bug when using the cblas library.
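For illustration, the bug is a dtype mismatch between the matmul output and the bias: tvm.compute generally requires both operands of `+` to share a dtype, so the bias needs an explicit cast to out_dtype. A numpy analogue of the mismatch (numpy silently promotes instead of erroring; the dtypes here are hypothetical):

```python
import numpy as np

# Illustrative only (not TVM code): the dense output may be float16
# while the bias tensor is float32.
C = np.zeros((2, 2), dtype=np.float16)
bias = np.ones(2, dtype=np.float32)

mixed = C + bias                        # numpy silently promotes to float32
fixed = C + bias.astype(np.float16)     # explicit cast preserves out_dtype

print(mixed.dtype, fixed.dtype)  # float32 float16
```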

Contributor

@soiferj soiferj left a comment

I think we should still build with MKL-DNN if USE_MKL_DNN is ON and USE_BLAS is OFF, so we can support other MKL-DNN ops.

Edit: nvm, looks like that’s the current logic

@icemelon
Member Author

@soiferj No. If USE_BLAS is OFF, we won't compile the MKL-DNN ops. See here in the cmake:
https://github.com/apache/incubator-tvm/pull/4323/files#diff-9e0c77365976363b16f3bdc58f95fb38R59

One proposal is that I remove (NOT USE_BLAS STREQUAL "none") from the condition, but don't add ${CBLAS_CONTRIB_SRC} to the source list when USE_BLAS is OFF. Then, if we have a conv2d contrib using MKL-DNN in the future, we can still include that file in the source list.

How does this sound?
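A sketch of what that proposal might look like in the cmake module (variable names follow the snippet quoted earlier in this thread; this is illustrative, not the exact diff):

```cmake
# Illustrative sketch: build the MKL-DNN contrib even when USE_BLAS is OFF,
# but only pull in the cblas contrib sources when a BLAS backend is enabled.
if(USE_MKLDNN STREQUAL "ON")
  find_library(BLAS_LIBRARY_MKLDNN dnnl)
  list(APPEND TVM_RUNTIME_LINKER_LIBS ${BLAS_LIBRARY_MKLDNN})
  add_definitions(-DUSE_DNNL=1)
  if(NOT USE_BLAS STREQUAL "none")
    list(APPEND RUNTIME_SRCS ${CBLAS_CONTRIB_SRC})
  endif()
  message(STATUS "Use MKL-DNN library " ${BLAS_LIBRARY_MKLDNN})
endif()
```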

@soiferj
Contributor

soiferj commented Nov 14, 2019

I see, I think I was looking at an old version. I think that solution sounds good.

@tqchen tqchen merged commit 72821b2 into apache:master Nov 15, 2019
@tqchen
Member

tqchen commented Nov 15, 2019

Thanks @icemelon9 @TaoLv @soiferj @gasgallo @minminsun @ZhennanQin !

zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Nov 15, 2019
* [Contrib] Add MKL DNN

* update

* update
@icemelon icemelon deleted the mkldnn branch November 22, 2019 00:02
kevinthesun pushed a commit to neo-ai/tvm that referenced this pull request Nov 25, 2019
* [TOPI][OP] Support Faster-RCNN Proposal OP on CPU (apache#4297)

* Support Proposal operator on CPU.

* PyLint space issue

* PyLint space issue

* Pylint singleton-comparison issue

* [QNN][Legalize] Specialize for Platforms without any fast Int8 arithmetic units. (apache#4307)

* fix error when memory_id is VTA_MEM_ID_OUT (apache#4330)

* [CI][DOCKER] Add ONNX runtime dep (apache#4314)

* [DOCKER] Add ONNX runtime dep

* Improve ci script

* [QNN] Quantize - Fixing the sequence of lowering. (apache#4316)

* [QNN] Use Int16 upcast in Fallback Conv2D. Fix test names. (apache#4329)

* [doc][fix] fix sphinx parsing for pass infra tutorial (apache#4337)

* change ci image version (apache#4313)

* [Codegen] remove fp16 function override for cuda  (apache#4331)

* add volatile override back

* [codegen] remove fp16 function override for cuda

* [CI] Set workspace to be per executor (apache#4336)

* [Build][Windows] Fix Windows build by including cctype (apache#4319)

* Fix build

* dummy change to retrigger CI

* dummy change to retrigger ci

* dummy change to retrigger ci

* Enable hipModuleGetGlobal() (apache#4321)

* [Relay][Pass] Add pass to remove unused functions in relay module (apache#4334)

* [Relay][Pass] Add pass to remove unused functions in relay module

* Add tests

* Fix lint

* Fix visit order

* Add pass argument

* Fix

* Add support for quant. mul operator in tflite frontend (apache#4283)

A test for qnn_mul has to be added when the qnn elemwise tests (apache#4282) get merged.

* Add topi.nn.fifo_buffer to TVM doc (apache#4343)

* Solve custom model of prelu (apache#4326)

* Deprecate NNVM warning msg (apache#4333)

* [Contrib] Add MKL DNN option (apache#4323)

* [Contrib] Add MKL DNN

* update

* update

* [Relay][Frontend][TF] Fix transpose when axes is not a param (apache#4327)

* [Relay][Frontend][TF] Use _infer_value_simulated when axes is not a const to Transpose

* uncomment tests

* dummy change to retrigger ci

* [RUNTIME] Add device query for AMD GcnArch (apache#4341)

* add gcnArch query

* kGcnArch query for cuda is a no-op

* [Test][Relay][Pass] Add test case for lambda lift (apache#4317)

* [Relay][Frontend][ONNX] operator support: DepthToSpace, SpaceToDepth (apache#4271)

* imp module is deprecated (apache#4275)

* [VTA] Bug fix for padded load with large inputs (apache#4293)

* bug fix for padded load with large inputs

* Update TensorLoad.scala

* Update test_vta_insn.py

* fix inconsistent tag name (apache#4134)

* [CodeGen] Add build config option disable_assert to control whether to generate assert (apache#4340)

* Bump up CUDA log version in tophub.py (apache#4347)

* Add check to ensure input file was successfully opened in NNVM deploy code demo (apache#4315)

* [COMMUNITY] Add DISCLAIMER, KEYS for ASF release (apache#4345)

* [COMMUNITY] Add DISCLAIMER, KEYS for ASF release

* Add file name spec

* [Relay][VM][Interpreter] Enable first-class constructors in VM and interpreter via eta expansion (apache#4218)

* Fix constructor pretty printing

* Make Module::HasDef name consistent with API

* Add VM constructor compilation via eta expansion

* Lint

* Fix CI

* Fix failing test

* Address comment

* Retrigger CI

* Retrigger CI

* Update dmlc_tvm_commit_id.txt