This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Commit

[MKLDNN] Enable subgraph backend mkldnn by default. (#15518)
* Enable subgraph backend mkldnn by default

* Fix lint

* fix ut

* fix scala test

* Fix UT

* Fix lint

* Run CI

* Run CI

* Support MXNET_MKLDNN_ENABLED

* Fix merge

* Run CI
ZhennanQin authored and pengzhao-intel committed Jul 26, 2019
1 parent b00bb81 commit e98fea3
Showing 19 changed files with 504 additions and 291 deletions.
3 changes: 0 additions & 3 deletions cpp-package/example/inference/README.md
@@ -41,7 +41,6 @@ The following performance numbers are collected via using C++ inference API on A
```
export KMP_AFFINITY=granularity=fine,noduplicates,compact,1,0
export OMP_NUM_THREADS=$(vCPUs/2)
export MXNET_SUBGRAPH_BACKEND=MKLDNN
export MXNET_ENGINE_TYPE=NaiveEngine
```
Also users are recommended to use ```numactl``` or ```taskset``` to bind a running process to the specified cores.
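
For example, a hypothetical `numactl` invocation that pins the benchmark to one socket (the core and memory-node IDs below are illustrative and machine-specific, not part of this diff):

```
# Pin the inference binary to cores 0-27 and their local memory node (example IDs; adjust to your machine).
numactl --physcpubind=0-27 --membind=0 ./imagenet_inference --symbol_file "./model/resnet50_v1-symbol.json" --params_file "./model/resnet50_v1-0000.params" --dataset "./data/val_256_q90.rec" --batch_size 64
```
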
@@ -87,8 +86,6 @@ Follow the below steps to do inference with more models.

The command lines below show how to run inference with the FP32/INT8 resnet50_v1 model. The C++ inference script provides almost the same command line as this [Python script](https://github.com/apache/incubator-mxnet/blob/master/example/quantization/imagenet_inference.py), so users can easily go from Python to C++.
```
# set MKLDNN as subgraph backend
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# FP32 inference
./imagenet_inference --symbol_file "./model/resnet50_v1-symbol.json" --params_file "./model/resnet50_v1-0000.params" --dataset "./data/val_256_q90.rec" --rgb_mean "123.68 116.779 103.939" --rgb_std "58.393 57.12 57.375" --batch_size 64 --num_skipped_batches 50 --num_inference_batches 500
3 changes: 2 additions & 1 deletion docs/faq/env_var.md
@@ -307,9 +307,10 @@ If ctypes is used, it must be `mxnet._ctypes.ndarray.NDArrayBase`.
- This variable controls how many CuDNN dropout state resources to create for each GPU context for use in operator.

* MXNET_SUBGRAPH_BACKEND
- Values: String ```(default="")```
- Values: String ```(default="MKLDNN")``` if MKLDNN is avaliable, otherwise ```(default="")```
- This variable controls the subgraph partitioning in MXNet.
- This variable is used to perform MKL-DNN FP32 operator fusion and quantization. Please refer to the [MKL-DNN operator list](../tutorials/mkldnn/operator_list.md) for how this variable is used and the list of fusion passes.
- Set ```MXNET_SUBGRAPH_BACKEND=NONE``` to disable subgraph backend.

* MXNET_SAFE_ACCUMULATION
  - Values: 0(false) or 1(true) ```(default=0)```
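
As an illustration of the new behaviour (not part of the diff itself; the values are the ones documented above), the variable can now be left unset, forced on, or forced off:

```
# With an MKL-DNN build, subgraph fusion is on by default, so no export is needed.
export MXNET_SUBGRAPH_BACKEND=MKLDNN   # force-enable, e.g. on builds where it is still opt-in
export MXNET_SUBGRAPH_BACKEND=NONE     # disable the subgraph backend entirely
```
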
11 changes: 10 additions & 1 deletion docs/tutorials/c++/subgraphAPI.md
@@ -111,7 +111,15 @@ There are 2 built-in attributes that are used by the MXNet executor.

`inference_only` : bool, apply this property only for inference. Property will be skipped when need_grad=True. Default `false` if this attribute isn't defined.

After defining the subgraph property, we need to register it in .cc file.
After defining the subgraph property, we need to register it under a backend in a .cc file.

First, we need to register the backend:

```C++
MXNET_REGISTER_SUBGRAPH_BACKEND(SgTest);
```
Then register the property under it.
```C++
MXNET_REGISTER_SUBGRAPH_PROPERTY(SgTest, SgProperty);
```

@@ -124,6 +132,7 @@ It's possible to register multiple properties for the same backend. In practice, we
#include "SgProperty2.h" // Define SgProperty2 class
#include "SgProperty3.h" // Define SgProperty3 class

MXNET_REGISTER_SUBGRAPH_BACKEND(SgTest);
MXNET_REGISTER_SUBGRAPH_PROPERTY(SgTest, SgProperty); // Execution order 1.
MXNET_REGISTER_SUBGRAPH_PROPERTY(SgTest, SgProperty2); // Execution order 2.
MXNET_REGISTER_SUBGRAPH_PROPERTY(SgTest, SgProperty3); // Execution order 3.
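
As a hedged usage sketch (not part of this diff), once a backend such as the hypothetical `SgTest` above is registered, it can be selected at runtime through the `MXNET_SUBGRAPH_BACKEND` environment variable that this commit now defaults to `MKLDNN`:

```
# Ask MXNet to partition graphs with the custom SgTest backend instead of the default.
export MXNET_SUBGRAPH_BACKEND=SgTest
```
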
18 changes: 9 additions & 9 deletions docs/tutorials/mkldnn/MKLDNN_README.md
@@ -103,7 +103,7 @@ LIBRARY_PATH=$(brew --prefix llvm)/lib/ make -j $(sysctl -n hw.ncpu) CC=$(brew -
<h2 id="3">Windows</h2>

On Windows, you can use [Microsoft Visual Studio 2015](https://www.visualstudio.com/vs/older-downloads/) and [Microsoft Visual Studio 2017](https://www.visualstudio.com/downloads/) to compile MXNet with Intel MKL-DNN.
[Microsoft Visual Studio 2015](https://www.visualstudio.com/vs/older-downloads/) is recommended.

**Visual Studio 2015**

@@ -113,8 +113,8 @@ To build and install MXNet yourself, you need the following dependencies. Instal
2. Download and Install [CMake 3](https://cmake.org/files/v3.14/cmake-3.14.0-win64-x64.msi) if it is not already installed.
3. Download [OpenCV 3](https://sourceforge.net/projects/opencvlibrary/files/3.4.5/opencv-3.4.5-vc14_vc15.exe/download), and unzip the OpenCV package, set the environment variable ```OpenCV_DIR``` to point to the ```OpenCV build directory``` (e.g.,```OpenCV_DIR = C:\opencv\build ```). Also, add the OpenCV bin directory (```C:\opencv\build\x64\vc14\bin``` for example) to the ``PATH`` variable.
4. If you have Intel Math Kernel Library (Intel MKL) installed, set ```MKL_ROOT``` to point to ```MKL``` directory that contains the ```include``` and ```lib```. If you want to use MKL blas, you should set ```-DUSE_BLAS=mkl``` when cmake. Typically, you can find the directory in ```C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\mkl```.
5. If you don't have the Intel Math Kernel Library (MKL) installed, download and install [OpenBLAS](http://sourceforge.net/projects/openblas/files/v0.2.14/), or build the latest version of OpenBLAS from source. Note that you should also download ```mingw64.dll.zip``` along with openBLAS and add them to PATH.
6. Set the environment variable ```OpenBLAS_HOME``` to point to the ```OpenBLAS``` directory that contains the ```include``` and ```lib``` directories. Typically, you can find the directory in ```C:\Downloads\OpenBLAS\```.

After you have installed all of the required dependencies, build the MXNet source code:

@@ -123,17 +123,17 @@ After you have installed all of the required dependencies, build the MXNet sourc
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd C:\incubator-mxnet
```
2. Enable Intel MKL-DNN by -DUSE_MKLDNN=1. Use [CMake 3](https://cmake.org/) to create a Visual Studio solution in ```./build```. Make sure to specify the architecture in the
command:
```
>mkdir build
>cd build
>cmake -G "Visual Studio 14 Win64" .. -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=open -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release
```
3. Enable Intel MKL-DNN and Intel MKL as BLAS library by the command:
```
>"C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\mkl\bin\mklvars.bat" intel64
>cmake -G "Visual Studio 14 Win64" .. -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=mkl -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release -DMKL_ROOT="C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\mkl"
>cmake -G "Visual Studio 14 Win64" .. -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=mkl -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release -DMKL_ROOT="C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\mkl"
```
4. After the CMake successfully completed, in Visual Studio, open the solution file ```.sln``` and compile it, or compile the MXNet source code by using following command:
```r
@@ -154,7 +154,7 @@ User can follow the same steps of Visual Studio 2015 to build MXNET with MKL-DNN

<h2 id="4">Verify MXNet with python</h2>

Preinstall python and some dependent modules:
```
pip install numpy graphviz
set PYTHONPATH=[workdir]\incubator-mxnet\python
@@ -261,7 +261,7 @@ MKL_VERBOSE SGEMM(T,N,12,10,8,0x7f7f927b1378,0x1bc2140,8,0x1ba8040,8,0x7f7f927b1

<h2 id="6">Enable graph optimization</h2>

Graph optimization by subgraph feature are available in master branch. You can build from source and then use below command to enable this *experimental* feature for better performance:
Graph optimization with the subgraph feature is available and enabled by default on the master branch. For MXNet release v1.5, you can manually enable it by:

```
export MXNET_SUBGRAPH_BACKEND=MKLDNN
@@ -271,7 +271,7 @@ The limitations of this experimental feature are:

- Use this feature only for inference. When training, be sure to turn the feature off by unsetting the `MXNET_SUBGRAPH_BACKEND` environment variable.

- This feature will only run on the CPU, even if you're using a GPU-enabled build of MXNet.
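
A minimal sketch of the inference/training switch described above, assuming the updated defaults from this commit (MKL-DNN fusion on by default on master, `NONE` to turn it off, as documented in the updated env_var.md):

```
# Inference on MXNet v1.5: enable the MKL-DNN subgraph backend explicitly.
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Training: with the new default, disable the pass explicitly rather than only unsetting the variable.
export MXNET_SUBGRAPH_BACKEND=NONE
```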


<h2 id="7">Quantization and Inference with INT8</h2>
29 changes: 6 additions & 23 deletions example/quantization/README.md
@@ -50,10 +50,7 @@ python imagenet_gen_qsym_mkldnn.py --model=resnet50_v1 --num-calib-batches=5 --c
The model would be automatically replaced in fusion and quantization format. It is then saved as the quantized symbol and parameter files in the `./model` directory. The following command is to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/resnet50_v1-symbol.json --param-file=./model/resnet50_v1-0000.params --rgb-mean=123.68,116.779,103.939 --rgb-std=58.393,57.12,57.375 --num-skipped-batches=50 --batch-size=64 --num-inference-batches=500 --dataset=./data/val_256_q90.rec --ctx=cpu
# Launch INT8 Inference
@@ -74,8 +71,6 @@ python imagenet_gen_qsym_mkldnn.py --model=squeezenet1.0 --num-calib-batches=5 -
The model would be automatically replaced in fusion and quantization format. It is then saved as the quantized symbol and parameter files in the `./model` directory. The following command is to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/squeezenet1.0-symbol.json --param-file=./model/squeezenet1.0-0000.params --rgb-mean=123.68,116.779,103.939 --rgb-std=58.393,57.12,57.375 --num-skipped-batches=50 --batch-size=64 --num-inference-batches=500 --dataset=./data/val_256_q90.rec --ctx=cpu
@@ -98,8 +93,6 @@ python imagenet_gen_qsym_mkldnn.py --model=mobilenet1.0 --num-calib-batches=5 --
The model would be automatically replaced in fusion and quantization format. It is then saved as the quantized symbol and parameter files in the `./model` directory. The following command is to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/mobilenet1.0-symbol.json --param-file=./model/mobilenet1.0-0000.params --rgb-mean=123.68,116.779,103.939 --rgb-std=58.393,57.12,57.375 --num-skipped-batches=50 --batch-size=64 --num-inference-batches=500 --dataset=./data/val_256_q90.rec --ctx=cpu
@@ -122,8 +115,6 @@ python imagenet_gen_qsym_mkldnn.py --model=mobilenetv2_1.0 --num-calib-batches=5
The model would be automatically replaced in fusion and quantization format. It is then saved as the quantized symbol and parameter files in the `./model` directory. The following command is to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/mobilenetv2_1.0-symbol.json --param-file=./model/mobilenetv2_1.0-0000.params --rgb-mean=123.68,116.779,103.939 --rgb-std=58.393,57.12,57.375 --num-skipped-batches=50 --batch-size=64 --num-inference-batches=500 --dataset=./data/val_256_q90.rec --ctx=cpu
@@ -146,8 +137,6 @@ python imagenet_gen_qsym_mkldnn.py --model=inceptionv3 --image-shape=3,299,299 -
The model would be automatically replaced in fusion and quantization format. It is then saved as the quantized symbol and parameter files in the `./model` directory. The following command is to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/inceptionv3-symbol.json --param-file=./model/inceptionv3-0000.params --image-shape=3,299,299 --rgb-mean=123.68,116.779,103.939 --rgb-std=58.393,57.12,57.375 --num-skipped-batches=50 --batch-size=64 --num-inference-batches=500 --dataset=./data/val_256_q90.rec --ctx=cpu
@@ -171,10 +160,8 @@ python imagenet_gen_qsym_mkldnn.py --model=imagenet1k-resnet-152 --num-calib-bat
The model would be automatically replaced in fusion and quantization format. It is then saved as the quantized symbol and parameter files in the `./model` directory. The following command is to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/imagenet1k-resnet-152-symbol.json --param-file=./model/imagenet1k-resnet-152-0000.params --num-skipped-batches=50 --batch-size=64 --num-inference-batches=500 --dataset=./data/val_256_q90.rec --ctx=cpu
# Launch INT8 Inference
@@ -196,10 +183,8 @@ python imagenet_gen_qsym_mkldnn.py --model=imagenet1k-inception-bn --num-calib-b
The model would be automatically replaced in fusion and quantization format. It is then saved as the quantized symbol and parameter files in the `./model` directory. The following command is to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/imagenet1k-inception-bn-symbol.json --param-file=./model/imagenet1k-inception-bn-0000.params --rgb-mean=123.68,116.779,103.939 --num-skipped-batches=50 --batch-size=64 --num-inference-batches=500 --dataset=./data/val_256_q90.rec --ctx=cpu
# Launch INT8 Inference
@@ -240,10 +225,8 @@ Some tips on quantization configs:
2. Then, you should run the following command and verify that your fp32 symbolic model runs inference as expected.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference
python imagenet_inference.py --symbol-file=./model/custom-symbol.json --param-file=./model/custom-0000.params --rgb-mean=* --rgb-std=* --num-skipped-batches=* --batch-size=* --num-inference-batches=* --dataset=./data/* --ctx=cpu
```

@@ -260,7 +243,7 @@ python imagenet_gen_qsym_mkldnn.py --model=custom --num-calib-batches=5 --calib-
6. Finally, you can run INT8 inference:

```
# Launch INT8 Inference
python imagenet_inference.py --symbol-file=./model/*.json --param-file=./model/*.params --rgb-mean=* --rgb-std=* --num-skipped-batches=* --batch-size=* --num-inference-batches=* --dataset=./data/* --ctx=cpu
# Launch dummy data Inference
@@ -289,6 +272,6 @@ the console to run model quantization for a specific configuration.
- `launch_inference.sh` This is a shell script that calculates the accuracies of all the quantized models generated
by invoking `launch_quantize.sh`.
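
For orientation, a possible end-to-end run of these helper scripts (assuming they are executed from the `example/quantization` directory with the prerequisites above in place; this is a sketch, not part of the commit):

```
# Quantize all configured models, then measure the accuracy of each generated model.
bash ./launch_quantize.sh
bash ./launch_inference.sh
```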

**NOTE**:
- This example has only been tested on Linux systems.
- Performance is expected to decrease with GPU, however the memory footprint of a quantized model is smaller. The purpose of the quantization implementation is to minimize accuracy loss when converting FP32 models to INT8. MXNet community is working on improving the performance.
2 changes: 0 additions & 2 deletions example/ssd/README.md
@@ -234,8 +234,6 @@ python quantization.py
After quantization, INT8 models will be saved in the `model/` directory. Use the following command to launch inference.

```
# USE MKLDNN AS SUBGRAPH BACKEND
export MXNET_SUBGRAPH_BACKEND=MKLDNN
# Launch FP32 Inference on VOC dataset
python evaluate.py --cpu --num-batch 10 --batch-size 224 --deploy --prefix=./model/ssd_
@@ -694,8 +694,8 @@ class OperatorSuite extends FunSuite with BeforeAndAfterAll
}

test("maximum") {
val data1 = Symbol.Variable("data")
val data2 = Symbol.Variable("data")
val data1 = Symbol.Variable("data1")
val data2 = Symbol.Variable("data2")
val shape = Shape(3, 4)
val dataTmp1 = Random.uniform(0, 100, shape)
val dataTmp2 = Random.uniform(0, 100, shape)
@@ -712,8 +712,8 @@ }
}

test("minimum") {
val data1 = Symbol.Variable("data")
val data2 = Symbol.Variable("data")
val data1 = Symbol.Variable("data1")
val data2 = Symbol.Variable("data2")
val shape = Shape(3, 4)
val dataTmp1 = Random.uniform(0, 100, shape)
val dataTmp2 = Random.uniform(0, 100, shape)
8 changes: 4 additions & 4 deletions src/c_api/c_api_symbolic.cc
@@ -1035,15 +1035,15 @@ int MXSetCalibTableToQuantizedSymbol(SymbolHandle qsym_handle,
API_END_HANDLE_ERROR(delete s);
}

int MXGenBackendSubgraph(SymbolHandle sym_handle, const char *backend,
int MXGenBackendSubgraph(SymbolHandle sym_handle, const char *backend_name,
SymbolHandle *ret_sym_handle) {
nnvm::Symbol *s = new nnvm::Symbol();
API_BEGIN();
nnvm::Symbol *sym = static_cast<nnvm::Symbol *>(sym_handle);
*s = sym->Copy();
std::vector<mxnet::op::SubgraphPropertyPtr> properties =
mxnet::op::SubgraphPropertyRegistry::Get()->CreateSubgraphProperty(backend);
for (auto property : properties) {
auto backend = mxnet::op::SubgraphBackendRegistry::Get()->GetSubgraphBackend(backend_name);
const auto& subgraph_prop_list = backend->GetSubgraphProperties();
for (auto property : subgraph_prop_list) {
nnvm::Graph g = Symbol2Graph(*s);
property->SetAttr("graph", g);
g.attrs["subgraph_property"] = std::make_shared<nnvm::any>(std::move(property));
8 changes: 5 additions & 3 deletions src/c_api/c_api_test.cc
@@ -41,9 +41,11 @@ int MXBuildSubgraphByOpNames(SymbolHandle sym_handle,
nnvm::Symbol* sym = static_cast<nnvm::Symbol*>(sym_handle);
*s = sym->Copy();
if (!op_name_set.empty()) {
std::vector<mxnet::op::SubgraphPropertyPtr> properties =
mxnet::op::SubgraphPropertyRegistry::Get()->CreateSubgraphProperty(prop_name);
for (auto property : properties) {
auto& backend =
mxnet::op::SubgraphBackendRegistry::Get()->GetSubgraphBackend(prop_name);
LOG(INFO) << "Subgraph backend " << backend->GetName() << " is activated.";
const auto& subgraph_prop_list = backend->GetSubgraphProperties();
for (auto property : subgraph_prop_list) {
nnvm::Graph g;
g.outputs = s->outputs;
property->SetAttr("graph", g);
