This repository has been archived by the owner on Aug 11, 2020. It is now read-only.

fix mkl batch gemm #343

Merged: 7 commits merged into dmlc:master on Mar 6, 2019

Conversation

@TaoLv (Member) commented Jun 26, 2018

Since the same m/n/k is used for all the individual GEMMs, we can put them all into one group of MKL batch GEMM.
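
For illustration only (not the PR's exact code), here is a minimal sketch of what a single-group MKL batch GEMM looks like, assuming float data, column-major layout, and contiguous inputs; the function name batch_gemm_one_group is hypothetical:

#include <mkl.h>
#include <vector>

// Minimal sketch (not the exact PR code): run `batch` GEMMs of identical
// shape, C_i = A_i * B_i, as ONE group of a single cblas_sgemm_batch call.
// Column-major layout is assumed, so lda = m, ldb = k, ldc = m.
void batch_gemm_one_group(const float* A, const float* B, float* C,
                          MKL_INT m, MKL_INT n, MKL_INT k, MKL_INT batch) {
  CBLAS_TRANSPOSE p_transa[1] = {CblasNoTrans};
  CBLAS_TRANSPOSE p_transb[1] = {CblasNoTrans};
  MKL_INT p_m[1] = {m}, p_n[1] = {n}, p_k[1] = {k};
  MKL_INT p_lda[1] = {m}, p_ldb[1] = {k}, p_ldc[1] = {m};
  float p_alpha[1] = {1.0f}, p_beta[1] = {0.0f};
  MKL_INT p_group_sizeb[1] = {batch};  // all `batch` GEMMs fall in one group

  // One pointer per matrix; the shared m/n/k is described once per group.
  std::vector<const float*> pp_A(batch), pp_B(batch);
  std::vector<float*> pp_C(batch);
  for (MKL_INT i = 0; i < batch; ++i) {
    pp_A[i] = A + i * m * k;
    pp_B[i] = B + i * k * n;
    pp_C[i] = C + i * m * n;
  }

  cblas_sgemm_batch(CblasColMajor, p_transa, p_transb,
                    p_m, p_n, p_k, p_alpha, pp_A.data(), p_lda,
                    pp_B.data(), p_ldb, p_beta, pp_C.data(), p_ldc,
                    1, p_group_sizeb);  // group_count = 1
}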

@yajiedesign @piiswrong please review again.

@TaoLv (Member, Author) commented Jun 26, 2018

@sxjscience

@sxjscience (Member) commented Jun 26, 2018 via email

@TaoLv (Member, Author) commented Jun 26, 2018

@sxjscience I think the performance should be the same, and I have verified that at the MXNet level. The main purpose of this PR is to refine the code and reduce some memory usage.

@sxjscience (Member) commented Jun 26, 2018 via email

@xinyu-intel (Member) commented:

This commit may cause a performance regression and we are working on it. Please do not merge for now. Thanks!

@TaoLv (Member, Author) commented Jul 18, 2018

Talked with @xinyu-intel offline. His performance regression was caused by other changes and is not related to this PR. The code change here passes the MXNet operator unit test: https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L2576.

I also used the following code to verify the performance change and got similar results before and after this PR:

import mxnet as mx
import numpy as np
import time

x = mx.sym.Variable('x')
y = mx.sym.Variable('y')
input_x = np.random.rand(200, 256, 256)
input_y = np.random.rand(200, 256, 512)

sym = mx.symbol.batch_dot(x, y).bind(mx.cpu(), {'x': mx.nd.array(input_x), 'y': mx.nd.array(input_y)})

start = time.time()
for i in range(1010):
    if i == 10:
        # restart the timer after 10 warm-up iterations
        start = time.time()
    # asnumpy() blocks until the asynchronous computation finishes
    sym.forward(is_train=False)[0].asnumpy()

print((time.time() - start))

Is it okay to merge? @sxjscience @piiswrong

(Review comment on this call:)

cblas_sgemm_batch(CblasColMajor, p_transa, p_transb,
                  p_m, p_n, p_k, p_alpha, pp_A.data(), p_lda, pp_B.data(),
                  p_ldb, p_beta, pp_C.data(), p_ldc, 1, p_group_sizeb);
Member


Why not simply use &m, &n, &k for p_m, p_n, p_k?

Member Author


To make the code easier to understand. MKL_INT p_m[1] = {m}; means this batched GEMM has only one group, and all GEMMs in this group share the same m value. In the future we could extend this to MKL_INT p_m[2] = {m1, m2};, giving two groups in one batched GEMM, where the first group uses m1 and the second uses m2. Passing &m to this API would hide this structure and make it a little confusing.
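
For illustration, a hedged sketch of that hypothetical two-group call; batch_gemm_two_groups and its parameters are illustrative names, not code from this PR:

#include <mkl.h>
#include <vector>

// Hypothetical two-group extension (illustrative, not part of this PR):
// group 0 runs `batch1` GEMMs with dims m1/n1/k1, group 1 runs `batch2`
// GEMMs with dims m2/n2/k2, all dispatched in ONE cblas_sgemm_batch call.
// The pointer arrays hold batch1 + batch2 entries, group 0's first.
void batch_gemm_two_groups(const std::vector<const float*>& pp_A,
                           const std::vector<const float*>& pp_B,
                           const std::vector<float*>& pp_C,
                           MKL_INT m1, MKL_INT n1, MKL_INT k1, MKL_INT batch1,
                           MKL_INT m2, MKL_INT n2, MKL_INT k2, MKL_INT batch2) {
  CBLAS_TRANSPOSE p_transa[2] = {CblasNoTrans, CblasNoTrans};
  CBLAS_TRANSPOSE p_transb[2] = {CblasNoTrans, CblasNoTrans};
  MKL_INT p_m[2] = {m1, m2}, p_n[2] = {n1, n2}, p_k[2] = {k1, k2};
  MKL_INT p_lda[2] = {m1, m2}, p_ldb[2] = {k1, k2}, p_ldc[2] = {m1, m2};
  float p_alpha[2] = {1.0f, 1.0f}, p_beta[2] = {0.0f, 0.0f};
  MKL_INT p_group_sizeb[2] = {batch1, batch2};

  cblas_sgemm_batch(CblasColMajor, p_transa, p_transb,
                    p_m, p_n, p_k, p_alpha, pp_A.data(), p_lda,
                    pp_B.data(), p_ldb, p_beta, pp_C.data(), p_ldc,
                    2, p_group_sizeb);  // group_count = 2
}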

@TaoLv (Member, Author) commented Dec 12, 2018

@sxjscience @piiswrong @eric-haibin-lin is this good to merge?

@pengzhao-intel commented:

@TaoLv could you rebase the code and run CI again?

@eric-haibin-lin (Member) commented:

Is there a reference PR in MXNet that tests the mshadow change end to end?

@szha merged commit c9d2f01 into dmlc:master on Mar 6, 2019
@TaoLv mentioned this pull request on Mar 11, 2019