-
Notifications
You must be signed in to change notification settings - Fork 432
Conversation
Does it further improves the performance?
Get Outlook for iOS<https://aka.ms/o0ukef>
…________________________________
From: Tao Lv <[email protected]>
Sent: Tuesday, June 26, 2018 1:54:17 PM
To: dmlc/mshadow
Cc: Xingjian SHI; Mention
Subject: Re: [dmlc/mshadow] fix mkl batch gemm (#343)
@sxjscience<https://github.com/sxjscience>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#343 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AE8D7pBiaGFwTzKQScC2htQLdNL3HVS4ks5uAcyJgaJpZM4U3R5z>.
|
@sxjscience I think performance should be same and I have verified that from mxnet level. The main purpose of this PR is for code refine and reducing some memory usage. |
Yeah. I think it’s a nice improvement.
Get Outlook for iOS<https://aka.ms/o0ukef>
…________________________________
From: Tao Lv <[email protected]>
Sent: Tuesday, June 26, 2018 4:02:51 PM
To: dmlc/mshadow
Cc: Xingjian SHI; Mention
Subject: Re: [dmlc/mshadow] fix mkl batch gemm (#343)
@sxjscience<https://github.com/sxjscience> I think performance should be same and I have verified that from mxnet level. The main purpose of this PR is for code refine and reducing some memory usage.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#343 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AE8D7oYXthl5KKYXlQtGpTLwkARLd1CJks5uAeqrgaJpZM4U3R5z>.
|
This commit may cause perf regression and we are working on it. Please do not merge now. Thanks! |
Talked with @xinyu-intel offline. His performance regression was caused by other changes and not related to this PR. Code change here has passed MXNet operator unit test: https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L2576. I also used the following code to verify the performance change and got similar results before and after this PR:
Is it okay to merge? @sxjscience @piiswrong |
mshadow/dot_engine-inl.h
Outdated
1, p_group_sizeb.data()); | ||
cblas_sgemm_batch(CblasColMajor, p_transa, p_transb, | ||
p_m, p_n, p_k, p_alpha, pp_A.data(), p_lda, pp_B.data(), | ||
p_ldb, p_beta, pp_C.data(), p_ldc, 1, p_group_sizeb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply using &m, &n, &k for p_m, p_n, p_k?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make the code easier to understand. MKL_INT p_m[1] = {m};
means this batched GEMM only has one group and all GEMMs in this group share the same m
value. Maybe in the future, we can extend this to MKL_INT p_m[2] = {m1, m2};
. Then we have two groups in one batched GEMM and the first group has m1
while the second group has m2
. Using &m
in this API will hide this definition and make it a little confusing.
@sxjscience @piiswrong @eric-haibin-lin is this good to merge? |
@TaoLv could you rebase the code and run CI again? |
Is there reference PR in MXNet to test the mshadow change end2end? |
Since the same m/n/k is used for all single gemms, so we can put all these gemms into one group of mkl batch gemm.
@yajiedesign @piiswrong please review again.