MKL-DNN LBR-GRU Inference Integration (FP32 LBR-GRU) #15741
Conversation
What's the reason to open a new PR instead of the previous one?
@pengzhao-intel I incorrectly used
Thanks for the explanation.
Are all comments in the original thread resolved?
LGTM and will merge tomorrow if there are no other comments.
Yes, all comments are resolved. @TaoLv Could you check this PR for the LBR-GRU integration again? Specifically, the type of the input params of and using a reference or pointer to access a
#pragma omp parallel for num_threads(omp_threads)
for (int i = 0; i < I * H; i++) {
for (int i = 0; i < input_size * hidden_size; i++) {
Better to move the expression input_size * hidden_size ahead of the for loop.
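A minimal sketch of that suggestion, assuming a simple element-wise loop over the weight buffer; the function and variable names here are hypothetical, not the PR's actual code:

```cpp
#include <omp.h>

// Hypothetical helper illustrating the suggestion: evaluate the loop bound
// once before the parallel loop instead of recomputing it in the condition.
void copy_weights(const float* src, float* dst,
                  const int input_size, const int hidden_size,
                  const int omp_threads) {
  const int iter_size = input_size * hidden_size;  // hoisted expression
  #pragma omp parallel for num_threads(omp_threads)
  for (int i = 0; i < iter_size; i++) {
    dst[i] = src[i];
  }
}
```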
const int single_cell_size = N * H;
const int single_b_size = ngates * H;
int w_size = (I + H) * H * ngates;
const int cell_size = batch_size * hidden_size;
Change all these sizes from int to size_t?
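A minimal sketch of what that could look like, assuming the same derived sizes as in the diff above; the helper and parameter names are hypothetical. Using size_t keeps large products such as (I + H) * H * ngates from overflowing a 32-bit int:

```cpp
#include <cstddef>

// Hypothetical sketch of the reviewer's suggestion: declare the derived
// sizes as size_t so large shapes cannot overflow a signed 32-bit int.
void derive_sizes(const size_t batch_size, const size_t input_size,
                  const size_t hidden_size, const size_t ngates,
                  size_t* cell_size, size_t* single_b_size, size_t* w_size) {
  *cell_size     = batch_size * hidden_size;
  *single_b_size = ngates * hidden_size;
  *w_size        = (input_size + hidden_size) * hidden_size * ngates;
}
```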
mkldnn_mems->hcx_memory[layer_index], mkldnn_mems->wx_memory[layer_index],
mkldnn_mems->wh_memory[layer_index], mkldnn_mems->bias_memory[layer_index],
mkldnn_mems->y_memory[layer_index],
mkldnn_mems->hcy_memory[layer_index], null_memory_);
nit: indent.
if (mode == rnn_enum::kGru) {
const int mx_single_b_sz = ngates * hidden_size;
for (int l = 0; l < num_layer; l++) {
#pragma omp parallel for num_threads(omp_threads)
We could use collapse(2) for these two for loops instead of parallelizing only the inner one. But note that the Microsoft Visual C++ compiler might not support collapse, so it could be guarded with a macro.
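A minimal sketch of that idea, assuming (as the reviewer notes) that MSVC's OpenMP implementation rejects the collapse clause; the function and buffer are hypothetical stand-ins for the layer/element loops in the diff:

```cpp
#include <vector>

// Hypothetical sketch: collapse the layer/element loops on compilers that
// support it, and fall back to parallelizing only the outer loop on MSVC.
void scale_bias(std::vector<float>* bias, const int num_layer,
                const int single_b_size, const int omp_threads,
                const float factor) {
#if defined(_MSC_VER)
  #pragma omp parallel for num_threads(omp_threads)
#else
  #pragma omp parallel for collapse(2) num_threads(omp_threads)
#endif
  for (int l = 0; l < num_layer; l++) {
    for (int i = 0; i < single_b_size; i++) {
      (*bias)[l * single_b_size + i] *= factor;
    }
  }
}
```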
Thanks for noting that.
if (mode == rnn_enum::kLstm) {
for (int l = 0; l < L; l++) {
for (int l = 0; l < num_layer; l++) {
offset1 = l * single_cell_size;
Can you also make offset1 and offset2 more readable?
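One possible reading of this suggestion, sketched with hypothetical names; single_w_size and the formula for the second offset are assumptions, not the PR's actual code:

```cpp
#include <cstddef>

// Hypothetical sketch only: give the per-layer offsets descriptive names
// instead of the generic offset1/offset2.
void compute_layer_offsets(const int num_layer, const size_t single_cell_size,
                           const size_t single_w_size) {
  for (int l = 0; l < num_layer; l++) {
    const size_t cell_state_offset = l * single_cell_size;  // was offset1
    const size_t weights_offset    = l * single_w_size;     // was offset2 (assumed)
    // ... index the state / weight buffers with these named offsets ...
    (void)cell_state_offset;
    (void)weights_offset;
  }
}
```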
@pengzhao-intel This PR might also contain flaky unit tests with GPU context. For instance:
test_operator_gpu.test_rnnrelu_sym: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-15741/4/pipeline
test_operator_gpu.test_rnnrelu_sym: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwindows-gpu/detail/PR-15621/2/pipeline
test_operator_gpu.test_rnntanh_bidirectional: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-15621/2/pipeline
test_operator_gpu.test_rnntanh_bidirectional: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-15621/8/pipeline
test_operator_gpu.test_rnnrelu_sym: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-15741/5/pipeline
Could you analyze why the flaky test happens? Is it a numerical difference or an algorithm-level difference?
I tried to reproduce the failures on our internal GPU platforms, but all worked well. And it should be noted that the source code was compiled with cuda-9.0 and cudnn-9.0-linux-x64-v7.1.2, which are older than the oldest versions tested by CI (cuda-10.x and cudnn-xxx-v7.6).
@ciyongch please take a review again :)
Thanks for the heads-up about the changes surrounding GPU code. Given the reported flakiness, I'd like to have tomorrow (Friday) to investigate.
@DickJC123 Thanks for your patience. And FYI, it seems that the possible flaky tests were exposed by the edited UTs for the RNN variants. I have tried to modify the code following the instructions from #14476 (review). Specifically,
And all the spaces above are allocated only once using
(cherry picked from commit 1cf63e1)
Cherry picked from commit 1cf63e1 according to #15847 (comment)
@TaoLv please take a review again; I plan to merge after CI passes.
If it still needs a lot of effort to pass CI, we can drop it and wait for our 1.0 upgrade.
@pengzhao-intel Sure. There is a lot of refactoring work on both the MKL-DNN RNN and the naive RNN. At present, the MKL-DNN related work is under review. Perhaps we can just drop this PR and start a new one from the current commit on master.
What does Linear-Before-Reset mean?
See the different definitions of c(t) in GRU and LBR-GRU: https://intel.github.io/mkl-dnn/dev_guide_rnn.html#Linear-before-reset-GRU
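For quick reference, a simplified sketch of the difference, following the MKL-DNN RNN guide's formulation (notation abridged; see the link above for the exact equations):

```latex
% Vanilla GRU candidate state: the reset gate is applied to h_{t-1}
% before the recurrent linear transformation.
c_t = \tanh\left( W_c x_t + U_c \left( r_t \odot h_{t-1} \right) + b_c \right)

% Linear-Before-Reset GRU candidate state: the linear part (and its extra
% bias b_{u'}) is computed first, then gated by r_t.
c_t = \tanh\left( W_c x_t + r_t \odot \left( U_c h_{t-1} + b_{u'} \right) + b_c \right)
```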
Closing this PR since we will migrate it with MKL-DNN 1.0.
Description
Reopen #15621 here. We integrated the MKL-DNN Linear-Before-Reset GRU into MXNet. Currently, it supports FP32 inference. Please take a review on this PR. @ciyongch @TaoLv @pengzhao-intel
Performance
We tested the performance of FusedRNN with mode='gru' using the same dimensions as in PR #14713, i.e. seq_length = 300, batch_size = 20, input_size = 800, hidden_size = 800. We also compared the performance of this PR with that of the previously integrated LSTM, vRNN tanh, and vRNN ReLU on the master branch. It seems that there is a distinct regression with mode='lstm'.