[TOPI] Update softmax compute and CPU schedule #3680
Conversation
@kevinthesun @vinx13 can you please review and add any other reviewers you think are necessary? I am currently modifying log_softmax, and it seems worthwhile to create a new generic schedule for it, since the inputs to tvm.compute are now different for softmax and log_softmax. What do you think?
Thank you @soiferj, can you check the CI problem?
Yeah, I'm taking a look at the CI failure now. It seems to be an issue in the CUDA schedule. I will work on it. |
The CI issue is fixed. |
lgtm
@kevinthesun feel free to merge the PR given you are managing it
Thank you for contributing!
* Update Softmax compute and CPU schedule
* Add C++ compute
* Fix schedule
* Update CUDA and OpenGL schedules
* Fix log_softmax
* Fix hls and opengl schedules
* Fix CUDA schedule
Another suggestion - https://discuss.tvm.ai/t/softmax-sequence-of-relay-ops/5686
This change improves performance for softmax by simplifying the computation and writing a schedule that supports better parallelization.
Compute: Currently, `exp(input - max)` is computed twice: once in the `_compute_expsum` stage and once in the `_normalize` stage. This change adds an extra stage to compute this tensor once; it is then re-used in the `_compute_expsum` and `_normalize` stages.

Schedule: Currently, the schedule only parallelizes the `_normalize` stage of the computation. This change puts all stages of computation under a common root and parallelizes the outer dimensions (sketched below).

The following results are with a tensor of shape `(1,12,128,128)` and `axis=-1`. This simulates the softmax in BERT base. The CPU is an Intel Xeon E5-2650, and the Relay target string is `llvm -mcpu=core-avx2`.
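For illustration, here is a minimal sketch of the restructured compute, written against TVM's `te` tensor-expression API on a simplified 2-D input; the stage names and the 2-D shape are illustrative only, not the actual TOPI code. The key point is that `exp(x - max)` becomes its own stage and is consumed by both the sum and the normalization stages instead of being computed twice.

```python
import tvm
from tvm import te

def softmax_2d_sketch(x):
    """Simplified 2-D softmax over the last axis, with exp(x - max) as its own stage."""
    m, n = x.shape
    k1 = te.reduce_axis((0, n), name="k1")
    k2 = te.reduce_axis((0, n), name="k2")

    # Stage 1: row-wise max, for numerical stability.
    max_elem = te.compute((m,), lambda i: te.max(x[i, k1], axis=k1), name="max_elem")
    # Stage 2: exp(x - max), computed once and shared by the two consumers below.
    exp = te.compute((m, n), lambda i, j: te.exp(x[i, j] - max_elem[i]), name="exp")
    # Stage 3: row-wise sum over the shared exp tensor (previously exp was recomputed here).
    expsum = te.compute((m,), lambda i: te.sum(exp[i, k2], axis=k2), name="expsum")
    # Stage 4: normalize, re-using the same exp tensor (previously recomputed here as well).
    return te.compute((m, n), lambda i, j: exp[i, j] / expsum[i], name="softmax")
```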
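Similarly, a minimal sketch of the CPU schedule idea on the same simplified compute: every producer stage is computed at the outer axis of the final stage, so the whole softmax runs under one parallel loop instead of only the `_normalize` stage being parallelized. How the intermediate stages are looked up here is an assumption made for the sketch; the real TOPI schedule identifies them differently.

```python
# Build and schedule the sketch above (assumes softmax_2d_sketch from the previous block).
x = te.placeholder((128, 128), name="x")
out = softmax_2d_sketch(x)
s = te.create_schedule(out.op)

# Recover the intermediate stages from the dataflow graph.
exp, expsum = out.op.input_tensors      # out = exp[i, j] / expsum[i]
max_elem = exp.op.input_tensors[1]      # exp = te.exp(x[i, j] - max_elem[i])

# Parallelize the outer axis of the final stage and compute every earlier
# stage at that axis, so all stages share a common root loop.
outer = out.op.axis[0]
s[out].parallel(outer)
for t in (exp, expsum, max_elem):
    s[t].compute_at(s[out], outer)

# Inspect the lowered IR to confirm a single parallel outer loop.
print(tvm.lower(s, [x, out], simple_mode=True))
```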