
[TOPI] Update softmax compute and CPU schedule #3680

Merged: 7 commits merged into apache:master on Aug 5, 2019

Conversation

@soiferj (Contributor) commented Jul 31, 2019

This change improves performance for softmax by simplifying the computation and writing a schedule that supports better parallelization.

Compute: Currently, exp(input - max) is computed twice: once in the _compute_expsum stage and once in the _normalize stage. This change adds an extra stage that computes this tensor once; it is then reused in both the _compute_expsum and _normalize stages.
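
As a rough illustration of the restructured computation (plain Python over one row, not the actual tvm.compute code; the stage comments borrow the TOPI stage names):

```python
import math

def softmax_last_axis(row):
    """Illustrative sketch of the restructured softmax for a single row.

    The real implementation expresses each step below as a tvm.compute
    stage over tensors; the point is that exp(x - max) is materialized
    once and then consumed by both later stages.
    """
    max_elem = max(row)                               # max-reduction stage
    exp_vals = [math.exp(x - max_elem) for x in row]  # new shared exp stage
    expsum = sum(exp_vals)                            # _compute_expsum reuses exp_vals
    return [e / expsum for e in exp_vals]             # _normalize reuses exp_vals
```

In the old formulation both the sum and the normalize step each recomputed `math.exp(x - max_elem)` themselves, doubling the exp work.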

Schedule: Currently, the schedule only parallelizes the _normalize stage of the computation. This change puts all stages of computation under a common root and parallelizes the outer dimensions.
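
In schedule pseudocode (hypothetical stage and variable names, not the PR's exact diff), the new structure is roughly:

```
# Sketch only: fuse and parallelize the outer axes of the final stage,
# then anchor every earlier stage at the fused loop so all stages
# share one parallel root instead of only _normalize being parallel.
fused = s[softmax].fuse(*s[softmax].op.axis[:-1])
s[softmax].parallel(fused)
for stage in (max_elem, exp, expsum):
    s[stage].compute_at(s[softmax], fused)
```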

The following results are with a tensor of shape (1, 12, 128, 128) and axis=-1, which matches the softmax in BERT base. The CPU is an Intel Xeon E5-2650, and the Relay target string is `llvm -mcpu=core-avx2`.

| TVM_NUM_THREADS | Latency in ms (master branch) | Latency in ms (new branch) |
| --- | --- | --- |
| 1 | 4.7 | 3.0 |
| 2 | 3.8 | 1.8 |
| 4 | 3.3 | 1.0 |
| 8 | 3.1 | 0.74 |
| 16 | 3.2 | 0.55 |
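
For reference, the speedups implied by the table (master latency divided by new-branch latency) can be computed directly:

```python
# Latencies (ms) from the table above, keyed by TVM_NUM_THREADS.
master = {1: 4.7, 2: 3.8, 4: 3.3, 8: 3.1, 16: 3.2}
new    = {1: 3.0, 2: 1.8, 4: 1.0, 8: 0.74, 16: 0.55}

# Speedup of the new branch over master at each thread count.
speedup = {t: round(master[t] / new[t], 2) for t in master}
print(speedup)  # {1: 1.57, 2: 2.11, 4: 3.3, 8: 4.19, 16: 5.82}
```

Note that master barely improves past 2 threads, while the new schedule keeps scaling up to 16 threads.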

@soiferj (Contributor, Author) commented Jul 31, 2019

@kevinthesun @vinx13 can you please review and add any other reviewers you think are necessary?

I am currently modifying log_softmax, and it seems worthwhile to create a new generic schedule for it, since the inputs to tvm.compute are now different for softmax and log_softmax. What do you think?
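
To illustrate why the tvm.compute inputs now differ (a plain-Python sketch, not the TOPI code): log_softmax's final stage consumes (x - max) and log(expsum), whereas softmax's _normalize now consumes the shared exp tensor.

```python
import math

def log_softmax_last_axis(row):
    # Illustrative only: the normalize step here needs (x - max) and
    # log(expsum), not the materialized exp tensor that softmax's
    # _normalize reuses, which is why one generic schedule no longer
    # fits both operators.
    max_elem = max(row)
    expsum = sum(math.exp(x - max_elem) for x in row)
    return [(x - max_elem) - math.log(expsum) for x in row]
```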

@tqchen (Member) commented Aug 1, 2019

Thank you, @soiferj. Can you check the CI problem?

@soiferj (Contributor, Author) commented Aug 1, 2019

Yeah, I'm taking a look at the CI failure now. It seems to be an issue in the CUDA schedule. I will work on it.

@soiferj (Contributor, Author) commented Aug 2, 2019

The CI issue is fixed.

@kevinthesun (Contributor) left a review comment

lgtm

@tqchen (Member) commented Aug 3, 2019

@kevinthesun, feel free to merge the PR, given that you are managing it.

@kevinthesun kevinthesun merged commit ee74d00 into apache:master Aug 5, 2019
@kevinthesun (Contributor) commented

Thank you for contributing!

@soiferj soiferj deleted the soiferj/softmaxupdate branch August 5, 2019 04:40
wweic pushed a commit to wweic/tvm that referenced this pull request Aug 9, 2019
* Update Softmax compute and CPU schedule

* Add C++ compute

* Fix schedule

* Update CUDA and OpenGL schedules

* Fix log_softmax

* Fix hls and opengl schedules

* Fix CUDA schedule
wweic pushed a commit to neo-ai/tvm that referenced this pull request Sep 6, 2019 (same commit list as above)
5 participants