Use scaled_dot_product_attention in Wav2vec2/HuBERT's SelfAttention #3253
Conversation
Here are the benchmark results with the new changes. The benchmark script can be found at https://gist.github.com/nateanl/97b2f9adb39c05a4e854fbd924de01f6.
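For context, here is a minimal timing sketch of the kind of comparison such a benchmark runs. This is not the linked gist; the model factory, batch size, audio length, and iteration count are arbitrary choices made for illustration.

```python
# Timing sketch (not the gist from the comment above): time repeated forward
# passes of torchaudio's wav2vec2_base on random audio. Run it once on a build
# with the manual attention and once with the SDPA-based attention to compare.
import time
import torch
import torchaudio

model = torchaudio.models.wav2vec2_base().eval()
waveform = torch.randn(4, 16000 * 2)  # 4 utterances of ~2 s at 16 kHz (assumed shapes)

with torch.inference_mode():
    for _ in range(3):          # warm-up iterations
        model(waveform)
    start = time.perf_counter()
    for _ in range(20):         # timed iterations
        model(waveform)
    elapsed = time.perf_counter() - start

print(f"avg forward time: {elapsed / 20 * 1000:.1f} ms")
```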
@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Here is the script which shows the …
Does this work with quantization? One of the reasons I did not use PyTorch's native MHA here is that it does not support quantization.
I think so. There is a quantization unit test and it passes.
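As a rough illustration of the kind of check such a test performs (this is not the actual torchaudio unit test, and the model factory and input shapes are assumptions): dynamically quantize the Linear layers of a wav2vec2 model and confirm the forward pass still runs with the SDPA-based attention.

```python
# Quantization smoke-test sketch (hypothetical, not torchaudio's test suite):
# dynamic quantization only replaces nn.Linear modules, so the float attention
# math, whether manual or via scaled_dot_product_attention, should be unaffected.
import torch
import torchaudio

model = torchaudio.models.wav2vec2_base().eval()
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

waveform = torch.randn(1, 16000)
with torch.inference_mode():
    features, _ = quantized(waveform)
print(features.shape)
```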
Hey @nateanl.
…ytorch#3253) Summary: Replace the attention computation with `torch.nn.functional.scaled_dot_product_attention` to improve running efficiency. Pull Request resolved: pytorch#3253 Reviewed By: mthrok Differential Revision: D44800353 Pulled By: nateanl fbshipit-source-id: 41550d868c809099aadbe812b0ebe2c38121efb8
…3253) (#3261) Summary: Replace the attention computation with `torch.nn.functional.scaled_dot_product_attention` to improve running efficiency. Pull Request resolved: #3253 Reviewed By: mthrok Differential Revision: D44800353 Pulled By: nateanl fbshipit-source-id: 41550d868c809099aadbe812b0ebe2c38121efb8
Replace the attention computation with `torch.nn.functional.scaled_dot_product_attention` to improve running efficiency.
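A hedged sketch of the idea behind the change (simplified, not the exact torchaudio `SelfAttention` code): the explicit softmax(QK^T / sqrt(d))V computation is replaced by a single call to `torch.nn.functional.scaled_dot_product_attention`, which can dispatch to fused kernels (e.g. FlashAttention or memory-efficient attention) when available.

```python
# Minimal before/after comparison of the two attention formulations.
# Tensor shapes follow the usual (batch, heads, time, head_dim) convention.
import math
import torch
import torch.nn.functional as F

def manual_attention(q, k, v, attn_mask=None):
    # original style: explicit matmul + softmax + matmul
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if attn_mask is not None:
        scores = scores + attn_mask
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

def sdpa_attention(q, k, v, attn_mask=None):
    # new style: one fused call
    return F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)

q = k = v = torch.randn(2, 12, 100, 64)
torch.testing.assert_close(
    manual_attention(q, k, v), sdpa_attention(q, k, v), rtol=1e-4, atol=1e-4
)
```

Beyond the speed gain from fused kernels, the fused call also avoids materializing the full (time x time) attention-weight matrix in eager mode, which reduces peak memory for long sequences.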