Optimize Torchaudio Vad #3382

Closed


KubaRad2
Contributor

Summary:
The voice activity detector (VAD) function was unoptimized, confusingly written, and buggy.

With the optimizations in this PR, the function runs roughly 17x faster.
The main optimization was to loop over windows of audio rather than over individual audio samples. Reducing the number of copies also helped.
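
To illustrate the idea of looping over windows instead of samples, here is a minimal pure-Python sketch. It is not the torchaudio implementation: the function names, the mean-square "energy" measure, and the `win`/`hop` parameters are all assumptions made for illustration. Both functions return the same values, but the second one runs one loop iteration per window rather than one per sample.

```python
def energies_per_sample(signal, win, hop):
    """Hypothetical baseline: visit every sample index and measure only
    when a full window ends on a hop boundary."""
    out = []
    for pos in range(len(signal)):
        end = pos + 1
        # Most iterations do nothing, but still pay the loop overhead.
        if end >= win and (end - win) % hop == 0:
            frame = signal[end - win:end]
            out.append(sum(x * x for x in frame) / win)
    return out


def energies_per_window(signal, win, hop):
    """Hypothetical optimized form: step directly from window to window."""
    out = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win]
        out.append(sum(x * x for x in frame) / win)
    return out


# Small deterministic test signal (illustrative values only).
signal = [((i * 7) % 13 - 6) / 6.0 for i in range(100)]
slow = energies_per_sample(signal, win=16, hop=4)
fast = energies_per_window(signal, win=16, hop=4)
assert slow == fast
```

In a tensor library, the per-window body would typically be a single vectorized operation on a slice, which is where avoiding extra copies also matters.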

There was also an off-by-one error: the array slice referenced was [1:16001] (for the default settings) instead of [0:16000].
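
The effect of the two slices can be shown with a small sketch. The helper names below are hypothetical, and the summary does not say what the 16000-sample slice represents, so this only demonstrates the indexing difference: both slices have the same length, but the shifted one drops the first sample and includes one sample past the intended end.

```python
def leading_slice(samples, length):
    """Intended slice: the first `length` samples, indices 0 .. length - 1."""
    return samples[0:length]


def leading_slice_off_by_one(samples, length):
    """Shifted slice as described in the summary: indices 1 .. length."""
    return samples[1:length + 1]


# Use the sample index as the sample value so the shift is easy to see.
samples = list(range(16005))
good = leading_slice(samples, 16000)
bad = leading_slice_off_by_one(samples, 16000)

assert len(good) == len(bad) == 16000
assert (good[0], good[-1]) == (0, 15999)
assert (bad[0], bad[-1]) == (1, 16000)
```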

Differential Revision: D44749359

@pytorch-bot

pytorch-bot bot commented May 26, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/audio/3382

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 2e2b10f:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D44749359

KubaRad2 added a commit to KubaRad2/audio that referenced this pull request May 30, 2023
Summary:
Pull Request resolved: pytorch#3382

fbshipit-source-id: 8b118af7ce854b9332ff0cf12b9a959a8c425199
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from cbbe0a1 to b7170b6 Compare May 30, 2023 09:21
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request May 30, 2023
fbshipit-source-id: 025afa644109714cbe8d7bfad467ee2bf2ea18bd
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from b7170b6 to 159c7df Compare May 30, 2023 09:31
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request May 30, 2023
fbshipit-source-id: eb3c3ea5abb42040b9021bc22e48c916a1720d4b
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from 159c7df to 95dd40c Compare May 30, 2023 09:40
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from 95dd40c to a88aff5 Compare June 7, 2023 15:30
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request Jun 7, 2023
fbshipit-source-id: 957340ba3d67b43d17e2605e4f24cee7b36066c3
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request Jun 7, 2023
fbshipit-source-id: 44c5e69d0b56d25fb012879fc09c4305d4720be1
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from a88aff5 to a33d297 Compare June 7, 2023 15:38
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request Jun 7, 2023
fbshipit-source-id: a9d2e3f0fe679ff4185b87c470a7dc172379fc0e
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from a33d297 to 8cb8acc Compare June 7, 2023 15:46
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request Jun 7, 2023
fbshipit-source-id: 6b9b4839a843d506ba400455b2112db9893f4e8f
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from b9b7b08 to 048aab4 Compare June 7, 2023 18:14
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request Jun 7, 2023
fbshipit-source-id: b157a060521421b57e0066d9847898908a0df467
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from 8cb8acc to b9b7b08 Compare June 7, 2023 18:15
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request Jun 7, 2023
fbshipit-source-id: 952be1ab64419bc667d55c2649c271ad2f9abd9b
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from 048aab4 to 51ce3b6 Compare June 7, 2023 18:21
KubaRad2 added a commit to KubaRad2/audio that referenced this pull request Jun 7, 2023
fbshipit-source-id: 2925bb3eecbcd0f6be8d524fdf16e51fca096d86
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from 51ce3b6 to 673405b Compare June 7, 2023 18:26
fbshipit-source-id: 21f6de116dd74807e899763af4571f85bded5095
@KubaRad2 KubaRad2 force-pushed the export-D44749359 branch from 673405b to 2e2b10f Compare June 7, 2023 19:31
@facebook-github-bot
Contributor

This pull request has been merged in 1e117f5.

@github-actions

github-actions bot commented Jun 8, 2023

Hey @None.
You merged this PR, but labels were not properly added. Please add a primary and secondary label (See https://github.com/pytorch/audio/blob/main/.github/process_commit.py).


Some guidance:

Use 'module: ops' for operations under 'torchaudio/{transforms, functional}', and ML-related components under 'torchaudio/csrc' (e.g. RNN-T loss).

Things in "examples" directory:

  • 'recipe' is applicable to training recipes under the 'examples' folder,
  • 'tutorial' is applicable to tutorials under the “examples/tutorials” folder
  • 'example' is applicable to everything else (e.g. C++ examples)
  • 'module: docs' is applicable to code documentation (not to tutorials).

For examples in code documentation, please also use 'module: docs'.

Please use the 'other' tag only when you’re sure the changes are not very relevant to users, or when no other tag is applicable. Try not to use it often, to minimize the effort required when we prepare release notes.


When preparing release notes, please make sure 'documentation' and 'tutorials' occur as the last sub-categories under each primary category like 'new feature', 'improvements' or 'prototype'.

Things related to build are by default excluded from the release note, except when it impacts users. For example:
* Drop support of Python 3.7.
* Add support of Python 3.X.
* Change the way a third party library is bound (so that user needs to install it separately).
