Fix Silence Sampling Algorithm for ASR Multi-speaker Data Simulator #5897
Merged
stevehuang52
merged 75 commits into
NVIDIA:main
from
stevehuang52:fix_simulator_silence
Feb 12, 2023
The notebook is tested and works with no problem. The new script was missing a license template, so I added one. I will approve as soon as it passes the tests.
tango4j
approved these changes
Feb 10, 2023
Notebooks and data simulation code both tested. Very nice work by stevehuang, thanks.
titu1994
pushed a commit
to titu1994/NeMo
that referenced
this pull request
Mar 24, 2023
…ulator (NVIDIA#5897)
* fix silence insertioon Signed-off-by: stevehuang52 <[email protected]>
* update docs and tutorial Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* change to beta annd gamma distributions Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* fix typo Signed-off-by: stevehuang52 <[email protected]>
* Added silence vs overlap selector with overlap algo Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Function name change and fixes Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update silence and overlap adding algorithm for better accuracy Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Recommended range for overlap mean Signed-off-by: Taejin Park <[email protected]>
* Changing yaml file default values Signed-off-by: Taejin Park <[email protected]>
* Fixed typos and errors in docstrings Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fixed minor bugs and removed unused functions Signed-off-by: Taejin Park <[email protected]>
* Fixed minor bugs and removed unused imports Signed-off-by: Taejin Park <[email protected]>
* Added docstrings for newly updated overlap algos Signed-off-by: Taejin Park <[email protected]>
* Fixed non_silence_len_samples calculation, more accurate now Signed-off-by: Taejin Park <[email protected]>
* adding missing docstring for non_silence_len Signed-off-by: Taejin Park <[email protected]>
* removed ipdb lines Signed-off-by: Taejin Park <[email protected]>
* refactor and update Signed-off-by: stevehuang52 <[email protected]>
* updated logs for v1.1 Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Argument check update for mean=0 var=0 case Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix typo Signed-off-by: stevehuang52 <[email protected]>
* update silence/overlap mean clipping Signed-off-by: stevehuang52 <[email protected]>
* Adding mean clipping Signed-off-by: Taejin Park <[email protected]>
* added 0 handling for ovl/sim_mean Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Tested on fisher and fixed the bug with string-speaker ID Signed-off-by: Taejin Park <[email protected]>
* update code for visualization Signed-off-by: stevehuang52 <[email protected]>
* refactor Signed-off-by: stevehuang52 <[email protected]>
* fix load_rttm Signed-off-by: stevehuang52 <[email protected]>
* Adding docstrings Signed-off-by: Taejin Park <[email protected]>
* Adding usage in the analysis script Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix filename Signed-off-by: stevehuang52 <[email protected]>
* Added argument check for sentence length params Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Removed unnecessary NB torch sampling Signed-off-by: Taejin Park <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add build_synthetic_vad_manifest.py Signed-off-by: stevehuang52 <[email protected]>
* add check for non rttm files Signed-off-by: stevehuang52 <[email protected]>
* added docstrings Signed-off-by: Taejin Park <[email protected]>
* typo is fixed Signed-off-by: Taejin Park <[email protected]>
* License template was missing, added Signed-off-by: Taejin Park <[email protected]>
* add missing copyright and move script Signed-off-by: stevehuang52 <[email protected]>
* add missing comma Signed-off-by: stevehuang52 <[email protected]>
---------
Signed-off-by: stevehuang52 <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: stevehuang52 [email protected]
What does this PR do ?
Replace the previous silence insertion method with a new one that guarantees a close approximation to the specified mean_silence.
Collection: [ASR]
Adding Silence in ASR Data Simulator
Requirements:
Parameters:
NUM_SESSIONS: number of sessions
MAX_SESS_DUR: maximum session duration
SAMPLING_RATE: sampling rate for audio
[NB_COUNT, NB_PROB]: parameters for the per-sentence duration distribution
SILENCE_RATIO_MEAN: mean of the target silence ratio across all sessions, in (0,1)
SILENCE_RATIO_VAR: std of the target silence ratio across all sessions; set a small value (e.g., 0.1) for a closer approximation to the mean, or a larger value (e.g., 2.0) for more diversity in silence
PER_SILENCE_VAR: std of individual silence length; defaults to 20 for achieving p-value=0.1 to de-correlate speech and silence lengths
[PER_SILENCE_MIN, PER_SILENCE_MAX]: min and max of per-silence duration in seconds; max=-1 for no constraint
Algorithm:
Notes
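As a rough illustration of how a per-session target silence ratio and individual silence lengths could be drawn from beta and gamma distributions (the commit log mentions a change to these distributions), here is a minimal standard-library sketch. It is not the NeMo implementation. The function names, the method-of-moments parametrization (which takes a true variance smaller than mean * (1 - mean), unlike SILENCE_RATIO_VAR above), and the gamma-weighted budget split are all assumptions made for illustration.

```python
import random


def beta_params(mean, var):
    """Method-of-moments Beta(a, b) parameters for a given mean and variance.

    Hypothetical helper: requires 0 < mean < 1 and 0 < var < mean * (1 - mean).
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie in (0, 1)")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("var must lie in (0, mean * (1 - mean))")
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def sample_session_silence_ratios(num_sessions, ratio_mean, ratio_var, seed=0):
    """Draw one target silence ratio per session from a beta distribution."""
    rng = random.Random(seed)
    a, b = beta_params(ratio_mean, ratio_var)
    return [rng.betavariate(a, b) for _ in range(num_sessions)]


def split_silence_budget(total_silence_sec, num_gaps, shape=2.0, seed=0):
    """Split a session's silence budget into num_gaps gaps.

    Gap lengths are gamma-distributed weights rescaled so they sum to the
    budget exactly (an assumption of this sketch, not taken from the PR).
    """
    rng = random.Random(seed)
    weights = [rng.gammavariate(shape, 1.0) for _ in range(num_gaps)]
    total = sum(weights)
    return [total_silence_sec * w / total for w in weights]


if __name__ == "__main__":
    max_sess_dur = 600.0  # seconds, illustrative value only
    ratios = sample_session_silence_ratios(5, ratio_mean=0.2, ratio_var=0.01)
    for ratio in ratios:
        gaps = split_silence_budget(ratio * max_sess_dur, num_gaps=20)
        print(f"target ratio {ratio:.3f}, total silence {sum(gaps):.1f}s")
```

Rescaling the gamma draws makes each session hit its own target ratio exactly, while the beta draw controls how much the targets vary around the mean from session to session.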