t5-rpe-fix targeting r1.10.0; raise exception for PP>2. #4469
Signed-off-by: Hoo Chang Shin <[email protected]>
if type(hidden_states) is tuple:
    if len(hidden_states) == 2:
        hidden_states, position_bias = hidden_states
    elif len(hidden_states) == 3:
        hidden_states, position_bias, encoder_decoder_position_bias = hidden_states
    else:
        raise IndexError('Hidden_states needs to be tuple containing 2 or 3 elements.')
Can you delete this?
I think he moved it further down
if type(x_) is tuple:
    if len(x_) == 2:
        x_, position_bias = x_
    elif len(x_) == 3:
        x_, position_bias, encoder_decoder_position_bias = x_
    else:
        raise IndexError('Hidden_states (x_) needs to be tuple containing 2 or 3 elements.')
LGTM!
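For readers following the thread, here is a minimal standalone sketch of the unpacking logic discussed above. The function name `unpack_hidden_states` and the `None` defaults are assumptions made for illustration; in the PR this logic sits inline in the layer code rather than in a helper.

```python
from typing import Any, Optional, Tuple


def unpack_hidden_states(x_: Any) -> Tuple[Any, Optional[Any], Optional[Any]]:
    """Split a stage input into (hidden_states, position_bias, encoder_decoder_position_bias).

    Hypothetical helper: a bare tensor passes through unchanged, a 2-tuple also
    carries the self-attention position bias, and a 3-tuple additionally carries
    the encoder-decoder position bias.
    """
    position_bias = None
    encoder_decoder_position_bias = None
    if type(x_) is tuple:
        if len(x_) == 2:
            x_, position_bias = x_
        elif len(x_) == 3:
            x_, position_bias, encoder_decoder_position_bias = x_
        else:
            raise IndexError('Hidden_states (x_) needs to be tuple containing 2 or 3 elements.')
    return x_, position_bias, encoder_decoder_position_bias
```

Returning all three values in every case keeps the caller free of length checks, at the cost of passing `None` for biases that were not supplied.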
@@ -498,7 +500,7 @@ def compute_bias(self, query_length, key_length):
         relative_position = memory_position - context_position  # shape (query_length, key_length)
         relative_position_bucket = self._relative_position_bucket(
             relative_position,  # shape (query_length, key_length)
-            bidirectional=(self.layer_type != LayerType.decoder),  # (not self.is_decoder),
+            bidirectional=(self.attention_type != AttnMaskType.causal),  # self.is_decoder and self_attention.
Can we change this comment? I think it should be encoder + self-attention?
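To make the effect of the `bidirectional` flag concrete, here is a simplified scalar sketch of T5-style relative position bucketing. It is not the `_relative_position_bucket` method from this PR; the function name and the defaults `num_buckets=32` and `max_distance=128` are assumptions for illustration.

```python
import math


def relative_position_bucket(
    relative_position: int,
    bidirectional: bool = True,
    num_buckets: int = 32,      # assumed default, not taken from the PR
    max_distance: int = 128,    # assumed default, not taken from the PR
) -> int:
    """Map one relative position (key index minus query index) to a bucket id."""
    bucket = 0
    if bidirectional:
        # Half of the buckets cover keys after the query, half cover keys before it.
        num_buckets //= 2
        if relative_position > 0:
            bucket += num_buckets
        distance = abs(relative_position)
    else:
        # Causal case: keys after the query all collapse to distance 0.
        distance = max(-relative_position, 0)

    # Small distances get one bucket each; larger ones share log-spaced buckets.
    max_exact = num_buckets // 2
    if distance < max_exact:
        return bucket + distance
    log_bucket = max_exact + int(
        math.log(distance / max_exact)
        / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    )
    return bucket + min(log_bucket, num_buckets - 1)
```

With `bidirectional=True`, offsets `+3` and `-3` land in different buckets, whereas with `bidirectional=False` every future key shares the bucket of offset 0. That is why the flag has to follow the attention mask type (causal or not) rather than the layer type.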
Signed-off-by: Hoo Chang Shin <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]>
* update branch Signed-off-by: ericharper <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b15) Co-authored-by: Travis Bartley <[email protected]> * Fix tutorial typos and docs (#4415) * Fix typos Signed-off-by: smajumdar <[email protected]> * Fix typos Signed-off-by: smajumdar <[email protected]> * Add ASR Scores to Docs (#4412) * Fix link Signed-off-by: smajumdar <[email protected]> * Correct model card Signed-off-by: smajumdar <[email protected]> * Add ASR Results to Docs Signed-off-by: smajumdar <[email protected]> * Update info Signed-off-by: smajumdar <[email protected]> * Update info Signed-off-by: smajumdar <[email protected]> * docs: add table overflow handling for nested sections (#4441) Co-authored-by: Nick Goncharenko <[email protected]> * Docs: Decrease Font Size on Tables (#4444) * docs: add table overflow handling for nested sections * docs: set table font-size to small Co-authored-by: Nick Goncharenko <[email protected]> * Updated notebook to fix batch configuration and precision bugs (#4447) * Updated notebook to fix batch configuration and precision bugs Signed-off-by: Virginia Adams <[email protected]> * Deleted cell outputs Signed-off-by: Virginia Adams <[email protected]> * Set datasets back to full dataset Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Eric Harper <[email protected]> * fix branch in link (#4454) Signed-off-by: ekmb <[email protected]> * [TTS] [bugfix] German FastPitch HiFi-GAN tutorial and lr (#4459) * [TN] Bug fix: expand serial coverage of unknown symbol, remove constraints from word graph (#4463) * remove constraints from word graph det Signed-off-by: ekmb 
<[email protected]> * add measure units to serial Signed-off-by: ekmb <[email protected]> * revert serial changes, update jenkins path Signed-off-by: ekmb <[email protected]> * fix test case Signed-off-by: ekmb <[email protected]> * update indentation (#4468) Signed-off-by: Akshit Arora <[email protected]> * t5-rpe-fix targeting r1.10.0; raise exception for PP>2. (#4469) Signed-off-by: Hoo Chang Shin <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> * Fix some 's' cases for IPA G2P (#4460) Signed-off-by: Jocelyn Huang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Refactor bias act fusion (#4376) * Refactor bias act fusion Signed-off-by: MaximumEntropy <[email protected]> * Update NMT config Signed-off-by: MaximumEntropy <[email protected]> * Update ci tests Signed-off-by: MaximumEntropy <[email protected]> * Empty Signed-off-by: MaximumEntropy <[email protected]> * Add kwargs to exact string match (#4479) Signed-off-by: MaximumEntropy <[email protected]> * Try fix (#4484) Signed-off-by: MaximumEntropy <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: Nick Goncharenko <[email protected]> Co-authored-by: Nick Goncharenko <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Akshit Arora <[email protected]> Co-authored-by: khcs <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> Co-authored-by: Jocelyn <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]>
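What does this PR do?
Fixes the RPE (relative position embedding) implementation, which converged more slowly and with consistently higher loss than both the NeMo APE (absolute position embedding) implementation and the Megatron-LM RPE implementation.
NEW: an exception is raised when the model is configured with pipeline-parallel size > 2 and RPE.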
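A minimal sketch of the kind of guard described above, assuming the check runs at model construction time. The function name, the parameter name `pipeline_model_parallel_size`, the exception type, and the message are all hypothetical; only the `'relative'` value and the PP > 2 condition come from this PR.

```python
def check_rpe_pipeline_config(
    position_embedding_type: str,
    pipeline_model_parallel_size: int,
) -> None:
    """Reject relative position embeddings with more than two pipeline stages.

    Hypothetical helper for illustration; the actual check in the PR may live
    elsewhere and raise a different exception type.
    """
    if position_embedding_type == 'relative' and pipeline_model_parallel_size > 2:
        raise ValueError(
            "position_embedding_type='relative' is not supported with "
            f"pipeline-parallel size > 2 (got {pipeline_model_parallel_size})."
        )
```

Failing fast at configuration time avoids starting a multi-stage run that would not pass the position biases correctly between stages.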
Collection: NLP
Changelog
Usage
Set model.position_embedding_type='relative' in the config.
Before your PR is "Ready for review"
Pre checks:
PR Type:
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs in various areas.
Additional Information