Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ranking merge #3906

Merged
merged 27 commits into from
Apr 11, 2022
Merged

Ranking merge #3906

merged 27 commits into from
Apr 11, 2022

Conversation

yzhang123
Copy link
Contributor

@yzhang123 yzhang123 commented Mar 30, 2022

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)
  • ci tests are passing
  • SH tests are passing
  • with LM matches baseline numbers

@yzhang123 yzhang123 requested a review from ekmb March 30, 2022 00:38
Signed-off-by: Yang Zhang <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 30, 2022

This pull request introduces 5 alerts when merging 8423be4 into ff91628 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com
Copy link

lgtm-com bot commented Mar 30, 2022

This pull request introduces 5 alerts when merging ab5cab3 into ff91628 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com
Copy link

lgtm-com bot commented Mar 30, 2022

This pull request introduces 5 alerts when merging 4f778e8 into f68c924 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com
Copy link

lgtm-com bot commented Mar 30, 2022

This pull request introduces 5 alerts when merging 8d6c449 into 60f4c6c - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com
Copy link

lgtm-com bot commented Mar 31, 2022

This pull request introduces 4 alerts when merging be732af into ca8a7e0 - view on LGTM.com

new alerts:

  • 4 for Unused import



class RangeFst(GraphFst):
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dosctings

@lgtm-com
Copy link

lgtm-com bot commented Mar 31, 2022

This pull request introduces 1 alert when merging 7b19452 into b1b6e5e - view on LGTM.com

new alerts:

  • 1 for Unused local variable

Signed-off-by: Yang Zhang <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 31, 2022

This pull request introduces 1 alert when merging be8192a into 5c88c8d - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@lgtm-com
Copy link

lgtm-com bot commented Apr 4, 2022

This pull request introduces 1 alert when merging d3d8437 into 087de54 - view on LGTM.com

new alerts:

  • 1 for Unused import

@yzhang123 yzhang123 marked this pull request as ready for review April 5, 2022 22:37
@@ -129,32 +132,80 @@ def normalize(self, text: str, n_tagged: int, punct_post_process: bool = True, v
len(text.split()) < 500
), "Your input is too long. Please split up the input into sentences, or strings with fewer than 500 words"
original_text = text

if self.lang == "en":
if self.lang in ["en", "de"]:
Copy link
Contributor Author

@yzhang123 yzhang123 Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ekmb only needed for english or can this be applied to all langauges?

@@ -0,0 +1,20 @@
`female.tsv` - List of common female names. Copyright (c) January 1991 by Mark Kantrowitz, 4987 names, Version 1.3 (29-MAR-94)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this file type be added to pip package?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's needed to build the roman graph

roman_dict = load_labels(get_abs_path("data/roman/roman_to_spoken.tsv"))
default_graph = pynini.string_map(roman_dict).optimize()
default_graph = pynutil.insert("integer: \"") + default_graph + pynutil.insert("\"")
graph_teens = pynini.string_map([x[0] for x in roman_dict[:19]]).optimize()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ekmb shall we relax this too for audio based without lm?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any confusing cases like abbreviations?

Signed-off-by: Yang Zhang <[email protected]>
@@ -0,0 +1,20 @@
`female.tsv` - List of common female names. Copyright (c) January 1991 by Mark Kantrowitz, 4987 names, Version 1.3 (29-MAR-94)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's needed to build the roman graph

roman_dict = load_labels(get_abs_path("data/roman/roman_to_spoken.tsv"))
default_graph = pynini.string_map(roman_dict).optimize()
default_graph = pynutil.insert("integer: \"") + default_graph + pynutil.insert("\"")
graph_teens = pynini.string_map([x[0] for x in roman_dict[:19]]).optimize()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any confusing cases like abbreviations?

@lgtm-com
Copy link

lgtm-com bot commented Apr 8, 2022

This pull request introduces 2 alerts when merging 668eafc into afc7b71 - view on LGTM.com

new alerts:

  • 2 for Unused import

ekmb
ekmb previously approved these changes Apr 8, 2022
@ekmb ekmb marked this pull request as draft April 8, 2022 03:20
@lgtm-com
Copy link

lgtm-com bot commented Apr 8, 2022

This pull request introduces 2 alerts when merging 715f95c into c7a5a33 - view on LGTM.com

new alerts:

  • 2 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Apr 9, 2022

This pull request introduces 2 alerts when merging 89e5dff into c7a5a33 - view on LGTM.com

new alerts:

  • 2 for Unused import

@ekmb ekmb marked this pull request as ready for review April 11, 2022 19:18
@lgtm-com
Copy link

lgtm-com bot commented Apr 11, 2022

This pull request introduces 2 alerts when merging 1702604 into 4aba4b2 - view on LGTM.com

new alerts:

  • 2 for Unused import

@ekmb ekmb merged commit 1d9ee1c into main Apr 11, 2022
@ekmb ekmb deleted the ranking_merge branch April 11, 2022 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants