Reorganization of cg
#396
Merged
Changes from 16 commits
Commits
18 commits
36af387 Adds subdirectories 'lib' and 'phones' to 'data/phones'. Moves '.phon… (ajmalanoski)
6f673f9 Updates 'data/phones/HOWTO.md' to reflect new locations of files (ajmalanoski)
528cb1e Fixes file path in 'data/phones/HOWTO.md' (ajmalanoski)
6f8ea4c Updates outdated components of 'data/phones/HOWTO.md' (ajmalanoski)
2d98838 Updates path to phones in 'tests/test_data/test_summary.py' (ajmalanoski)
afdb768 Updates changelog (ajmalanoski)
a6b079c Merge branch 'master' into reorganize (ajmalanoski)
ba0839a Renames 'data/cg' to 'data/covering_grammar' (ajmalanoski)
8f34b19 Renames covering grammar files to include script info and and transcr… (ajmalanoski)
e9ab803 Moves covering grammar/error analysis scripts to 'data/covering_gramm… (ajmalanoski)
353d8f8 Adds placeholder README for 'data/covering_grammar/' (ajmalanoski)
c33739c Adds script to make input files for error analysis (ajmalanoski)
abd10d0 Removes superfluous comments from 'data/covering_grammar/lib/make_tes… (ajmalanoski)
d52822a Updates changelog (ajmalanoski)
f7eefef Merge branch 'master' of https://github.com/kylebgorman/wikipron into… (ajmalanoski)
d0b26ed Makes (most of the) suggested edits to data/covering_grammar/lib/make… (ajmalanoski)
6c164bc Adds logging config to data/covering_grammar/lib/make_test_file.py. M… (ajmalanoski)
0176fe7 Minor style fix (ajmalanoski)
@@ -0,0 +1 @@
(TEMPORARY)
File renamed without changes.
@@ -0,0 +1,42 @@
#!/usr/bin/env python
"""Makes test file.

Takes the gold data and the model output, and creates a three-column TSV where
each line has a word, its gold pronunciation, and the predicted pronunciation.
Assumes that the input files have the same words in the same order.
"""

import argparse
import logging


def main(args: argparse.Namespace) -> None:
    with open(args.gold, "r") as gf, open(args.pred, "r") as pf:
        with open(args.out, "w") as wf:
            for lineno, (g_line, p_line) in enumerate(zip(gf, pf), 1):
                g_word, g_pron = g_line.split("\t")
                p_word, p_pron = p_line.split("\t")
                # Make sure that gold data and predictions have the
                # same words.
                if not g_word == p_word:
                    logging.warning(
                        "%s != %s (line %d)", g_word, p_word, lineno
                    )
                    continue
                # Note that we use `strip` to remove the newline.
                g_pron = g_pron.strip()
                p_pron = p_pron.strip()
                line = f"{g_word}\t{g_pron}\t{p_pron}"
                print(line, file=wf)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "gold", help="TSV with words and correct pronunciations"
    )
    parser.add_argument(
        "pred", help="TSV with words and predicted pronunciations"
    )
    parser.add_argument("out", help="file to write to")
    main(parser.parse_args())
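The docstring's assumption that both inputs have the same words in the same order is only partly checked by the loop: `zip` stops at the shorter input, so trailing lines in a longer file would be dropped without any warning. A minimal sketch of one way to surface that, using `itertools.zip_longest`; the helper name `paired_lines` is hypothetical and not part of this PR.

```python
import itertools
import logging
from typing import Iterable, Iterator, Tuple


def paired_lines(
    gold: Iterable[str], pred: Iterable[str]
) -> Iterator[Tuple[int, str, str]]:
    """Yields (lineno, gold_line, pred_line) and warns on a length mismatch.

    Hypothetical helper for illustration; not part of the PR.
    """
    # zip_longest pads the shorter input with None instead of stopping early.
    pairs = itertools.zip_longest(gold, pred)
    for lineno, (g_line, p_line) in enumerate(pairs, 1):
        if g_line is None or p_line is None:
            # One input ran out before the other.
            logging.warning("inputs differ in length (line %d)", lineno)
            return
        yield lineno, g_line, p_line
```

Under this sketch, the loop in `main` could iterate over `paired_lines(gf, pf)` in place of `enumerate(zip(gf, pf), 1)`; whether a length mismatch should warn or abort is a design choice left open here.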
File renamed without changes.
I don't know if this is a plus here or not, but when you're doing a lot of file-opening, contextlib's ExitStack helps a lot: https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack
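For reference, a minimal sketch of what the suggestion might look like for the three files opened in this script. The function name `merge_files` is hypothetical, and the mismatch warning from the diff is omitted to keep the example short; this is an illustration of `contextlib.ExitStack`, not the code that was merged.

```python
import contextlib


def merge_files(gold_path: str, pred_path: str, out_path: str) -> None:
    """Writes word, gold pronunciation, and predicted pronunciation as TSV.

    Hypothetical rewrite for illustration; not part of the PR.
    """
    with contextlib.ExitStack() as stack:
        # Each file is registered with the stack and closed when the
        # `with` block exits, even if a later open() or the loop raises.
        gf = stack.enter_context(open(gold_path, "r"))
        pf = stack.enter_context(open(pred_path, "r"))
        wf = stack.enter_context(open(out_path, "w"))
        for g_line, p_line in zip(gf, pf):
            g_word, g_pron = g_line.split("\t")
            p_word, p_pron = p_line.split("\t")
            if g_word != p_word:
                # Skip mismatched rows (the diff also logs a warning here).
                continue
            print(f"{g_word}\t{g_pron.strip()}\t{p_pron.strip()}", file=wf)
```

With only three files, the nested `with` statements in the diff are arguably just as readable; `ExitStack` tends to pay off when the number of files is larger or not known in advance.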