Some language codes not recognized by iso639.Language.match() #498

Closed
sonofthomp opened this issue Jul 14, 2023 · 4 comments

Comments

@sonofthomp
Contributor

Running codes.py yielded the following error:

(base) gabrielthompson@Gabriels-MBP-2 lib % python codes.py 
codes.py WARNING: WikiPron resolves the key 'ain' to 'Ainu (Japan)' listed as 'Ainu' on Wiktionary
codes.py WARNING: WikiPron resolves the key 'rup' to 'Macedo-Romanian' listed as 'Aromanian' on Wiktionary
codes.py WARNING: WikiPron resolves the key 'bjb' to 'Banggarla' listed as 'Barngarla' on Wiktionary
Traceback (most recent call last):
  File "/Users/gabrielthompson/Desktop/Coding/research/wikipron3/data/scrape/lib/codes.py", line 215, in <module>
    main()
  File "/Users/gabrielthompson/Desktop/Coding/research/wikipron3/data/scrape/lib/codes.py", line 177, in main
    iso639_lang = iso639.Language.match(wiktionary_code)
  File "/Users/gabrielthompson/anaconda3/lib/python3.10/site-packages/iso639/language.py", line 120, in match
    return _get_language(user_input, query_order)
  File "/Users/gabrielthompson/anaconda3/lib/python3.10/site-packages/iso639/language.py", line 189, in _get_language
    raise LanguageNotFoundError(
iso639.language.LanguageNotFoundError: 'gmw-cfr' isn't an ISO language code or name

For whatever reason, the iso639 module isn't recognizing some of the language codes from the Wiktionary API. Someone should look into why this is, or maybe we should omit languages that don't have valid language codes.

@kylebgorman
Collaborator

I hadn't seen that fatal exception before. I think we should probably catch it and convert it to a warning. What do you think?

@sonofthomp
Contributor Author

Giving a warning sounds like a good idea. In place of iso639_lang = iso639.Language.match(wiktionary_code), I'm thinking something like:

try:
    iso639_lang = iso639.Language.match(wiktionary_code)
except iso639.language.LanguageNotFoundError:
    logging.warning(
        "Could not find language with code %s", wiktionary_code
    )

... so that in the case of gmw-cfr, the following is output:

codes.py WARNING: Could not find language with code gmw-cfr
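
One caveat with the snippet above: after the warning, iso639_lang is either unset or still holds the previous iteration's value, so the loop body should also skip the unrecognized code. Below is a minimal sketch of that pattern, not the actual codes.py. The helper resolve_codes, the stub matcher, and StubNotFoundError are hypothetical names used only so the example runs without the iso639 package; in the real script one would presumably pass iso639.Language.match and iso639.language.LanguageNotFoundError instead.

```python
import logging
from typing import Callable, Dict, Iterable, Type


def resolve_codes(
    codes: Iterable[str],
    match: Callable[[str], object],
    not_found_error: Type[Exception],
) -> Dict[str, object]:
    """Maps each code to its match, skipping codes the matcher rejects."""
    resolved: Dict[str, object] = {}
    for code in codes:
        try:
            language = match(code)
        except not_found_error:
            logging.warning("Could not find language with code %s", code)
            # Skip this code so no unset or stale value is used below.
            continue
        resolved[code] = language
    return resolved


# Stand-in matcher for demonstration only (hypothetical, not part of iso639).
class StubNotFoundError(Exception):
    pass


# Two entries taken from the warnings in the traceback above.
_KNOWN = {"ain": "Ainu (Japan)", "rup": "Macedo-Romanian"}


def stub_match(code: str) -> str:
    try:
        return _KNOWN[code]
    except KeyError:
        raise StubNotFoundError(code)


result = resolve_codes(["ain", "gmw-cfr", "rup"], stub_match, StubNotFoundError)
```

Here gmw-cfr only triggers a warning and is left out of result, while the remaining codes are still processed.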

@kylebgorman
Collaborator

codes.py WARNING: Could not find language with code gmw-cfr

This proposal LGTM.

sonofthomp pushed a commit to sonofthomp/wikipron that referenced this issue Jul 17, 2023
Fixed issue in [issue CUNY-CL#498](CUNY-CL#498)

The iso639 module occasionally doesn't recognize ISO language codes. Updated the code to catch this error and display a warning.
sonofthomp pushed a commit to sonofthomp/wikipron that referenced this issue Jul 17, 2023
kylebgorman pushed a commit that referenced this issue Jul 17, 2023
* Catch iso639.language.LanguageNotFoundError

Fixed issue in [issue #498](#498)

The iso639 module occasionally doesn't recognize ISO language codes. Updated the code to catch this error and display a warning.

* Added 498, moved 497

* Fixed checks

* Continue in case of invalid language
@kylebgorman
Collaborator

Closed in #499, I believe.

2 participants