[not a bug?] MeCab can't be used for word splitting and pronunciation unless the language is named exactly "Japanese" #103

Closed
alk0 opened this issue Mar 18, 2023 · 3 comments
Labels
ux User Experience could be better

Comments

@alk0

alk0 commented Mar 18, 2023

Describe the bug

It is possible (and may be desirable) to create a duplicate copy of the same L2 (study) language, e.g. one Japanese-English pair split on spaces and one with MeCab parsing (my case), or Japanese-English and Japanese-(other language), or whatever. The problem is that MeCab is available only when the name of the language is exactly "Japanese". For word splitting, the option to select a method disappears the second the name is changed; for pronunciation, it just stops working. Maybe I'm doing something wrong, but this is how it looks to me.

To Reproduce

  1. Go to Languages, create a new "Japanese" language (or use an existing one)
  2. Change the name in the Study Language "L2": field to "Japanese1" or whatever
  3. The option to select a method for RegExp Word Characters: becomes unavailable immediately, and MeCab-generated pronunciation (the one for the Romaniz.: field) stops working.

Expected behavior

MeCab is still available when the name of the language is changed.

Proposal for a fix

As a quick fix: maybe the easiest would be to check whether the language name starts with "Japanese" or contains it, instead of requiring it to equal "Japanese" exactly?
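
To make the proposal concrete, here is a minimal sketch of the three matching strategies mentioned above (exact, prefix, contains). It is written in Python for illustration only; LWT's real check lives in its own codebase, and the function name `name_is_japanese` and the `mode` parameter are hypothetical.

```python
def name_is_japanese(name: str, mode: str = "exact") -> bool:
    """Hypothetical language-name check; not LWT's actual code.

    mode="exact"    -> behavior described in this issue (name must equal "Japanese")
    mode="prefix"   -> proposed quick fix: name starts with "Japanese"
    mode="contains" -> proposed quick fix: name contains "Japanese"
    """
    if mode == "exact":
        return name == "Japanese"
    if mode == "prefix":
        return name.startswith("Japanese")
    if mode == "contains":
        return "Japanese" in name
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # Compare how each strategy treats a renamed copy of the language.
    for candidate in ("Japanese", "Japanese1", "My Japanese (MeCab)", "Korean"):
        results = {m: name_is_japanese(candidate, m) for m in ("exact", "prefix", "contains")}
        print(candidate, results)
```

With a relaxed check, a renamed copy such as "Japanese1" would still be recognized, at the cost of also matching any unrelated language whose name happens to include the word.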

@HugoFara HugoFara added the ux User Experience could be better label Mar 22, 2023
@HugoFara
Owner

Hi!

Thanks for reaching out! Honestly, I knew this behavior would cause some issues, but I did not have a straightforward solution when I wrote it. Simply put, the question is "how should LWT know that the studied language is Japanese?". For the language selection field I'm using the language name, but some other parts of the code use dictionary links, etc.

As of today, the only way to have things work properly is to have your "Japanese" language set with "mecab" as the regex. You can set any name for Japanese with a generic regex parser.

I haven't decided on a long-term solution yet; I hope my suggestions will be enough for now. Don't hesitate to contact me again if it's still a blocking issue!

@alk0
Author

alk0 commented Mar 22, 2023

> You can set any name for Japanese with a generic regex parser.

Yes, it solves half of the problem, but MeCab-generated spelling is still unavailable in that case, which is rather inconvenient :(

OK, I'll try to come up with some temporary fix for myself maybe. Or maybe I'll just live with it.

@HugoFara HugoFara reopened this Apr 7, 2023
@HugoFara
Owner

HugoFara commented Apr 7, 2023

Hi! I just pushed a new commit that, though not perfect, may help you with your issues. Let's sum it up.

Japanese pronunciation

It is now activated if your parser is "mecab" and (the language name is "Japanese" or the translator URL has "ja" as the source language). "ja" is the Japanese language code; for instance, setting https://translate.google.com/?ie=UTF-8&sl=ja&tl=en&text=lwt_term as the translator URL will work.
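
The condition above can be sketched roughly as follows. This is an illustrative Python sketch, not the code from the commit: the function names are hypothetical, and it assumes the source language is carried in an `sl` query parameter, as in the Google Translate URL above. Other translator services may encode the source language differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def translator_source_language(translator_url: str) -> Optional[str]:
    """Return the value of the `sl` query parameter, or None if absent.

    Assumption: the translator encodes the source language as `sl=...`.
    """
    query = parse_qs(urlparse(translator_url).query)
    values = query.get("sl")
    return values[0] if values else None


def japanese_pronunciation_enabled(parser: str, language_name: str,
                                   translator_url: str) -> bool:
    """Hypothetical restatement of the rule described in this comment:
    parser is "mecab" AND (name is "Japanese" OR translator source is "ja").
    """
    if parser != "mecab":
        return False
    return (language_name == "Japanese"
            or translator_source_language(translator_url) == "ja")


if __name__ == "__main__":
    url = "https://translate.google.com/?ie=UTF-8&sl=ja&tl=en&text=lwt_term"
    # A renamed language still qualifies because the translator URL has sl=ja.
    print(japanese_pronunciation_enabled("mecab", "Japanese1", url))
```

Under this rule, a language named "Japanese1" keeps MeCab pronunciation as long as its parser is "mecab" and its translator URL declares "ja" as the source.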

Japanese parser

Here I think things are not inconvenient. You can always set the regex parser to "mecab" independently of the language name, so that parsing will work. I don't really think a complex change on the dev side is required here 😆

Don't hesitate to tell me if your issues persist!

@HugoFara HugoFara closed this as completed Apr 7, 2023
This was referenced Dec 31, 2023