Found one missing Chinese character. #4

Open
zydjohnHotmail opened this issue Jun 12, 2022 · 5 comments

Comments

@zydjohnHotmail

Hello:
I used your repo, and it basically works well.
However, at least one Traditional Chinese character is missing from the library.
See the following picture; the first character is missing.
Its Unicode code point seems to be: \u8F17
Please check.
Thanks,
UFOChinese

@CosineG
Owner

CosineG commented Jun 13, 2022

Do you mean the pair 𫐐 <-> 輗 is missing? I will test it later.

@zydjohnHotmail
Author

Hello:
I don't know this Chinese character. This is a name for a horse.
But if you can add it to your library, that will be fine.

@CosineG
Owner

CosineG commented Jun 26, 2022

Sorry for taking so long to reply. I tested and found that 𫐐, the Simplified form of 輗, is very rare and located in Unicode CJK Extension C, so many fonts don't contain it and jieba.NET can't parse it correctly. Although this conversion pair now exists in the dictionary, it still doesn't work properly.
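
For anyone who wants to check the Extension C claim, here is a minimal sketch in Python (not code from this repo or from jieba.NET). The helper names `utf16_units` and `describe` are hypothetical. It prints each character's code point, whether it falls in the CJK Unified Ideographs Extension C block (U+2A700 to U+2B73F), and how many UTF-16 code units it needs. Characters outside the Basic Multilingual Plane take two UTF-16 code units (a surrogate pair), which is how .NET strings store them. Whether that is actually why jieba.NET mishandles 𫐐 is an assumption and is not confirmed in this thread.

```python
# Illustrative check only; not taken from this repository or from jieba.NET.
TRADITIONAL = "輗"  # the character reported in this issue (\u8F17)
SIMPLIFIED = "𫐐"   # its Simplified counterpart mentioned above

# CJK Unified Ideographs Extension C block, inclusive of both ends.
EXT_C = range(0x2A700, 0x2B73F + 1)


def utf16_units(ch: str) -> int:
    """Return how many 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-le")) // 2


def describe(ch: str) -> dict:
    """Summarize where a single character sits in Unicode."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        "in_extension_c": cp in EXT_C,
        "utf16_units": utf16_units(ch),
    }


if __name__ == "__main__":
    for ch in (TRADITIONAL, SIMPLIFIED):
        print(ch, describe(ch))
```

If the Simplified character reports two UTF-16 code units, any code that walks a string one `char` at a time would see two halves of a surrogate pair instead of one character, which may be worth checking when reporting this upstream.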

@zydjohnHotmail
Author

Do you have any solution?

@CosineG
Owner

CosineG commented Jun 26, 2022

Maybe you can raise an issue at anderscui/jieba.NET? Also, I don't know whether the font on your computer supports the character.
