Russian words in Ukrainian files #15

Open
a13 opened this issue Jan 30, 2020 · 3 comments

Comments

a13 commented Jan 30, 2020

The Ukrainian files contain Russian-only words, i.e. words that do not exist in the Ukrainian language.

The simplest first-order filter is to ignore words containing the letters ё, ъ, ы, or э, none of which are part of the Ukrainian alphabet.
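
A minimal sketch of that filter in Python, assuming the words are already available as plain strings. The names `RUSSIAN_ONLY_LETTERS`, `looks_russian_only`, and `filter_ukrainian` are made up for illustration and are not part of this repository:

```python
# Letters used in Russian that are absent from the Ukrainian alphabet.
RUSSIAN_ONLY_LETTERS = frozenset("ёъыэ")


def looks_russian_only(word: str) -> bool:
    """Return True if the word contains any of ё, ъ, ы, э (case-insensitive)."""
    return any(ch in RUSSIAN_ONLY_LETTERS for ch in word.lower())


def filter_ukrainian(words):
    """Keep only the words that pass the letter-based filter."""
    return [w for w in words if not looks_russian_only(w)]


# Hypothetical sample mixing Ukrainian and Russian words.
sample = ["привіт", "это", "мы", "їжак", "объект", "ёлка"]
print(filter_ukrainian(sample))
```

This only catches Russian words that happen to contain one of the four letters, so it is a first pass rather than a complete fix.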

@hermitdave
Owner

Let's add these filters for the Ukrainian files.

@hermitdave
Owner

I will be creating another dataset in the near future and will add this to the Ukrainian language processing.

@bicolino34

@hermitdave I also found some words with trailing punctuation marks: "знав.", "розумію.", "слів.", "зробити."
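
One possible cleanup, sketched in Python under the assumption that each entry is a single token string: strip punctuation from both ends of the token and leave the inside alone, so an apostrophe within a word survives. `strip_edge_punctuation` is a hypothetical helper, not something this repository provides:

```python
import unicodedata


def strip_edge_punctuation(token: str) -> str:
    """Remove punctuation characters from both ends of a token."""
    start, end = 0, len(token)
    # Unicode general categories starting with "P" are punctuation.
    while start < end and unicodedata.category(token[start]).startswith("P"):
        start += 1
    while end > start and unicodedata.category(token[end - 1]).startswith("P"):
        end -= 1
    return token[start:end]


# The examples reported above.
for token in ["знав.", "розумію.", "слів.", "зробити."]:
    print(strip_edge_punctuation(token))
```

After stripping, an entry such as "знав." would collide with "знав", so the counts of the two would presumably need to be merged.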
