Investigate Common Voice sentence cleaning rules #257

Open
Tracked by #216
marco-c opened this issue Nov 9, 2023 · 0 comments
Labels
quality Improving robustness and translation quality

Comments

Collaborator

marco-c commented Nov 9, 2023

Maybe we could use some of the Common Voice sentence cleaning rules to clean our datasets.

They live in https://github.com/common-voice/cv-sentence-extractor/tree/main/src/rules.
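To make the idea concrete, here is a rough sketch of what rule-based filtering of a dataset could look like. It does not reproduce the actual rule files or their schema: the `Rules` fields, the default values, and the `is_clean` and `clean_pairs` helpers are all hypothetical names chosen for illustration, and applying the rules to sentence pairs is an assumption about how our datasets would be cleaned.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rules:
    # Hypothetical rule set for one language. The field names and defaults
    # are illustrative only, not the schema used by cv-sentence-extractor.
    min_word_count: int = 1
    max_word_count: int = 14
    disallowed_symbols: set = field(default_factory=lambda: set("<>{}[]|@#"))
    needs_punctuation_end: bool = False
    disallow_digits: bool = True


def is_clean(sentence: str, rules: Rules) -> bool:
    """Return True if the sentence passes every check in the rule set."""
    text = sentence.strip()
    words = text.split()
    if not rules.min_word_count <= len(words) <= rules.max_word_count:
        return False
    if any(ch in rules.disallowed_symbols for ch in text):
        return False
    if rules.disallow_digits and re.search(r"\d", text):
        return False
    if rules.needs_punctuation_end and not text.endswith((".", "?", "!")):
        return False
    return True


def clean_pairs(pairs, src_rules: Rules, trg_rules: Rules):
    """Keep only the pairs where both sides pass their language's rules."""
    return [
        (src, trg)
        for src, trg in pairs
        if is_clean(src, src_rules) and is_clean(trg, trg_rules)
    ]
```

Dropping a pair when either side fails keeps the source and target aligned, at the cost of discarding some usable sentences. Whether that trade-off is acceptable would be part of the investigation.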

@gregtatum gregtatum added the quality Improving robustness and translation quality label Dec 16, 2023