Multilingual Reading Comprehension tasks #333
Merged
Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Clémentine Fourrier <[email protected]>
…l into multilnag_nli_tasks
…laswag preprocesisng
hynky1999 added a commit that referenced this pull request May 22, 2025
* add multilignaul dynamic generative metrics
* draft
* finish multichoice config
* update tokenizers + install nltk reqs
* use punkt tab
* Update src/lighteval/utils/imports.py Co-authored-by: Nathan Habib <[email protected]>
* Update src/lighteval/metrics/normalizations.py Co-authored-by: Nathan Habib <[email protected]>
* fix imports
* remove unused import
* finish implementation of templates + move stuff around
* resolve nits
* when in rome do as romans do (handle error messages the same way)
* fix utils
* nicers tests + fix them
* nicer todo
* add nice doscrings 📃
* add even more docstring
* nit
* fix test
* add multilingual to dev group
* merge nli, add languagees to literals
* translation literals
* add nli
* add copa tasks + fix tranlation literals
* add hellaswag tasks
* remove custom telgu hellaswag
* remove hindi hellaswag
* add rc tasks + small nits
* add rcb + chinese nli
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* add two new tasks + docs
* add nice docs
* update hellaswag with docs
* move hellaswag to lighteval suite
* add desc to tasks
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* enable returning none from templates + better typing
* change unoficial hellaswag names to have community_prefix + unify hellaswag preprocesisng
* let strip be optional in hellaswag
* nits
* add comment
* update the datasets after changing ownership
---------
Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Hynek Kydlicek <[email protected]>
Co-authored-by: Clémentine Fourrier <[email protected]>
NathanHB added a commit that referenced this pull request Sep 19, 2025
Goal
Add reading comprehension (RC) tasks supporting about 130 unique languages/scripts.
How to test:
where task is one of:
xquad_eng|0|0
thaiqa_tha|0|0
sber_squad_rus|0|0
arcd_ara|0|0
kenswquad_swa|0|0
chinese_squad_zho|0|0
cmrc2018_zho|0|0
indicqa_hin|0|0
fquadv2_fra|0|0
tquadv2_tur|0|0
tydiqa_eng|0|0
belebele_acm_Arab|0|0
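
For scripting a run over these tasks, the spec strings above can be split into their parts. The following is only a sketch: the `TaskSpec` type and `parse_task_spec` helper are hypothetical and not part of lighteval, and reading the two trailing fields as a few-shot count and a truncation flag is an assumption based on the `name|0|0` shape shown above.

```python
from typing import NamedTuple


class TaskSpec(NamedTuple):
    """Hypothetical container for one `name|x|y` task string."""

    name: str
    num_fewshot: int    # assumption: first numeric field is the few-shot count
    truncate_flag: int  # assumption: second numeric field is a truncation flag


def parse_task_spec(spec: str) -> TaskSpec:
    """Split a `name|x|y` string into its three pipe-separated fields."""
    name, fewshot, flag = spec.split("|")
    return TaskSpec(name, int(fewshot), int(flag))


# Task strings copied from the list in this PR description.
TASKS = [
    "xquad_eng|0|0",
    "thaiqa_tha|0|0",
    "sber_squad_rus|0|0",
    "arcd_ara|0|0",
    "kenswquad_swa|0|0",
    "chinese_squad_zho|0|0",
    "cmrc2018_zho|0|0",
    "indicqa_hin|0|0",
    "fquadv2_fra|0|0",
    "tquadv2_tur|0|0",
    "tydiqa_eng|0|0",
    "belebele_acm_Arab|0|0",
]

specs = [parse_task_spec(s) for s in TASKS]

# A single comma-separated string is convenient if several tasks are run at once.
joined = ",".join(TASKS)

if __name__ == "__main__":
    for spec in specs:
        print(spec.name, spec.num_fewshot, spec.truncate_flag)
```

Parsing the strings up front catches a malformed entry (wrong number of fields, non-integer value) before any evaluation starts.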