Multilingual NLI Tasks #329
Co-authored-by: Nathan Habib <[email protected]>
bfd34e0 to 7faaa8a
Lgtm, feel free to add a bit more doc (for example, the above arxiv links to the corresponding classes). Suite should also be lighteval for now since we're adding them to the core
from lighteval.utils.language import Language


# ------------------------------- NLI Tasks ------------------------------- #
Could be nice to add just a bit of intro doc at the top of the file to explain what these tasks are overall about (= what is NLI, which datasets are used, etc)
Co-authored-by: Clémentine Fourrier <[email protected]>
…l into multilnag_nli_tasks
3488e7d to 7b561fe
* add multilingual dynamic generative metrics
* draft
* finish multichoice config
* update tokenizers + install nltk reqs
* use punkt tab
* Update src/lighteval/utils/imports.py Co-authored-by: Nathan Habib <[email protected]>
* Update src/lighteval/metrics/normalizations.py Co-authored-by: Nathan Habib <[email protected]>
* fix imports
* remove unused import
* finish implementation of templates + move stuff around
* resolve nits
* when in rome do as romans do (handle error messages the same way)
* fix utils
* nicer tests + fix them
* nicer todo
* add nice docstrings 📃
* add even more docstring
* nit
* fix test
* add multilingual to dev group
* merge nli, add languages to literals
* translation literals
* add nli
* add rcb + chinese nli
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* Update src/lighteval/tasks/multilingual/tasks.py Co-authored-by: Clémentine Fourrier <[email protected]>
* add two new tasks + docs
---------
Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Hynek Kydlicek <[email protected]>
Co-authored-by: Clémentine Fourrier <[email protected]>
Goal
Add 3 NLI tasks supporting 26 unique languages.
Although xnli2.0 is the better dataset, I decided to keep xnli as well, since some people may still want to use it.
Because the tasks are built on templates, they support all 3 formulation types out of the box :)
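To make the "3 types of formulation" concrete, here is a rough sketch of how one NLI example could be rendered in each style. It assumes the three types are cloze (`cf`), multiple-choice (`mcf`), and hybrid; `render_nli`, the prompt wording, and the label strings are all hypothetical and are not the templates added in this PR.

```python
# Hypothetical illustration only: this is NOT lighteval's template code.
LETTERS = "ABCDEFGHIJ"


def render_nli(premise, hypothesis, labels, formulation):
    """Return (prompt, candidate continuations) for one formulation.

    cf     : no options shown; the model scores each label text.
    mcf    : options shown with letters; the model scores each letter.
    hybrid : options shown with letters; the model scores each label text.
    """
    question = f"Premise: {premise}\nHypothesis: {hypothesis}\n"
    if formulation == "cf":
        return question + "Answer:", [f" {label}" for label in labels]

    # Both mcf and hybrid list the options inside the prompt.
    options = "".join(f" {LETTERS[i]}. {label}\n" for i, label in enumerate(labels))
    prompt = question + options + "Answer:"
    if formulation == "mcf":
        return prompt, [f" {LETTERS[i]}" for i in range(len(labels))]
    if formulation == "hybrid":
        return prompt, [f" {label}" for label in labels]
    raise ValueError(f"unknown formulation: {formulation!r}")


if __name__ == "__main__":
    labels = ["entailment", "neutral", "contradiction"]
    for name in ("cf", "mcf", "hybrid"):
        prompt, choices = render_nli("A man is sleeping.", "A man is awake.", labels, name)
        print(f"--- {name} ---\n{prompt}\n{choices}\n")
```

The point of the sketch is that only the rendering step changes between formulations, which is why a template-based task gets all of them without extra per-dataset code.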
How to test:
where task is one of:
indicnxnli_tel_cf
xnli_en_cf
xnli2.0_en_cf
ocnli_zho_cf
cmnli_zho_cf
rcb_rus_cf
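The exact test command is not preserved above. As a small aid, the sketch below splits each task name into its parts and builds a task specifier string for it. The `suite|task|num_fewshot|truncate_fewshot` layout is an assumption about lighteval's task-string convention at the time (the `lighteval` suite name comes from the review comment), and `task_spec` and `parse_task_name` are hypothetical helpers, not part of the library.

```python
# Task names copied from the PR description.
TASKS = [
    "indicnxnli_tel_cf",
    "xnli_en_cf",
    "xnli2.0_en_cf",
    "ocnli_zho_cf",
    "cmnli_zho_cf",
    "rcb_rus_cf",
]


def parse_task_name(task):
    """Split '<dataset>_<language>_<formulation>' from the right."""
    dataset, language, formulation = task.rsplit("_", 2)
    return {"dataset": dataset, "language": language, "formulation": formulation}


def task_spec(task, suite="lighteval", num_fewshot=0, truncate_fewshot=0):
    """Build a specifier string; the pipe-separated layout is an assumption."""
    return f"{suite}|{task}|{num_fewshot}|{truncate_fewshot}"


if __name__ == "__main__":
    for task in TASKS:
        print(task_spec(task), parse_task_name(task))
```

Splitting from the right keeps dataset names that contain dots, such as `xnli2.0`, intact.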