hynky1999
Collaborator

@hynky1999 hynky1999 commented Sep 25, 2024

Goal

Add 3 NLI tasks supporting 26 unique languages.

While xnli2.0 is superior, I decided to keep xnli, as some people might still want to use it.
Since it uses the template, it supports all 3 types of formulation out of the box :)
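
A rough illustration of what "3 types of formulation" can mean for an NLI item, assuming they correspond to cloze (CF), multiple-choice (MCF) and hybrid prompting. This is plain Python and does not use the lighteval template API; `NLIExample`, `render`, the label wording and the prompt layout are all hypothetical and only meant to show how the same item can be presented in different ways.

```python
from dataclasses import dataclass

# Hypothetical label set; real datasets may order or name labels differently.
LABELS = ["entailment", "neutral", "contradiction"]
LETTERS = ["A", "B", "C"]


@dataclass
class NLIExample:
    premise: str
    hypothesis: str
    gold: int  # index into LABELS


def render(example: NLIExample, formulation: str) -> tuple[str, list[str]]:
    """Return (prompt, candidate continuations) for one formulation.

    Assumed meaning of the formulations (not taken from the PR):
      cf     - options are not shown; the full label text is scored
      mcf    - options are shown with letters; the letter is scored
      hybrid - options are shown with letters; the full label text is scored
    """
    question = f"Premise: {example.premise}\nHypothesis: {example.hypothesis}\n"
    options = "".join(f"{letter}. {label}\n" for letter, label in zip(LETTERS, LABELS))
    if formulation == "cf":
        return question + "Answer:", [f" {label}" for label in LABELS]
    if formulation == "mcf":
        return question + options + "Answer:", [f" {letter}" for letter in LETTERS]
    if formulation == "hybrid":
        return question + options + "Answer:", [f" {label}" for label in LABELS]
    raise ValueError(f"unknown formulation: {formulation}")


example = NLIExample(
    premise="A man is playing a guitar.",
    hypothesis="A person is making music.",
    gold=0,
)

for name in ("cf", "mcf", "hybrid"):
    prompt, continuations = render(example, name)
    print(f"--- {name} ---")
    print(prompt)
    print("candidates:", continuations, "| gold:", continuations[example.gold])
```

The tasks listed in this PR all carry the `_cf` suffix, so only the first branch is relevant for the test command in the next section.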

How to test:

lighteval accelerate --output_dir=./tmp --custom_tasks="lighteval.tasks.multilingual.tasks" --tasks="custom|{task}|0|0" --model_args=pretrained=gpt2 --override_batch_size=1 --max_samples=100 --save_details

where {task} is one of: indicnxnli_tel_cf xnli_en_cf xnli2.0_en_cf ocnli_zho_cf cmnli_zho_cf rcb_rus_cf
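
To avoid editing the command by hand for each task, something like the following sketch could generate all six invocations. The flags and task names are copied from the command above; the `build_command` helper and its `suite` parameter are hypothetical, added only so the suite name can be swapped without touching the rest of the command.

```python
import shlex

# Task names copied from the list above.
TASKS = [
    "indicnxnli_tel_cf",
    "xnli_en_cf",
    "xnli2.0_en_cf",
    "ocnli_zho_cf",
    "cmnli_zho_cf",
    "rcb_rus_cf",
]


def build_command(task: str, suite: str = "custom") -> list[str]:
    """Build the argument list for one task (hypothetical helper).

    The task spec keeps the "suite|task|0|0" shape used in the command above.
    """
    return [
        "lighteval",
        "accelerate",
        "--output_dir=./tmp",
        "--custom_tasks=lighteval.tasks.multilingual.tasks",
        f"--tasks={suite}|{task}|0|0",
        "--model_args=pretrained=gpt2",
        "--override_batch_size=1",
        "--max_samples=100",
        "--save_details",
    ]


commands = [build_command(task) for task in TASKS]

# Print shell-safe command lines; shlex.join quotes the "|" characters.
for cmd in commands:
    print(shlex.join(cmd))

# To actually run them (requires lighteval to be installed):
#   import subprocess
#   for cmd in commands:
#       subprocess.run(cmd, check=True)
```

Passing the argument list to `subprocess.run` directly, rather than a single shell string, means the `|` separators in the task spec never reach a shell and need no manual quoting.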

Comments

  • We talked about removing the suite altogether. Since we still use suites right now, I decided to use the custom suite. I can switch to multilingual or whatever

@hynky1999 hynky1999 changed the base branch from main to config_templates September 25, 2024 11:12
Member

@clefourrier clefourrier left a comment

LGTM, feel free to add a bit more doc (for example, the above arXiv links to the corresponding classes). The suite should also be lighteval for now, since we're adding them to the core

from lighteval.utils.language import Language


# ------------------------------- NLI Tasks ------------------------------- #
Member

Could be nice to add just a bit of intro doc at the top of the file to explain what these tasks are overall about (= what is NLI, which datasets are used, etc)

@hynky1999 hynky1999 changed the base branch from config_templates to main September 30, 2024 17:20
@hynky1999 hynky1999 merged commit 551572a into main Sep 30, 2024
2 checks passed
hynky1999 added a commit that referenced this pull request May 22, 2025
* add multilingual dynamic generative metrics

* draft

* finish multichoice config

* update tokenizers + install nltk reqs

* use punkt tab

* Update src/lighteval/utils/imports.py

Co-authored-by: Nathan Habib <[email protected]>

* Update src/lighteval/metrics/normalizations.py

Co-authored-by: Nathan Habib <[email protected]>

* fix imports

* remove unused import

* finish implementation of templates + move stuff around

* resolve nits

* when in rome do as romans do (handle error messages the same way)

* fix utils

* nicer tests + fix them

* nicer todo

* add nice docstrings 📃

* add even more docstring

* nit

* fix test

* add multilingual to dev group

* merge nli, add languages to literals

* translation literals

* add nli

* add rcb + chinese nli

* Update src/lighteval/tasks/multilingual/tasks.py

Co-authored-by: Clémentine Fourrier <[email protected]>

* Update src/lighteval/tasks/multilingual/tasks.py

Co-authored-by: Clémentine Fourrier <[email protected]>

* Update src/lighteval/tasks/multilingual/tasks.py

Co-authored-by: Clémentine Fourrier <[email protected]>

* Update src/lighteval/tasks/multilingual/tasks.py

Co-authored-by: Clémentine Fourrier <[email protected]>

* Update src/lighteval/tasks/multilingual/tasks.py

Co-authored-by: Clémentine Fourrier <[email protected]>

* Update src/lighteval/tasks/multilingual/tasks.py

Co-authored-by: Clémentine Fourrier <[email protected]>

* Update src/lighteval/tasks/multilingual/tasks.py

Co-authored-by: Clémentine Fourrier <[email protected]>

* add two new tasks + docs

---------

Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Hynek Kydlicek <[email protected]>
Co-authored-by: Clémentine Fourrier <[email protected]>
NathanHB added a commit that referenced this pull request Sep 19, 2025
3 participants