alielfilali01
Contributor

This PR adds support for the AlGhafa benchmarking suite, which consists of 11 datasets presented in this paper and hosted in this repo on the Hub.

@clefourrier
Member

Do you want us to wait for Alghafa 2 to merge this?

@alielfilali01
Contributor Author

Do you want us to wait for Alghafa 2 to merge this?

Yes please, @clefourrier. I will take some time before Saturday to add the new version of the benchmark.

@clefourrier
Member

No hurry, take your time!

@alielfilali01 alielfilali01 marked this pull request as draft March 6, 2024 20:54
@alielfilali01 alielfilali01 marked this pull request as ready for review March 8, 2024 18:35
@alielfilali01
Contributor Author

Hello @clefourrier, I believe this PR is ready to be merged.

alielfilali01 and others added 12 commits March 11, 2024 23:49
Add Support for the AlGhafa benchmarking suite
Adding support to the AlGhafa benchmarking suite
remove translated from AlGhafa
This file now contains all the Arabic tasks, including tasks not present in OALL_tasks.txt
Add support for ALGHAFA TRANSLATED  tasks
Add support to AlGhafa Translated benchmark suite (11 subsets)
minor fixes flagged by the pre-commit hook
forgot to remove 
`community|Alghafa:multiple_choice_copa_translated_task|5|1`
& `community|Alghafa:multiple_choice_openbookqa_translated_task|5|1` from ALGHAFA NATIVE
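
For readers unfamiliar with the task strings in the commit messages above, here is a minimal sketch of how such an entry could be split into fields and how translated tasks could be filtered out of a native task list. The field meanings (suite, task name, few-shot count, truncation flag) are an assumption about the `suite|task|few_shot|flag` layout that this thread does not confirm. `TaskSpec`, `parse_task_spec`, and `drop_translated` are hypothetical helpers and not part of lighteval, and the third list entry is invented for illustration.

```python
from typing import NamedTuple


class TaskSpec(NamedTuple):
    suite: str
    task: str
    few_shot: int
    flag: int  # assumed meaning: allow reducing few-shot examples if the prompt is too long


def parse_task_spec(spec: str) -> TaskSpec:
    """Split a 'suite|task|few_shot|flag' string into its four fields."""
    suite, task, few_shot, flag = spec.strip().split("|")
    return TaskSpec(suite, task, int(few_shot), int(flag))


def drop_translated(specs: list[str]) -> list[str]:
    """Keep only entries whose task name does not mark a translated subset."""
    return [s for s in specs if "_translated_" not in parse_task_spec(s).task]


native = [
    "community|Alghafa:multiple_choice_copa_translated_task|5|1",
    "community|Alghafa:multiple_choice_openbookqa_translated_task|5|1",
    "community|Alghafa:example_native_task|5|1",  # hypothetical entry, for illustration only
]
print(drop_translated(native))
```

Filtering on the parsed task name rather than on the raw string keeps the check independent of the suite prefix and the numeric fields.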
Member

@clefourrier clefourrier left a comment

LGTM, but you need to homogenize your naming:

  • Prompt names such as boolq_function will be unclear in the long term. For such functions, you could use either boolq_prompt_arabic or just boolq_arabic. (You need to specify the language since there is already a default boolq prompt function.)
  • You also need to homogenize Alghafa, which appears with several different casings, and fit it to Python-style casing. For the prompt function, I'd keep it as alghafa_prompt or alghafa; for the class, CustomAlGhafaTask; and for the name here, I'd keep it lowercase:
    [CustomAlGhafaTask(name=f"alghafa:{subset}", hf_subset=subset) for subset in ALGHAFA_SUBSETS]
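
The naming scheme suggested in this review can be sketched as follows. This is an illustration only. The dataclass is a stand-in for the PR's class, which per the commits subclasses `LightevalTaskConfig`. The prompt body and its `query`/`choices` keys are hypothetical. The two subset names are borrowed from the task strings quoted in the commit messages, and the real list is longer.

```python
from dataclasses import dataclass

# Subset names borrowed from task strings quoted in this PR; not the full list.
ALGHAFA_SUBSETS = [
    "multiple_choice_copa_translated_task",
    "multiple_choice_openbookqa_translated_task",
]


def alghafa_prompt(line: dict) -> str:
    """snake_case prompt function; the 'query'/'choices' layout is hypothetical."""
    return f"{line['query']}\n" + "\n".join(line["choices"])


@dataclass
class CustomAlGhafaTask:
    """CapWords class name; stand-in for a LightevalTaskConfig subclass."""
    name: str
    hf_subset: str


# Lowercase task names, one task per subset.
TASKS = [CustomAlGhafaTask(name=f"alghafa:{subset}", hf_subset=subset) for subset in ALGHAFA_SUBSETS]
print([task.name for task in TASKS])
```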

alielfilali01 and others added 8 commits March 12, 2024 10:00
homogenize naming according to the following comments:

####
Prompt names such as boolq_function will be unclear long term. For such functions, you could either use boolq_prompt_arabic or just boolq_arabic. (You need to specify the language since there is already a boolq prompt function by default.)

You also need to homogenize Alghafa, which exists with several different casings, and fit it to Python style casing. For the prompt function, I'd keep it as alghafa_prompt or alghafa, for the class, CustomAlGhafaTask, and here for the name I'd keep it lower case
[CustomAlGhafaTask(name=f"alghafa:{subset}", hf_subset=subset) for subset in ALGHAFA_SUBSETS]
####
homogenize AlGhafa naming: `Alghafa` to `alghafa`
Co-authored-by: Clémentine Fourrier <[email protected]>
Member

@clefourrier clefourrier left a comment

Hi. This needs a few more changes; I tried to make what is requested clearer.
I also added comments about task-level instructions that I had missed previously.

alielfilali01 and others added 4 commits March 14, 2024 13:22
use the standard camel casing for classes:

(remove) class CustomALGHAFATask(LightevalTaskConfig):

(add) class CustomAlGhafaTask(LightevalTaskConfig):

Co-authored-by: Clémentine Fourrier <[email protected]>
Fixes based on Clementine's comments
@alielfilali01
Contributor Author

@clefourrier I hope this addresses your comments. Please feel free to ping me if I missed anything (I have a tendency to forget 😅).
Again, thanks a lot for your efforts 🤗

@clefourrier
Member

Looks better, thank you!
Do you have some reference models and scores against which I could check the implementation?
Or did you check it, and against which models? :)

@alielfilali01
Contributor Author

Looks better, thank you! Do you have some reference models and scores against which I could check the implementation? Or did you check it, and against which models? :)

Yes @clefourrier, I tested gpt2 using --max_samples=1 and everything ran fine. I believe Hamza is testing bigger models and will push the results to the Hub for further inspection. I'll update you as soon as I hear back from him.

@clefourrier
Member

Sounds good, feel free to ping me whenever :)

@clefourrier clefourrier self-requested a review March 27, 2024 06:20
Member

@clefourrier clefourrier left a comment

LGTM, thanks for the edits and tests!

@clefourrier clefourrier merged commit ef631cf into huggingface:main Mar 27, 2024
@thevexx
thevexx commented Apr 12, 2024

The AlGhafa eval dataset is no longer available on Hugging Face. Are there any alternatives?

@alielfilali01
Contributor Author

The AlGhafa eval dataset is no longer available on Hugging Face. Are there any alternatives?

Hi there, can you please provide more context? I have checked the eval code and it seems to work fine.

@thevexx

thevexx commented Apr 13, 2024

Hi there, can you please provide more context? I have checked the eval code and it seems to work fine.

Hi, yesterday the datasets disappeared from the OALL Hugging Face account. Now I can see them again, thanks.

@alielfilali01
Contributor Author

Hi there, can you please provide more context? I have checked the eval code and it seems to work fine.

Hi, yesterday the datasets disappeared from the OALL Hugging Face account. Now I can see them again, thanks.

Ooh, I see. I had to make the datasets private for about 20 minutes yesterday because I was testing something. What a coincidence that you checked at the same time 😅
Sorry for the inconvenience 🤗

hynky1999 pushed a commit that referenced this pull request May 22, 2025
- Add Support for the AlGhafa benchmarking suite

---------

Co-authored-by: Clémentine Fourrier <[email protected]>
NathanHB pushed a commit that referenced this pull request Sep 19, 2025
- Add Support for the AlGhafa benchmarking suite

---------

Co-authored-by: Clémentine Fourrier <[email protected]>