alozowski
Collaborator

This PR enables evaluation of MCQ-style QA datasets generated by Yourbench, expanding beyond open-ended QA.

Key Changes

  • New prompt template for MCQs using XML-style tags.
  • Custom accuracy logic via custom_metric_compute
  • Answer parsing handled via extract_content_from_xml_tags helper
  • Integrated into LightevalTaskConfig as yourbench_mcq
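
For readers unfamiliar with the approach, the parsing and accuracy steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the tag name `answer`, the function names `extract_tag_content` and `mcq_accuracy`, and the case-insensitive letter comparison are all assumptions made for the example, and the actual signatures of `extract_content_from_xml_tags` and `custom_metric_compute` may differ.

```python
import re
from typing import List, Optional


def extract_tag_content(text: str, tag: str) -> Optional[str]:
    """Return the stripped content of the first <tag>...</tag> pair, or None.

    Hypothetical stand-in for an XML-tag extraction helper.
    """
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


def mcq_accuracy(predictions: List[str], golds: List[str], tag: str = "answer") -> float:
    """Fraction of predictions whose tagged choice matches the gold choice.

    The tag name "answer" is an assumption for this sketch. A prediction
    with no parsable tag counts as incorrect.
    """
    if not golds:
        return 0.0
    correct = 0
    for prediction, gold in zip(predictions, golds):
        choice = extract_tag_content(prediction, tag)
        if choice is not None and choice.upper() == gold.strip().upper():
            correct += 1
    return correct / len(golds)


if __name__ == "__main__":
    outputs = ["Reasoning... <answer>A</answer>", "no tags here", "<answer> c </answer>"]
    print(mcq_accuracy(outputs, ["A", "B", "C"]))
```

Counting unparsable outputs as wrong, rather than skipping them, keeps the denominator fixed so scores stay comparable across models that follow the output format with different reliability.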

@alozowski alozowski requested review from NathanHB and clefourrier May 16, 2025 07:51
@HuggingFaceDocBuilderDev
Collaborator

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@clefourrier clefourrier left a comment

Overall lgtm

@alozowski alozowski merged commit 317cb50 into main May 20, 2025
5 checks passed
hynky1999 added a commit that referenced this pull request May 22, 2025
* Add MCQ support to Yourbench evaluation

---------

Co-authored-by: Hynek Kydlíček <[email protected]>
NathanHB pushed a commit that referenced this pull request Sep 19, 2025
* Add MCQ support to Yourbench evaluation

---------

Co-authored-by: Hynek Kydlíček <[email protected]>
5 participants