MATH dataset used for mathematics tutorial no longer accessible #1283

Open
ChiWilliams opened this issue Feb 10, 2025 · 1 comment

Comments

@ChiWilliams

The Mathematics tutorial on the tutorial page relies on the Hugging Face version of the MATH dataset, which was recently taken down for copyright reasons. The existing code now throws an error:
Couldn't find a dataset script at LOCALPATH or any data file in the same directory. Couldn't find 'hendrycks/competition_math' on the Hugging Face Hub either
(where LOCALPATH is a path on my computer where the data would have been stored).

There are copies of the dataset on the Hub that are still accessible and that work with this tutorial, if a quick fix is necessary. (But that might run into copyright trouble?)

@jjallaire
Collaborator

Argh! The point of this tutorial is to demonstrate a custom scorer. I think we really just need another dataset that benefits from the expression equivalence solver we demonstrate. Are you aware of any good candidates? (Agreed that we wouldn't point to any of the copies.)
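
To make concrete what "benefits from expression equivalence" could mean when looking for a replacement dataset, here is a rough, standard-library-only sketch. It is not the tutorial's scorer and does not use the tutorial's API; the names `parse_number` and `answers_equivalent` are hypothetical, and it only handles plain numeric answers. A candidate dataset would be one where targets and model answers often differ in form (for example `1/2` versus `0.5`) while meaning the same thing.

```python
from fractions import Fraction
from typing import Optional


def parse_number(text: str) -> Optional[Fraction]:
    """Parse strings such as '0.5' or '1 / 2' into an exact Fraction.

    Returns None when the text is not a plain number or ratio.
    """
    try:
        return Fraction(text.strip().replace(" ", ""))
    except (ValueError, ZeroDivisionError):
        return None


def answers_equivalent(target: str, answer: str) -> bool:
    """Hypothetical check: numeric equality first, exact string match otherwise."""
    a, b = parse_number(target), parse_number(answer)
    if a is not None and b is not None:
        # Both sides are numeric, so compare exact rational values.
        return a == b
    # Non-numeric answers fall back to a literal comparison.
    return target.strip() == answer.strip()
```

Under this sketch `answers_equivalent("1/2", "0.5")` holds, while `answers_equivalent("x+1", "1+x")` does not, because the fallback is a literal string comparison. That gap for symbolic answers is exactly where a real expression equivalence scorer earns its keep, so a good replacement dataset would contain answers of that kind.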
