Datasets crashing runs due to KeyError #6124
Comments
I once had the same error and fixed it by pushing a dummy commit to my Hugging Face dataset repo.
Hi! We need a reproducer to fix this. Can you provide a link to the dataset (if it's public)?
Hi Mario, Unfortunately, the dataset in question is private until the model is trained and released. This is happening not with one dataset but with numerous hosted private datasets. I am only loading the dataset and doing nothing else at the moment. It seems to happen completely sporadically. Thank you, Enrico
Hi, I have the same error in the dataset viewer with my dataset. Has anyone solved this issue? Edit: After a dummy commit, the error changed to ConfigNamesError.
@rs9000 The problem seems to be the (large) number of commits, as explained in https://huggingface.co/docs/hub/repositories-recommendations. This can be fixed by running:
import huggingface_hub
# repo_type="dataset" is needed because the repo is a dataset, not a model
huggingface_hub.super_squash_history(repo_id="elsaEU/ELSA10M_track1", repo_type="dataset")
The issue stems from
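For anyone who wants to check the commit count before squashing, here is a minimal sketch of the same idea. It assumes an `HfApi`-like object exposing `list_repo_commits` and `super_squash_history`; the helper name `squash_if_needed` and the `MAX_COMMITS` threshold are hypothetical and chosen only for illustration, not a documented Hub limit. Note that squashing rewrites the branch history and cannot be reverted.

```python
from typing import Any

# Hypothetical threshold, for illustration only (not a documented Hub limit).
MAX_COMMITS = 1000


def squash_if_needed(
    api: Any,
    repo_id: str,
    repo_type: str = "dataset",
    max_commits: int = MAX_COMMITS,
) -> bool:
    """Squash the repo history only if it has more than `max_commits` commits.

    `api` is expected to behave like `huggingface_hub.HfApi`.
    Returns True if the history was squashed, False otherwise.
    """
    commits = api.list_repo_commits(repo_id, repo_type=repo_type)
    if len(commits) <= max_commits:
        return False
    # Irreversible: collapses all commits on the default branch into one.
    api.super_squash_history(repo_id=repo_id, repo_type=repo_type)
    return True


# Example usage (requires huggingface_hub and write access to the repo):
#   from huggingface_hub import HfApi
#   squash_if_needed(HfApi(), "your-username/your-dataset")
```

Passing the API object in as a parameter keeps the helper easy to try against a stub before pointing it at a real repo.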
Thank you @mariosasko, it works.
#6269 has been merged, so I'm closing this issue.
Describe the bug
Hi all,
I have been running into a pretty persistent issue recently when trying to load datasets.
I receive a KeyError which crashes the runs.
Any help would be greatly appreciated.
Thank you,
Enrico
Steps to reproduce the bug
Load the dataset from the Hugging Face Hub.
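A minimal sketch of what such a load might look like, with the full traceback captured so it can be attached to a report. The helper name `try_load` and the repo id in the usage comment are hypothetical placeholders, not the dataset from this issue; the loader is passed in as a parameter so the snippet can be tried without network access.

```python
import traceback
from typing import Any, Callable, Optional


def try_load(loader: Callable[..., Any], repo_id: str, **kwargs: Any) -> Optional[Any]:
    """Call `loader(repo_id, **kwargs)` and print the traceback on KeyError.

    Returns the loaded object, or None if a KeyError was raised.
    """
    try:
        return loader(repo_id, **kwargs)
    except KeyError:
        # Print the full traceback so it can be pasted into the issue.
        traceback.print_exc()
        return None


# Example usage (requires the datasets library and a logged-in token
# for private repos; the repo id below is a placeholder):
#   from datasets import load_dataset
#   ds = try_load(load_dataset, "your-username/your-private-dataset", token=True)
```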
Expected behavior
Loads the dataset.
Environment info
datasets-2.14.3
CUDA 11.8
Python 3.11