@ydshieh ydshieh commented Oct 30, 2023

What does this PR do?

Fix some tests using "common_voice".

Errors:

FileNotFoundError: https://voice-prod-bundler-ee1969a6ce8178826482b88e843c335139bd3fb4.s3.amazonaws.com/cv-corpus-6.1-2020-12-11/en.tar.gz

and

    This version of the Common Voice dataset is deprecated.
    You can download the latest one with
    >>> load_dataset("mozilla-foundation/common_voice_11_0", "en")

We could load the same dataset using "mozilla-foundation/common_voice_6_1", but that requires passing a token, which is not suitable for testing.

So I changed it to "mozilla-foundation/common_voice_11_0" and updated the expected outputs accordingly.
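
For illustration, the swap could look roughly like the sketch below. The helper name `first_samples`, its parameters, and the injectable `loader` argument are hypothetical and are not taken from the changed tests; only the two dataset IDs come from this PR. It assumes the `datasets` library is installed, and the exact arguments may need adjusting for the installed `datasets` version.

```python
from itertools import islice

# Dataset IDs mentioned in this PR.
OLD_DATASET_ID = "common_voice"  # deprecated, download now fails
NEW_DATASET_ID = "mozilla-foundation/common_voice_11_0"


def first_samples(n, dataset_id=NEW_DATASET_ID, config="en", split="train", loader=None):
    """Return the first `n` samples of a streamed dataset split.

    `loader` is a hypothetical hook so the helper can be exercised without
    network access; by default it falls back to `datasets.load_dataset`.
    """
    if loader is None:
        # Imported lazily so the module itself does not require `datasets`.
        from datasets import load_dataset

        loader = load_dataset
    # Streaming avoids downloading the full archive just to read a few samples.
    stream = loader(dataset_id, config, split=split, streaming=True)
    return list(islice(stream, n))
```

Because the samples come from a different dataset version, any hard-coded expected transcriptions or timestamps in the tests have to be regenerated, which is why the expected outputs change in this PR.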

HuggingFaceDocBuilderDev commented Oct 30, 2023

The documentation is not available anymore as the PR was closed or merged.

@ydshieh ydshieh requested a review from amyeroberts October 30, 2023 13:18

@amyeroberts amyeroberts left a comment

Thanks for updating!

We might want to have a dataset fixture that is just a small subset of this (or an equivalent) dataset, so that we remain independent of changes on the Hub like this.
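
A minimal sketch of what such a fixture could look like, assuming a few short samples are stored in a JSON file checked into the test directory. All names here (`TINY_SAMPLES`, `save_fixture`, `load_fixture`, the file name) are hypothetical, and the sample values are made up for illustration rather than taken from Common Voice.

```python
import json
import tempfile
from pathlib import Path

# Made-up samples that mimic the shape of an audio dataset row.
TINY_SAMPLES = [
    {"sentence": "hello world", "audio": {"array": [0.0, 0.1, -0.1], "sampling_rate": 16000}},
    {"sentence": "good morning", "audio": {"array": [0.2, 0.0, -0.2], "sampling_rate": 16000}},
]


def save_fixture(samples, path):
    """Write the samples to a JSON file that can live next to the tests."""
    Path(path).write_text(json.dumps(samples), encoding="utf-8")


def load_fixture(path):
    """Read the samples back without touching the Hub."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def roundtrip_demo():
    """Save and reload the fixture in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "common_voice_tiny.json"
        save_fixture(TINY_SAMPLES, path)
        return load_fixture(path)
```

With something along these lines, a test would read its inputs from the local file, so a renamed or deprecated dataset on the Hub would not break it.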

ydshieh commented Oct 30, 2023

That is in the plan :-)

@ydshieh ydshieh merged commit 5769949 into main Oct 30, 2023
@ydshieh ydshieh deleted the fix_common_voice branch October 30, 2023 14:27
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
* Use mozilla-foundation/common_voice_11_0

* Update expected values

* Update expected values

* For test_word_time_stamp_integration

---------

Co-authored-by: ydshieh <[email protected]>