
Conversation

@sgugger sgugger commented Jan 19, 2023

What does this PR do?

This PR continues the work on docstrings and removes all checkpoints from the hf-internal-testing org where they can be removed.
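For illustration, the Longformer change in this PR replaces a tiny testing checkpoint with a real one in the module-level docstring constants. The "before" value below is only an assumed example of the kind of hf-internal-testing checkpoint being removed:

# Before (assumed example of an internal testing checkpoint)
_CHECKPOINT_FOR_DOC = "hf-internal-testing/tiny-random-longformer"
# After: a real, publicly available checkpoint
_CHECKPOINT_FOR_DOC = "allenai/longformer-base-4096"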

@sgugger sgugger requested a review from ydshieh January 19, 2023 19:13
HuggingFaceDocBuilderDev commented Jan 19, 2023

The documentation is not available anymore as the PR was closed or merged.

@ydshieh ydshieh (Collaborator) left a comment

Thanks for working on this. I believe there are 2 places to correct, and one for you to decide on (Reformer).


_CHECKPOINT_FOR_DOC = "allenai/longformer-base-4096"
_CONFIG_FOR_DOC = "LongformerConfig"
_TOKENIZER_FOR_DOC = "LongformerTokenizer"
Collaborator

I believe you intended to change the file modeling_longformer.py instead of this TF file, as the TF_xxx_SAMPLE docstrings still contain something like this (at the moment):

from transformers import {processor_class}, {model_class}
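(For reference, a rough sketch from memory of how these placeholders get filled by the add_code_sample_docstrings decorator on the model's call method; the exact keyword names may differ and the output type is just an illustrative example:)

@add_code_sample_docstrings(
    processor_class=_TOKENIZER_FOR_DOC,  # this is why the constant is still needed in the TF file
    checkpoint=_CHECKPOINT_FOR_DOC,
    output_type=TFLongformerMaskedLMOutput,  # illustrative output class
    config_class=_CONFIG_FOR_DOC,
)
def call(self, input_ids=None, **kwargs):
    ...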

Collaborator Author

Ah yes, thanks!


Returns:

<Tip warning={true}>
Collaborator

I think this could use _CHECKPOINT_FOR_DOC but remove the parts that check the expected output values, and we don't need this warning?

At least, this is what you have done for some models that used tiny-random-xxx checkpoints.
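Something like the following, just sketching the pattern I mean (the output class is a placeholder and the exact keyword names may differ):

@add_code_sample_docstrings(
    checkpoint=_CHECKPOINT_FOR_DOC,
    output_type=MaskedLMOutput,  # placeholder output class
    config_class=_CONFIG_FOR_DOC,
    # expected_output / expected_loss omitted, so no output values are checked in the doc example
)
def forward(self, input_ids=None, **kwargs):
    ...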

Collaborator

(Hmm, it raises a question: when we use real checkpoints which don't contain head weights, so far we don't include the output checks, but we also don't add any warning. The warning introduced in your last PR is only used for 2 models so far, where no small enough real checkpoint exists.)

I would like to hear from you regarding this - as I remember, one major reason is to avoid user confusion about the results they get.

Collaborator Author

Here the checkpoint fails because it has is_decoder=True in its config, and we need False to use the MLM head.
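(For reference, loading such a checkpoint for masked LM would require overriding the config first; a minimal sketch, assuming the checkpoint in question is the public google/reformer-crime-and-punishment one:)

from transformers import ReformerConfig, ReformerForMaskedLM

# Assumed checkpoint; its config sets is_decoder=True, which ReformerForMaskedLM rejects
config = ReformerConfig.from_pretrained("google/reformer-crime-and-punishment")
config.is_decoder = False  # the MLM head needs a non-decoder config
model = ReformerForMaskedLM.from_pretrained(
    "google/reformer-crime-and-punishment", config=config
)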

Collaborator

OK, thanks for explaining :-)


# General docstring
_CONFIG_FOR_DOC = "UniSpeechSatConfig"
_PROCESSOR_FOR_DOC = "Wav2Vec2Processor"
Collaborator

I'm not able to comment on the exact line, but for UniSpeechSatForPreTraining, I think you missed changing Wav2Vec2FeatureExtractor (you did change this in the other model file, for UniSpeechForPreTraining).

@ydshieh ydshieh (Collaborator) left a comment

LGTM + thank you again!

@sgugger sgugger merged commit 7fc1cb1 into main Jan 20, 2023
@sgugger sgugger deleted the remove_hf_internal_checkpoints branch January 20, 2023 18:20
@sgugger sgugger mentioned this pull request Jan 20, 2023
@ydshieh ydshieh mentioned this pull request Jan 23, 2023