Fix `ViTForMaskedImageModeling` doc example by ydshieh · Pull Request #22186 · huggingface/transformers

ydshieh · 2023-03-15T16:32:23Z

What does this PR do?

amyeroberts · 2023-03-15T16:36:19Z

@ydshieh @alaradirik @fxmarty The issue coming from #22152 was an oversight on my part regarding breaking changes. Perhaps we should revert that PR first and then agree on how to introduce this change, since it is intended to be added to other vision models?

ydshieh · 2023-03-15T16:45:58Z

Oh, I thought it was a new model head! It is indeed a breaking change. I'm fine with reverting that PR, but it would be nice to talk to Sylvain or Lysandre first (if you feel it's necessary). I'll leave it to you to judge.

As for a solution, if we really want to keep this new attribute and the new name `MaskedImageCompletionOutput`, adding a property named `logits` to `MaskedImageCompletionOutput` might be a way, but I haven't thought about this deeply.
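
A minimal sketch of the property idea, for illustration only. It assumes a plain dataclass as a stand-in for the real output class (`MaskedImageCompletionOutputSketch` is a hypothetical name, not the class from #22152), and it only covers attribute access, not dict-style access such as `outputs["logits"]`:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class MaskedImageCompletionOutputSketch:
    """Hypothetical stand-in for the output class discussed in this thread."""

    loss: Optional[Any] = None
    reconstruction: Optional[Any] = None

    @property
    def logits(self) -> Optional[Any]:
        # Read-only alias: existing code that reads `.logits` keeps working
        # while the stored field is named `reconstruction`.
        return self.reconstruction


# Placeholder values; a real model would return tensors here.
outputs = MaskedImageCompletionOutputSketch(loss=0.5, reconstruction=[[0.1, 0.2]])
print(outputs.logits is outputs.reconstruction)
```

Because `logits` is a property and not an annotated field, it does not appear among the dataclass fields, so the stored data is not duplicated.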

HuggingFaceDocBuilderDev · 2023-03-15T16:46:37Z

The documentation is not available anymore as the PR was closed or merged.

ydshieh · 2023-03-15T16:47:15Z

Converted to draft for now

amyeroberts · 2023-03-15T16:52:05Z

@ydshieh Yep - let's get @LysandreJik's and @sgugger's opinions.

I think having the `logits` param is probably the best solution. As far as I know, it's very rare to check against the model output type itself. I believe `reconstruction` was chosen because of the `ImageSuperResolutionOutput` data class. As mentioned in the original PR, we probably do want a different output type to be returned, as the documented shapes are incorrect.

@alaradirik - could you open a PR to revert the changes?

amyeroberts · 2023-03-15T16:53:01Z

Actually, it's late for @alaradirik. I'll open the PR now.

sgugger · 2023-03-15T17:33:58Z

Yes, we can't rename the parameter in the outputs like that for a model that has been around for a bit. What is even more annoying is that the commit was in the release, so we will need to make a patch with the fix.

ydshieh · 2023-03-15T18:13:01Z

Closing this PR, as it's clear we will, and have to, use the original `logits`.

alaradirik · 2023-03-16T08:09:57Z

Sorry for the late comment. I added `MaskedImageCompletionOutput` to replace the inaccurate `MaskedLMOutput` class used by the masked image modeling heads (ViT and DeiT). Neither of these models has any checkpoints on the hub, as mentioned in #22152. Swin's MIM head has its own output class but no fine-tuned checkpoints for the MIM task either.

That said, ViT's and Swin's MIM heads are implementations of SimMIM, and the SimMIM authors have recently released fine-tuned checkpoints for these two models (as opposed to the base model weights on the hub for the Swin MIM head). I'm planning to convert these checkpoints and add a masked-image-completion pipeline after @sheonhan merges the ICT PR (a contemporary, better-performing MIM model). It would be great to add a dedicated output class and fix the inaccurate output type (listed as a language model in the docs) before that. While `logits` is not an accurate output name in this case, since the model returns full reconstructed images, I could replace `reconstruction` with `logits` and open a new PR.

What do you think @amyeroberts @sgugger?
CC @LysandreJik @ydshieh

ydshieh added 2 commits March 15, 2023 17:15

fix

1ff8d3e

fix

b0de38d

ydshieh changed the title ~~Fix ViTForMaskedImageModeling example in documentation~~ Fix ViTForMaskedImageModeling doc example Mar 15, 2023

ydshieh requested review from alaradirik, amyeroberts and fxmarty March 15, 2023 16:32

ydshieh mentioned this pull request Mar 15, 2023

Fix ViTForMaskedImageModeling example in documentation #22185

Closed

ydshieh marked this pull request as draft March 15, 2023 16:46

ydshieh closed this Mar 15, 2023

ydshieh deleted the fix_doc branch March 17, 2023 12:58

ydshieh wants to merge 2 commits into `main` from `fix_doc`


5 participants
