Skip to content

clean up vision/text config dict arguments#19954

Merged
ydshieh merged 4 commits into
mainfrom
cleanup_vision_text_config_dict
Nov 2, 2022
Merged

clean up vision/text config dict arguments#19954
ydshieh merged 4 commits into
mainfrom
cleanup_vision_text_config_dict

Conversation

@ydshieh

@ydshieh ydshieh commented Oct 28, 2022

Copy link
Copy Markdown
Collaborator

What does this PR do?

Remove vision_config_dict and text_config_dict: just use vision_config and text_config.

  • Make code base cleaner
  • Avoid surprising behavior (see the comment)

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 28, 2022

Copy link
Copy Markdown

The documentation is not available anymore as the PR was closed or merged.

@ydshieh

ydshieh commented Oct 28, 2022

Copy link
Copy Markdown
Collaborator Author

Without this PR, we have somehow surprising/confusing results

from transformers import CLIPConfig, CLIPModel

config = CLIPConfig.from_pretrained("openai/clip-vit-base-patch16")
print(config.vision_config.patch_size)
print(config.vision_config_dict["patch_size"])

config.vision_config.patch_size = 32
config.save_pretrained("v2")

config_v2 = CLIPConfig.from_pretrained("v2")
# This is not `32` which is unexpected!
# In fact, it is `vision_config_dict` is being used during loading to set `vision_config`
print(config_v2.vision_config.patch_size)
# This is 32 - unexpected!
print(config_v2.vision_config_dict["patch_size"])

config.vision_config_dict["patch_size"] = 32
config.save_pretrained("v3")

config_v3 = CLIPConfig.from_pretrained("v3")
# This is 32 - unexpected!
print(config_v3.vision_config.patch_size)
# This is 32 - OK
print(config_v3.vision_config_dict["patch_size"])

super().__init__(text_config_dict=text_config_dict, vision_config_dict=vision_config_dict, **kwargs)
super().__init__(**kwargs)

# If `_config_dict` exist, we use them for the backward compatibility.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backward compatibility

@ydshieh

ydshieh commented Oct 28, 2022

Copy link
Copy Markdown
Collaborator Author

@sgugger If you are happy with the current change, I will apply the changes to some other models, and the testing files.
So far it is good even if I don't change to_dict. It has already

output["text_config"] = self.text_config.to_dict()
output["vision_config"] = self.vision_config.to_dict()

@ydshieh ydshieh requested a review from sgugger October 28, 2022 16:29
**kwargs
):
super().__init__(text_config=text_config, vision_config=vision_config, **kwargs)
super().__init__(**kwargs)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to pass text/vision config to super, as we will set self.text_config and self.vision_config below

@sgugger sgugger left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but pinging @patrickvonplaten and @patil-suraj here too as it may have implications in Diffusers.

@NielsRogge

Copy link
Copy Markdown
Collaborator

Awesome that you are working on fixing this!

Encountered the same issue with a new model I'm working on called CLIPSeg.

Also, could we update GroupViT as well? This is also a CLIP-like model.

@ydshieh ydshieh force-pushed the cleanup_vision_text_config_dict branch from 3b33586 to a983507 Compare November 2, 2022 10:45
@ydshieh ydshieh merged commit 8827e1b into main Nov 2, 2022
@ydshieh ydshieh deleted the cleanup_vision_text_config_dict branch November 2, 2022 11:03
mpierrau pushed a commit to mpierrau/transformers that referenced this pull request Dec 15, 2022
* clean up

* For backward compatibility

* clean up

* Same changes for more models

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants