Add TF port of BLIP by Rocketknight1 · Pull Request #22090 · huggingface/transformers

Rocketknight1 · 2023-03-10T17:10:42Z

Work in progress right now, will update this when it's closer to being ready!

HuggingFaceDocBuilderDev · 2023-03-10T17:25:03Z

The documentation is not available anymore as the PR was closed or merged.

Rocketknight1 · 2023-03-24T14:16:43Z

The TF port is mostly complete now and tests are passing locally - I just need to go around updating docs and auto classes and so on. The main code should be ready for review!

amyeroberts

Nice 🔥 Thanks for adding this model!

Mostly nits. The main question is whether the TODO comments have been resolved.

gante

Oof, this is a long one -- good job on getting it to the finish line! It should be close to a mergeable state.

A few general comments:

Missing: new modeling files in doctests;
The PR has some minor issues that came from the PT implementation. It would be nice to correct them as well!
The training argument is missing 😱 It needs to be added throughout the code.
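
To illustrate the point about the training argument, here is a minimal pure-Python sketch of why the flag has to be threaded through every layer. `ToyDropout` and `ToyBlock` are hypothetical stand-ins written for this example; they are not the Keras API and not the BLIP code from this PR.

```python
import random


class ToyDropout:
    """Hypothetical stand-in for a dropout layer (not the Keras API)."""

    def __init__(self, rate, seed=0):
        self.rate = rate
        self.rng = random.Random(seed)

    def __call__(self, values, training=False):
        if not training:
            # Inference: dropout is a no-op.
            return list(values)
        keep = 1.0 - self.rate
        # Training: zero each value with probability `rate`, rescale the rest.
        return [v / keep if self.rng.random() < keep else 0.0 for v in values]


class ToyBlock:
    """Hypothetical parent layer that must forward `training` to its children."""

    def __init__(self):
        self.dropout = ToyDropout(rate=0.5)

    def __call__(self, values, training=False):
        scaled = [2.0 * v for v in values]
        # Forgetting to pass `training` here would silently disable dropout.
        return self.dropout(scaled, training=training)


block = ToyBlock()
inputs = [1.0, 2.0, 3.0, 4.0]
inference_out = block(inputs, training=False)
training_out = block(inputs, training=True)
```

If any layer in the chain drops the argument, its children fall back to the default and behave as if the model were always in inference mode, which is presumably why the request is to add it throughout the code.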

gante · 2023-03-25T15:52:11Z

        ("bert", "TFBertModel"),
        ("blenderbot", "TFBlenderbotModel"),
        ("blenderbot-small", "TFBlenderbotSmallModel"),
+        ("blip", "TFBlipModel"),


I think we are missing a few auto classes -- also missing on the PT side!

sgugger · 2023-03-27T13:38:39Z

Looks like there are many comments to address for now. Please ping me again when it's ready for second review!

Rocketknight1 · 2023-03-27T16:40:13Z

Got through a lot of the comments today, but I have a couple of other things to do - will try to finish them tomorrow!

Rocketknight1 · 2023-03-28T18:36:22Z

The last remaining big issue is that some of the pt-tf equivalence tests fail when weights don't match up between models. This is caused by the cross-attention weights not being built, presumably because those layers aren't being called in the forward pass. I'm working on figuring out why and resolving that!

Rocketknight1 · 2023-03-28T18:49:30Z

The issue seems to be that in all of our other models, cross-attention layers are only added when config.add_cross_attention is True, but in the case of BLIP it only checks config.is_decoder. As a result, the PyTorch models often initialize cross-attention layers that aren't used, which causes weight mismatch issues for us in crossloading tests, because TF only creates weights on first use.
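
A toy sketch of the mismatch described above, in plain Python. `EagerModel` and `LazyModel` are hypothetical classes invented for illustration: the first creates every weight at construction time (as described for the PyTorch side), the second creates a weight only when the corresponding layer is first called (as described for TF). Neither is the actual transformers code.

```python
class EagerModel:
    """Mimics the PyTorch side: sub-layer weights exist right after __init__."""

    def __init__(self, is_decoder):
        self.weights = {"self_attention": [0.0]}
        if is_decoder:
            # Created whether or not the forward pass ever uses it.
            self.weights["cross_attention"] = [0.0]


class LazyModel:
    """Mimics the TF side: a layer's weights are created on first use."""

    def __init__(self, is_decoder):
        self.is_decoder = is_decoder
        self.weights = {}

    def __call__(self, hidden, encoder_hidden=None):
        self.weights.setdefault("self_attention", [0.0])
        if self.is_decoder and encoder_hidden is not None:
            self.weights.setdefault("cross_attention", [0.0])
        return hidden


def missing_keys(source, target):
    """Weight names present in `source` but never built in `target`."""
    return sorted(set(source.weights) - set(target.weights))


pt_like = EagerModel(is_decoder=True)
tf_like = LazyModel(is_decoder=True)

tf_like([1.0])  # forward pass without encoder states: cross-attention never runs
unmatched = missing_keys(pt_like, tf_like)

tf_like([1.0], encoder_hidden=[1.0])  # now the cross-attention layer is called
unmatched_after = missing_keys(pt_like, tf_like)
```

In this sketch the unmatched key disappears once the lazily built layer is actually exercised, which is the same shape as the crossloading failure: the weight exists on one side and was never created on the other.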

gante

Looks good!

Two high-level items from the previous review remaining:

Missing: new modeling files in doctests;
The training argument is missing 😱 It needs to be added throughout the code.

Rocketknight1 · 2023-03-29T13:12:19Z

It's coming, don't worry! This cross-attention behaviour is just very odd and I'm trying to track it down first

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Rocketknight1 · 2023-03-30T15:48:15Z

Hi all! I've addressed all comments and local tests look good. The remaining issues are:

Converting checkpoints so the tests don't need from_pt
Maybe adding more auto classes

I'm not sure about the auto classes, though - they're missing in the original PT version of the model as well, so this didn't seem like the right PR to add them.

Rocketknight1 · 2023-03-30T15:48:46Z

cc @sgugger - I think this is ready for a final review at last!

gante

It has my blessing 🪄

sgugger

Thanks for your PR! It sadly cannot be merged until the pt/tf equivalence tests are all passing. There is no model tester that skips them in the code base, so let's not make BLIP the first one.

If BLIP being an encoder/decoder makes changes to the base tests in the model tester classes necessary, the test can be overridden.
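
As a rough illustration of overriding rather than skipping, the sketch below uses only the standard `unittest` module. `BaseModelTestMixin`, `EncoderDecoderTest`, and their methods are hypothetical names made up for this example; the real transformers tester classes are structured differently.

```python
import unittest


class BaseModelTestMixin:
    """Hypothetical stand-in for a shared model tester mixin."""

    def build_inputs(self):
        return {"input_ids": [1, 2, 3]}

    def run_both_frameworks(self, inputs):
        # Placeholder for running the PT and TF models; both return the same sum.
        total = sum(sum(v) for v in inputs.values())
        return total, total

    def test_equivalence(self):
        pt_out, tf_out = self.run_both_frameworks(self.build_inputs())
        self.assertEqual(pt_out, tf_out)


class EncoderDecoderTest(BaseModelTestMixin, unittest.TestCase):
    def build_inputs(self):
        # Override only what differs: an encoder-decoder also needs decoder inputs.
        inputs = super().build_inputs()
        inputs["decoder_input_ids"] = [4, 5]
        return inputs


suite = unittest.defaultTestLoader.loadTestsFromTestCase(EncoderDecoderTest)
result = unittest.TestResult()
suite.run(result)
```

The inherited `test_equivalence` still runs for the subclass, so the check stays in the suite instead of being reported as skipped.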

sgugger · 2023-03-30T17:27:06Z

        """
        return cls(config, **kwargs)

+    def invert_attention_mask(self, encoder_attention_mask: tf.Tensor) -> tf.Tensor:


This does not use the state, so it would be better as a function in tf_utils (same for the other two below).

We should probably clean up the PyTorch side to do the same.

Done! I didn't touch the PyTorch side yet because that's a bigger refactor that touches several models, but I can do it in another PR after this if you want.
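
A minimal sketch of the refactor being discussed, written in plain Python on nested lists rather than TensorFlow tensors. The fill value `NEG_INF` and the `ToyTextModel` class are assumptions for illustration; the helper that actually landed in tf_utils operates on `tf.Tensor` and may differ in shape handling and fill value.

```python
NEG_INF = -1e9  # assumed "large negative" fill value, chosen for this sketch


def invert_attention_mask(mask):
    """Turn a 0/1 keep-mask of shape (batch, seq) into an additive mask.

    Kept positions (1) become 0.0 and masked positions (0) become a large
    negative number, so adding the result to attention scores suppresses
    the masked positions after softmax.
    """
    return [[0.0 if m else NEG_INF for m in row] for row in mask]


class ToyTextModel:
    """Hypothetical model: calls the shared helper instead of owning a method."""

    def prepare_mask(self, encoder_attention_mask):
        # No `self` state is needed, which is why the helper can live at module level.
        return invert_attention_mask(encoder_attention_mask)


additive = ToyTextModel().prepare_mask([[1, 1, 0]])
```

Because the function reads nothing from the instance, any model can import and reuse it without inheriting from a common base class.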

sgugger · 2023-03-30T17:36:14Z

+    @unittest.skip(reason="This test class covers encoder-decoder models that the base test does not work with.")
+    def test_pt_tf_model_equivalence(self):
+        pass


Needs to be rewritten then. We cannot skip this test.

Re-enabled!

sgugger · 2023-03-30T17:36:19Z

            self.assertIsNotNone(model)

+    @unittest.skip(reason="This test class covers encoder-decoder models that the base test does not work with.")
+    def test_pt_tf_model_equivalence(self):


Re-enabled!

Same here: it is not.

sgugger · 2023-03-30T17:36:25Z

            self.assertIsNotNone(model)
+
+    @unittest.skip(reason="This test class covers encoder-decoder models that the base test does not work with.")
+    def test_pt_tf_model_equivalence(self):


Re-enabled!

This one isn't either.

Rocketknight1 · 2023-03-31T14:05:58Z

Got it, I'll figure out some way to re-enable those tests, or override them with versions that do work!

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Rocketknight1 · 2023-04-03T19:22:50Z

@sgugger this should be ready for review with all comments addressed! The failing test is in an unrelated model

sgugger

Still some equivalence tests missing.

sgugger · 2023-04-03T19:31:42Z

        self._override_model_class = override_model_class

-    def get_inputs(self, pt_model, config):
+    def get_inputs(self, pt_model, tf_dummy_inputs, config):


The changes here seem unrelated to this PR and would be better in their own PR, no?

Fair! I added them because they were needed for the pt-to-tf code to port the BLIP models correctly. If you'd rather I move them to a separate PR though, that's fine!

sgugger · 2023-04-03T19:39:40Z

+    @unittest.skip(reason="This test class covers encoder-decoder models that the base test does not work with.")
+    def test_pt_tf_model_equivalence(self):
+        pass


sgugger · 2023-04-03T19:40:14Z

            self.assertIsNotNone(model)

+    @unittest.skip(reason="This test class covers encoder-decoder models that the base test does not work with.")
+    def test_pt_tf_model_equivalence(self):


Same here: it is not.

sgugger · 2023-04-03T19:40:22Z

            self.assertIsNotNone(model)
+
+    @unittest.skip(reason="This test class covers encoder-decoder models that the base test does not work with.")
+    def test_pt_tf_model_equivalence(self):


This one isn't either.

Rocketknight1 · 2023-04-03T22:07:02Z

@sgugger Sorry for the confusion - that equivalence test is present in both the test_modeling_tf_blip and test_modeling_blip files. Do we want to keep it in both?

sgugger · 2023-04-04T01:05:18Z

Yes we do.

sgugger

Failing tests are unrelated.

gante

(pt-to-tf changes LGTM 👍 )

Rocketknight1 · 2023-04-04T14:39:46Z

Going to leave the pt-to-tf changes in this PR rather than making a separate one, since they're needed for proper BLIP conversion!

* Initial commit
* more stash commit
* Yet another stash commit
* yet more stash commit
* Mostly working except for docs / repo consistency
* Stop importing model list from torch file
* Add TF BLIP models to docs
* Add auto classes
* Move get_text_features and get_image_features
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/blip/test_modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Use channels_last convolutions in TF (better performance + compatibility)
* Remove _shape function
* Move multi-line statement to one line in PT + TF
* Specify tf.keras.layers instead of importing from it
* Remove test_gradient_checkpointing and empty test_training methods
* move some multi-line statements to one line
* Update docstring for generate
* Remove pruned heads set
* Remove self.seq_len_dim
* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states
* ensure original model follows config in more cases
* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!
* Add training args throughout the models and layers
* make fixup
* Fix docstring for inputs_embeds
* Add docstring for is_decoder
* Add docstrings to text models
* Remove redundant computation
* Add unpack_inputs / keras_serializable
* Add modeling_tf_blip to doctests
* Add config classes for keras serialization
* Changes to allow model porting with pt-to-tf
* Quick fix to decoder head and test tweaks
* Revert an issue with masking the embeddings outputs
* Allow missing keys in some equivalence tests (for unused layers)
* Add tf-pt equivalence tests back in
* Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make fixup
* Refactor invert_attention_mask out into tf_utils
* Re-enable cross-tests on the PT side too

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Rocketknight1 marked this pull request as ready for review March 24, 2023 14:15

Rocketknight1 requested review from amyeroberts, gante, sgugger and younesbelkada March 24, 2023 14:15

Rocketknight1 force-pushed the add_tf_blip branch 2 times, most recently from fb88fd4 to 120d189 March 24, 2023 15:26

amyeroberts reviewed Mar 24, 2023

gante reviewed Mar 25, 2023

gante reviewed Mar 29, 2023

Rocketknight1 and others added 13 commits March 30, 2023 13:48

Initial commit

98b2afd

more stash commit

e557c34

Yet another stash commit

87767b0

yet more stash commit

d86ec34

Mostly working except for docs / repo consistency

35deb28

Stop importing model list from torch file

0a720e4

Add TF BLIP models to docs

490fc63

Add auto classes

6dc06bb

Move get_text_features and get_image_features

9fd4b76

Update src/transformers/models/blip/modeling_tf_blip.py

07f99eb

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Update src/transformers/models/blip/modeling_tf_blip.py

8cfc37d

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Update src/transformers/models/blip/modeling_tf_blip.py

1c47a2f

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Update src/transformers/models/blip/modeling_tf_blip_text.py

70cfe55

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Rocketknight1 added 3 commits March 30, 2023 16:23

Add unpack_inputs / keras_serializable

f3062b1

Add modeling_tf_blip to doctests

77e365e

Add config classes for keras serialization

6fff45c

gante approved these changes Mar 30, 2023

sgugger reviewed Mar 30, 2023

Changes to allow model porting with pt-to-tf

34463ea

Rocketknight1 and others added 9 commits April 3, 2023 16:59

Quick fix to decoder head and test tweaks

60b7fb7

Revert an issue with masking the embeddings outputs

2a7f52d

Allow missing keys in some equivalence tests (for unused layers)

d962ac6

Add tf-pt equivalence tests back in

0a43f85

Update src/transformers/models/blip/modeling_tf_blip.py

09095d1

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/blip/modeling_tf_blip_text.py

dd88c83

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/blip/modeling_tf_blip_text.py

d0fd3d4

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

make fixup

9efd53c

Refactor invert_attention_mask out into tf_utils

afd5a9c

sgugger reviewed Apr 3, 2023

Re-enable cross-tests on the PT side too

41fe5e1

sgugger approved these changes Apr 4, 2023

gante approved these changes Apr 4, 2023

Rocketknight1 merged commit 5f3ea66 into main Apr 4, 2023

Rocketknight1 deleted the add_tf_blip branch April 4, 2023 15:05
