Conversation

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
```python
def tearDown(self):
    shutil.rmtree(self.tmpdirname)

def prepare_image_inputs(self):
```
@ydshieh would be great to get your first feedback on this PR (and potentially your help to extend this to all processors).
I think we may need to create separate ImageTextProcessorTesterMixin and AudioTextProcessorTesterMixin classes, given that the former uses an image processor + tokenizer, whereas the latter uses a feature extractor + tokenizer.
I am not in favor of creating 2 classes. Define prepare_inputs(input_type), where input_type could be text, image or audio, and have the method return what is requested.
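A minimal sketch of this suggestion (method and modality names are illustrative, not the library's API): a single entry point that returns dummy inputs for whichever modality a given processor test needs.

```python
import random

class ProcessorTesterMixin:
    def prepare_inputs(self, input_type):
        # One method covering all modalities, instead of a separate
        # prepare_image_inputs / prepare_audio_inputs per tester class.
        if input_type == "text":
            return ["lower newer", "upper older"]
        if input_type == "image":
            # Two tiny "images" as nested lists (height x width x RGB values).
            return [
                [[[random.randint(0, 255) for _ in range(3)] for _ in range(4)]
                 for _ in range(3)]
                for _ in range(2)
            ]
        if input_type == "audio":
            # Two short mono waveforms as float lists.
            return [[random.random() for _ in range(160)] for _ in range(2)]
        raise ValueError(f"Unknown input_type: {input_type!r}")
```

A text+vision tester would then call `self.prepare_inputs("text")` and `self.prepare_inputs("image")`, while an audio+text tester calls the same method with `"audio"`.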
ydshieh left a comment
Leaving some comments regarding a few re-design suggestions.
```python
tokenizer_class = None
fast_tokenizer_class = None
image_processor_class = None
```
I think we could avoid using these attributes. With a given processor_class (one line below), we can already access image_processor_class, tokenizer_class or feature_extractor_class.
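For illustration, a self-contained sketch of this idea using a stand-in processor class (real processors in the library declare similar class-level attributes; the names here are stubs, not the actual classes):

```python
class FakeProcessor:
    # Stand-in mirroring how a library processor declares its components.
    attributes = ["image_processor", "tokenizer"]
    image_processor_class = "FakeImageProcessor"
    tokenizer_class = "FakeTokenizer"

class ProcessorTesterMixin:
    processor_class = FakeProcessor

    def component_class_names(self):
        # Read the component class names off processor_class itself, so the
        # tester needs no tokenizer_class / image_processor_class of its own.
        return {
            attr: getattr(self.processor_class, f"{attr}_class")
            for attr in self.processor_class.attributes
        }

tester = ProcessorTesterMixin()
print(tester.component_class_names())
# {'image_processor': 'FakeImageProcessor', 'tokenizer': 'FakeTokenizer'}
```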
```python
def get_tokenizer(self, **kwargs):
    return self.tokenizer_class.from_pretrained(self.tmpdirname, **kwargs)
```
We have to consider the case where a processor may use only the fast (or only the slow) tokenizer.
```python
def get_fast_tokenizer(self, **kwargs):
    return self.fast_tokenizer_class.from_pretrained(self.tmpdirname, **kwargs)

def get_image_processor(self, **kwargs):
```
It might be good to just have get_components, inside which we prepare all components of a processor.
This way, we don't have to define something like get_image_processor, which is not necessarily applicable to all the processor classes in the library.
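A possible sketch of get_components, using stub component classes (illustrative stand-ins; real testers would use the actual library classes and their from_pretrained):

```python
import shutil
import tempfile

class FakeTokenizer:
    @classmethod
    def from_pretrained(cls, path, **kwargs):
        return cls()

class FakeImageProcessor:
    @classmethod
    def from_pretrained(cls, path, **kwargs):
        return cls()

class FakeProcessor:
    # Mirrors how a library processor declares its components.
    attributes = ["tokenizer", "image_processor"]
    tokenizer_class = FakeTokenizer
    image_processor_class = FakeImageProcessor

class ProcessorTesterMixin:
    processor_class = FakeProcessor

    def setUp(self):
        self.tmpdirname = tempfile.mkdtemp()

    def tearDown(self):
        shutil.rmtree(self.tmpdirname)

    def get_components(self, **kwargs):
        # One getter for all components, keyed by attribute name, instead of
        # a dedicated get_tokenizer / get_image_processor per modality.
        return {
            attr: getattr(self.processor_class, f"{attr}_class").from_pretrained(
                self.tmpdirname, **kwargs
            )
            for attr in self.processor_class.attributes
        }
```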
```python
tokenizer_slow = self.get_tokenizer()
tokenizer_fast = self.get_fast_tokenizer()
image_processor = self.get_image_processor()
```
We could use get_components (suggested in the comment above) to get all the components needed to instantiate a processor.
The test could then be rewritten to handle all possible component types rather than just tokenizer + image processor as below.
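A self-contained sketch of that rewrite, with stub classes and a hypothetical get_components helper (names are illustrative): the processor is built from whatever components the tester prepares, so the same test works for any component combination.

```python
class FakeTokenizer:
    pass

class FakeImageProcessor:
    pass

class FakeProcessor:
    attributes = ["tokenizer", "image_processor"]

    def __init__(self, tokenizer=None, image_processor=None):
        self.tokenizer = tokenizer
        self.image_processor = image_processor

class Tester:
    processor_class = FakeProcessor

    def get_components(self):
        # Hypothetical helper: one instance per component the processor declares.
        return {"tokenizer": FakeTokenizer(), "image_processor": FakeImageProcessor()}

tester = Tester()
# Unpack the component dict instead of hard-coding tokenizer + image processor.
processor = tester.processor_class(**tester.get_components())
```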
```python
processor = self.processor_class(tokenizer=tokenizer, image_processor=image_processor)

predicted_ids = [[1, 4, 5, 8, 1, 0, 8], [3, 4, 3, 1, 1, 8, 9]]
```
This expected value will only work for a particular class (or a few classes).
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@NielsRogge If you are OK with that, let's put the tests in another PR #27761 and close this one?
Isn't the other PR for saving processors? Or will it also add common tests?
Yes, that is a PR about saving processors. But we want to have tests to make sure it works as we want it (to avoid later changes of files on the Hub, which is almost always impossible).
Yes, probably in a different version than this PR.
What does this PR do?
Multimodal processors currently don't have common tests. This PR aims to work towards having a common API for our multimodal processors, making sure they all have the same inputs and outputs (e.g. making sure text+vision processors accept `text` as the first kwarg, then `images`, etc.). As a first step, I refactor the CLIP and BLIP-2 processor tests to leverage the common ones.