Add common processor tests #27720

Open
NielsRogge wants to merge 5 commits into huggingface:main from NielsRogge:add_common_processor_tests

Conversation

NielsRogge (Contributor) commented Nov 27, 2023

What does this PR do?

Multimodal processors currently don't have common tests. This PR aims to work towards having a common API for our multimodal processors, making sure they all have the same inputs and outputs (e.g. making sure text+vision processors accept text as first kwarg, then images, etc.).

As a first step, I refactor the CLIP and BLIP-2 processor tests to leverage the common ones.
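One way such a shared-API check could look is sketched below. It only inspects the order of the `__call__` parameters; `DummyTextImageProcessor` and `leading_call_params` are hypothetical names made up for illustration, not part of this PR or of transformers.

```python
import inspect


class DummyTextImageProcessor:
    """Hypothetical stand-in for a text+vision processor (not a real transformers class)."""

    def __call__(self, text=None, images=None, return_tensors=None, **kwargs):
        return {"text": text, "images": images}


def leading_call_params(processor_cls, n=2):
    """Return the first n parameter names of __call__, skipping `self`."""
    params = list(inspect.signature(processor_cls.__call__).parameters)
    return params[1 : 1 + n]


# A common test could assert that text comes first and images second.
assert leading_call_params(DummyTextImageProcessor) == ["text", "images"]
```

A real common test would presumably run this against every text+vision processor class rather than a dummy.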

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

def tearDown(self):
    shutil.rmtree(self.tmpdirname)

def prepare_image_inputs(self):
Contributor Author

@ydshieh would be great to get your first feedback on this PR (and potentially your help to extend this to all processors).

I think we may need to create separate ImageTextProcessorTesterMixin and AudioTextProcessorTesterMixin classes, given that the former uses an image processor + tokenizer, whereas the latter uses a feature extractor + tokenizer.

Collaborator

I am not in favor of creating 2 classes. Define prepare_inputs(input_type), where input_type could be text, image or audio, and the method just returns what is requested.
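A minimal sketch of what a single `prepare_inputs(input_type)` helper might look like is given below. The `batch_size` and `seed` parameters and the nested-list shapes are assumptions chosen to keep the example dependency-free; a real test would more likely return PIL images or NumPy arrays.

```python
import random


def prepare_inputs(input_type, batch_size=2, seed=0):
    """Return dummy inputs for the requested modality (hypothetical helper)."""
    rng = random.Random(seed)
    if input_type == "text":
        return [f"dummy sentence {i}" for i in range(batch_size)]
    if input_type == "image":
        # Each image is a nested list shaped (3, 4, 4) with uint8-like values.
        return [
            [[[rng.randint(0, 255) for _ in range(4)] for _ in range(4)] for _ in range(3)]
            for _ in range(batch_size)
        ]
    if input_type == "audio":
        # Each clip is a short list of float samples in [-1, 1].
        return [[rng.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(batch_size)]
    raise ValueError(f"Unsupported input_type: {input_type!r}")
```

With this shape, a test for an audio+text processor would call `prepare_inputs("audio")` and `prepare_inputs("text")`, so no modality-specific mixin is needed.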

ydshieh (Collaborator) left a comment

Leaving some comments with a few redesign suggestions.

Comment on lines +28 to +30
tokenizer_class = None
fast_tokenizer_class = None
image_processor_class = None
Collaborator

I think we could avoid using these attributes. With a given processor_class (one line below), we can already access image_processor_class, tokenizer_class or feature_extractor_class.
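The lookup suggested here can be sketched as follows. It assumes the processor class exposes an `attributes` list plus one `<name>_class` attribute per component (a string or a tuple of strings); `DummyProcessor` is a hypothetical stand-in that mimics that assumed layout, so check the real processor classes before relying on it.

```python
class DummyProcessor:
    """Hypothetical stand-in mimicking the assumed class-level layout."""

    attributes = ["image_processor", "tokenizer"]
    image_processor_class = "DummyImageProcessor"
    tokenizer_class = ("DummyTokenizer", "DummyTokenizerFast")


def component_class_names(processor_cls):
    """Map each component name to a tuple of candidate class names."""
    names = {}
    for attribute in processor_cls.attributes:
        value = getattr(processor_cls, f"{attribute}_class")
        # Normalize a single class name to a 1-tuple for uniform handling.
        names[attribute] = value if isinstance(value, tuple) else (value,)
    return names


assert component_class_names(DummyProcessor)["tokenizer"] == (
    "DummyTokenizer",
    "DummyTokenizerFast",
)
```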

Comment on lines +33 to +34
def get_tokenizer(self, **kwargs):
    return self.tokenizer_class.from_pretrained(self.tmpdirname, **kwargs)
Collaborator

We have to consider the case where a processor may use only the fast (or only the slow) tokenizer.
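One possible way to handle that case is to collect only the tokenizer variants that actually exist and let tests loop over them. The helper name and its `slow_cls` / `fast_cls` parameters below are hypothetical, as are the two dummy classes.

```python
def available_tokenizer_classes(slow_cls=None, fast_cls=None):
    """Return the tokenizer variants that are defined, keyed by 'slow' / 'fast'."""
    variants = {}
    if slow_cls is not None:
        variants["slow"] = slow_cls
    if fast_cls is not None:
        variants["fast"] = fast_cls
    if not variants:
        raise ValueError("At least one tokenizer class is required.")
    return variants


class DummyTokenizer:
    """Hypothetical slow tokenizer stand-in."""


class DummyTokenizerFast:
    """Hypothetical fast tokenizer stand-in."""


# A processor that only ships a fast tokenizer yields a single variant.
assert list(available_tokenizer_classes(fast_cls=DummyTokenizerFast)) == ["fast"]
```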

Comment on lines +36 to +39
def get_fast_tokenizer(self, **kwargs):
    return self.fast_tokenizer_class.from_pretrained(self.tmpdirname, **kwargs)

def get_image_processor(self, **kwargs):
Collaborator

It might be good to just have a get_components method in which we prepare all components of a processor.

This way, we don't have to define something like get_image_processor, which does not necessarily apply to all the processor classes in the library.
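A rough sketch of the `get_components` idea is shown below, under the assumption that the processor class lists its component names in an `attributes` attribute. Every class here, and the `component_factories` mapping, is a hypothetical stand-in; a real mixin would load components from the temporary directory instead.

```python
class DummyTokenizer:
    """Hypothetical tokenizer stand-in."""


class DummyImageProcessor:
    """Hypothetical image processor stand-in."""


class DummyProcessor:
    """Hypothetical processor that names its components in `attributes`."""

    attributes = ["image_processor", "tokenizer"]

    def __init__(self, image_processor, tokenizer):
        self.image_processor = image_processor
        self.tokenizer = tokenizer


class ProcessorTesterSketch:
    processor_class = DummyProcessor
    # Assumption: each test class supplies one factory per component name.
    component_factories = {
        "image_processor": DummyImageProcessor,
        "tokenizer": DummyTokenizer,
    }

    def get_components(self):
        """Build every component the processor declares, whatever its type."""
        return {
            name: self.component_factories[name]()
            for name in self.processor_class.attributes
        }

    def get_processor(self):
        """Instantiate the processor from its components by keyword."""
        return self.processor_class(**self.get_components())


tester = ProcessorTesterSketch()
processor = tester.get_processor()
```

Because the components are passed by keyword, the same two methods would work for a feature extractor + tokenizer processor once its factories are registered.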

Comment on lines +57 to +59
tokenizer_slow = self.get_tokenizer()
tokenizer_fast = self.get_fast_tokenizer()
image_processor = self.get_image_processor()
Collaborator

Could use get_components (suggested in the above comment) to get all the components to instantiate a processor.

The test could be rewritten to handle all the possible component types rather than just tokenizer+image as below.


processor = self.processor_class(tokenizer=tokenizer, image_processor=image_processor)

predicted_ids = [[1, 4, 5, 8, 1, 0, 8], [3, 4, 3, 1, 1, 8, 9]]
Collaborator

This expected value will only work for one particular class (or a few classes).
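One class-agnostic alternative, sketched under assumptions, is to draw the ids from the tokenizer's own vocabulary size and compare the processor's output with the tokenizer's, rather than hard-coding ids. `DummyTokenizer`, `DummyProcessor`, and `make_predicted_ids` are hypothetical and only illustrate the idea.

```python
import random


class DummyTokenizer:
    """Hypothetical tokenizer with a tiny vocabulary."""

    vocab = ["<pad>", "a", "b", "c", "d"]
    vocab_size = len(vocab)

    def batch_decode(self, batch_ids):
        return [" ".join(self.vocab[i] for i in ids) for ids in batch_ids]


class DummyProcessor:
    """Hypothetical processor that forwards batch_decode to its tokenizer."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def batch_decode(self, *args, **kwargs):
        return self.tokenizer.batch_decode(*args, **kwargs)


def make_predicted_ids(vocab_size, batch_size=2, seq_len=7, seed=0):
    """Generate ids that are always valid for the given vocabulary size."""
    rng = random.Random(seed)
    return [[rng.randrange(vocab_size) for _ in range(seq_len)] for _ in range(batch_size)]


tokenizer = DummyTokenizer()
processor = DummyProcessor(tokenizer)
predicted_ids = make_predicted_ids(tokenizer.vocab_size)

# The check compares the two decoders instead of relying on fixed ids.
assert processor.batch_decode(predicted_ids) == tokenizer.batch_decode(predicted_ids)
```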

@ydshieh ydshieh mentioned this pull request Dec 14, 2023
github-actions bot (Contributor) commented Jan 5, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

ydshieh (Collaborator) commented Jan 8, 2024

@NielsRogge If you are OK with it, let's move the tests to another PR, #27761, and close this one?

NielsRogge (Contributor Author) commented

Isn't the other PR for saving processors? Or will it also add common tests?

ydshieh (Collaborator) commented Jan 8, 2024

Yes, that is a PR about saving processors. But we want tests to make sure it works the way we intend (to avoid having to change files on the Hub later, which is almost always impossible).

> add common tests?

Yes, probably in a different version from the one in this PR.

@huggingface huggingface deleted a comment from github-actions bot Feb 2, 2024
@github-actions github-actions bot closed this Mar 6, 2024
@huggingface huggingface deleted a comment from github-actions bot Mar 14, 2024
@ydshieh ydshieh reopened this Mar 14, 2024
@amyeroberts amyeroberts added the WIP Label your PR/Issue with WIP for some long outstanding Issues/PRs that are work in progress label Apr 8, 2024
@huggingface huggingface deleted a comment from github-actions bot Apr 8, 2024

4 participants