Skip to content

Conversation

@younesbelkada
Copy link
Contributor

@younesbelkada younesbelkada commented Dec 1, 2022

What does this PR do?

PoC for adding .to support on ImageProcessors

related #20453

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Dec 1, 2022

The documentation is not available anymore as the PR was closed or merged.

@younesbelkada
Copy link
Contributor Author

younesbelkada commented Dec 1, 2022

Here is a quick v1, but I am afraid it's a bit too much in the sense that I am literally testing every possible combination
Also regarding tests, we can remove them or put them as slow. I checked with deit, vit & vilt (for multimodal setup) and the tests are green (the failing test for LayoutLM can be easily fixed)
May I ask you to have a quick look @sgugger @ydshieh ? Thanks!

@sgugger
Copy link
Collaborator

sgugger commented Dec 1, 2022

You're looking at something too complicated: to() does all that work for you already. You can pass it a string, a device or a dtype.

@younesbelkada
Copy link
Contributor Author

younesbelkada commented Dec 1, 2022

Yes I was thinking of something very complicated where someone could set .to(device, dtype) let's maybe keep it even simpler and force the user to put only a single argument in .to ?

EDIT: it seems that there is a workaround for that

@younesbelkada younesbelkada changed the title [draft] .to function for ImageProcessors [Vision] .to function for ImageProcessors Dec 2, 2022
@younesbelkada younesbelkada marked this pull request as ready for review December 2, 2022 10:49

if isinstance(x, str):
if hasattr(torch, x):
return True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be more careful -> i.e. to get that attribute, than test if it is indeed a torch.dtype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean

if hasattr(torch, x):
    x = getattr(torch, x)
else:
    return False

?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, x = getattr(torch, x) and then test if it is a torch dtype, just as what you did a few line below.

Be careful, I always complicate things. You can hold on until sylvain's review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adapted your suggestion in da35051 ;)

Copy link
Contributor

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this PR! Looking good :)

Just a few small comments and +1 on @ydshieh's suggestions on _is_torch_dtype.

I'm still unsure about the behaviour with to when some of the inputs are ints, as it doesn't match PyTorch (I can cast ints to floats), however with logging it's probably OK. I wonder if logger.warning might be too aggressive a log level as it's not something the user can really change.

class DeiTFeatureExtractionTest(FeatureExtractionSavingTestMixin, unittest.TestCase):

feature_extraction_class = DeiTFeatureExtractor if is_vision_available() else None
test_cast_dtype = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it needed here for DeiT and not other models?

Copy link
Contributor Author

@younesbelkada younesbelkada Dec 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just waited for everyone's approval on the concept and planned to add it on more models! Probably on the main models only as it may slow down the CI test suite

younesbelkada and others added 3 commits December 2, 2022 23:46
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@younesbelkada
Copy link
Contributor Author

Thanks everyone for the feedback! Let me know if you think it's relevant to add the test_cast_dtype for all ImageProcessors as it may slow down our CI testing suite

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the int tensors, I have comments on the actual implementation however.

@younesbelkada younesbelkada requested a review from sgugger December 5, 2022 15:40
import torch # noqa

new_data = {}
device = self._parse_args_to_device(*args, **kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not fan of the name :-p

Also, until it's used in another place, I would avoid doing a separate method for this. Let's just put the code there!

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bearing with me :-)

@younesbelkada
Copy link
Contributor Author

Ahaha no worries! thanks for all the iterations 💪

@younesbelkada younesbelkada merged commit ef0f85c into huggingface:main Dec 5, 2022
mpierrau pushed a commit to mpierrau/transformers that referenced this pull request Dec 15, 2022
* add v1 with tests

* add checker

* simplified version

* update docstring

* better version

* fix docstring + change order

* make style

* tests + change conditions

* final tests

* modify docstring

* Update src/transformers/feature_extraction_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* replace by `ValueError`

* fix logic

* apply suggestions

* `dtype` is not needed

* adapt suggestions

* remove `_parse_args_to_device`

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants