🚨Default to fast image processors for all models by yonigozlan · Pull Request #41388 · huggingface/transformers

yonigozlan · 2025-10-06T19:11:56Z

What does this PR do?

Following the trial with the Qwen_VL image processors, this PR extends the behavior to all models: fast image processors become the default, even for checkpoints saved with a slow one.

Also made sure that all processors use AutoImageProcessor to instantiate their image_processor_class.
On that point, defining a default subclass in processors feels a bit redundant, as we basically already have that in the auto classes. It would be nice to get rid of this for v5, wdyt @molbap @zucchini-nlp @ArthurZucker ?
I'll open a PR for that too.
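
For readers who want the gist of the new default, here is a minimal sketch of the selection rule described above. It is not the code from this PR: the registry, the `legacy_model` entry, and the function `resolve_image_processor_name` are hypothetical names made up for illustration. Only the `use_fast` flag mirrors a real argument of `AutoImageProcessor.from_pretrained`.

```python
from typing import Optional

# Hypothetical registry: model type -> (slow class name, fast class name or None).
IMAGE_PROCESSOR_NAMES = {
    "clip": ("CLIPImageProcessor", "CLIPImageProcessorFast"),
    "legacy_model": ("LegacyImageProcessor", None),  # made-up entry without a fast variant
}


def resolve_image_processor_name(
    model_type: str, saved_class_name: str, use_fast: Optional[bool] = None
) -> str:
    """Pick the image processor class name to instantiate (illustrative only)."""
    slow_name, fast_name = IMAGE_PROCESSOR_NAMES[model_type]
    if use_fast is None:
        # Behavior described in the PR: default to fast, whatever class
        # name the checkpoint was saved with (saved_class_name is ignored).
        use_fast = True
    if use_fast and fast_name is not None:
        return fast_name
    # Fall back to the slow class when fast is disabled or unavailable.
    return slow_name
```

Under these assumptions, a checkpoint saved with `CLIPImageProcessor` resolves to the fast class unless the caller passes `use_fast=False`.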

HuggingFaceDocBuilderDev · 2025-10-06T19:21:11Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

molbap

Sounds good for v5! Let's see if we can even simplify further in this iteration

molbap · 2025-10-07T07:47:17Z

+        if common_kwargs:
+            for kwarg in output_kwargs.values():
+                kwarg.update(common_kwargs)
+


I'm sure there's a good reason but I'm missing it, why is this moved up?

yep, it is a fix from #41381 :)

aah, which fixes #40931, got it

Yes, my bad, I switched to a new branch without checking out main first 🥴
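
As a rough illustration of what the hunk in this thread does, the sketch below applies the same loop to plain dictionaries. The wrapper function `merge_common_kwargs` and the example keys are assumptions for the sake of a runnable example; only the loop body comes from the diff.

```python
def merge_common_kwargs(output_kwargs: dict, common_kwargs: dict) -> dict:
    """Copy the shared kwargs into every per-modality kwargs dict (in place)."""
    if common_kwargs:
        for kwarg in output_kwargs.values():
            # dict.update overwrites keys already present in the modality dict.
            kwarg.update(common_kwargs)
    return output_kwargs


# Example input with made-up modality keys and values.
output_kwargs = {
    "text_kwargs": {"padding": True},
    "images_kwargs": {"do_resize": True},
}
merged = merge_common_kwargs(output_kwargs, {"return_tensors": "pt"})
```

Note that because `update` runs on each modality dict, a common kwarg overrides a modality-specific value that uses the same key.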

molbap · 2025-10-07T07:49:33Z

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
+        if not self.is_fast:
+            logger.warning_once(
+                f"Using a slow image processor (`{self.__class__.__name__}`). "
+                "As we are transitioning to fast (PyTorch-native) processors, consider using `AutoImageProcessor` or the model-specific fast image processor class "
+                "to instantiate a fast image processor."
+            )


SGTM!

Related: since we're touching on the topic of "loading old models from the hub with new utils", this ties in with the "from_pretrained conversion" @Cyrilvallez is working on. If we have modifications to apply to some old image processors, they should live in from_pretrained as well, to "convert" the processor in the same sense.
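
The warn-once pattern in the hunk above can be sketched in isolation as follows. This is a standalone approximation, not the library's logger: `warning_once` is emulated here with `functools.lru_cache`, and the class names `BaseImageProcessorSketch`, `SlowProcessor`, and `FastProcessor` are hypothetical.

```python
import functools
import logging

logger = logging.getLogger("image_processing_sketch")


@functools.lru_cache(maxsize=None)
def warning_once(message: str) -> None:
    """Emit a given warning message only the first time it is seen."""
    logger.warning(message)


class BaseImageProcessorSketch:
    is_fast = False  # fast subclasses override this flag

    def __init__(self, **kwargs):
        if not self.is_fast:
            warning_once(
                f"Using a slow image processor (`{self.__class__.__name__}`). "
                "Consider using `AutoImageProcessor` to instantiate a fast one."
            )


class SlowProcessor(BaseImageProcessorSketch):
    pass


class FastProcessor(BaseImageProcessorSketch):
    is_fast = True
```

Because the cache key is the formatted message, each slow class name is reported once per process, however many instances are created.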

zucchini-nlp

LGTM. Just wondering about some models where we had no Lanczos resampling. Do we get the closest resampling in those cases, and are the diffs small enough?

zucchini-nlp · 2025-10-07T09:12:32Z

 class VideoLlavaProcessor(ProcessorMixin):
    r"""
-    Constructs a VideoLlava processor which wraps a VideoLlava image processor and a Llava tokenizer into a single processor.
+    Constructs a VideoLlava processor which wraps a AutoImageProcessor and a Llava tokenizer into a single processor.


nit: imo we need not change the name when it is not referenced. Instead we only change the "[VideoLlavaImageProcessor] " one line below

Yes, you're right, it's not very useful to have AutoImageProcessor in the docstring. I'll change these back. I'm also working on getting auto_docstring to work on processors, which should do all that automatically (check which subprocessors are in auto for this model) ;)

I'm also working on getting auto_docstring to work on processors which should do all that automatically

nice, very needed

zucchini-nlp · 2025-10-07T09:13:22Z

+        if common_kwargs:
+            for kwarg in output_kwargs.values():
+                kwarg.update(common_kwargs)
+


yep, it is a fix from #41381 :)

yonigozlan · 2025-10-07T16:40:25Z

LGTM. Just wondering about some models where we had no Lanczos resampling. Do we get the closest resampling in those cases, and are the diffs small enough?

Good point about the Lanczos resampling. I might add an exception for these, as the diffs are not close enough imo.

ydshieh · 2025-12-07T12:47:38Z

I have pushed the updates. The following remaining failures need a fix (they are not about mismatched expected outputs).

(If you want to read the log, you can go here, select the ⚙️ icon, and click View raw logs at the top left.)

This one

RUN_SLOW=1 python3 -m pytest -v tests/models/janus/test_modeling_janus.py::JanusIntegrationTest::test_model_generate_images - ValueError: Only returning PyTorch tensors is currently supported.

plus the following

{
    "clipseg": {
        "single-gpu": [
            {
                "test": "tests/models/clipseg/test_modeling_clipseg.py::CLIPSegModelIntegrationTest::test_inference_image_segmentation",
                "commit": "07a50c395552a28582c2746e06318e8f2e1bf059",
                "status": "git bisect found the bad commit.",
                "pr_number": null,
                "author": "ydshieh",
                "merged_by": null,
                "parent": "377a8ee73f210476c4efb15170d0c32ad3b2c653"
            }
        ]
    },
    "flava": {
        "single-gpu": [
            {
                "test": "tests/models/flava/test_modeling_flava.py::FlavaForPreTrainingIntegrationTest::test_inference",
                "commit": "07a50c395552a28582c2746e06318e8f2e1bf059",
                "status": "git bisect found the bad commit.",
                "pr_number": null,
                "author": "ydshieh",
                "merged_by": null,
                "parent": "377a8ee73f210476c4efb15170d0c32ad3b2c653"
            }
        ]
    },
    "gemma3": {
        "single-gpu": [
            {
                "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops",
                "commit": "07a50c395552a28582c2746e06318e8f2e1bf059",
                "status": "git bisect found the bad commit.",
                "pr_number": null,
                "author": "ydshieh",
                "merged_by": null,
                "parent": "377a8ee73f210476c4efb15170d0c32ad3b2c653"
            },
            {
                "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_crops",
                "commit": "07a50c395552a28582c2746e06318e8f2e1bf059",
                "status": "git bisect found the bad commit.",
                "pr_number": null,
                "author": "ydshieh",
                "merged_by": null,
                "parent": "377a8ee73f210476c4efb15170d0c32ad3b2c653"
            }
        ]
    },
    "yolos": {
        "single-gpu": [
            {
                "test": "tests/models/yolos/test_modeling_yolos.py::YolosModelIntegrationTest::test_inference_object_detection_head",
                "commit": "07a50c395552a28582c2746e06318e8f2e1bf059",
                "status": "git bisect found the bad commit.",
                "pr_number": null,
                "author": "ydshieh",
                "merged_by": null,
                "parent": "377a8ee73f210476c4efb15170d0c32ad3b2c653"
            }
        ]
    }
}

yonigozlan · 2025-12-12T17:47:38Z

Hey @ydshieh ! All the remaining tests should be fixed. I tried to merge with main, but there seem to be a lot of broken tests due to tokenization issues on main (the gemma3 and janus integration tests are broken, at least).

ArthurZucker

Hey @yonigozlan, wdyt about first removing the slow / fast concept? Otherwise it looks good to me, but I think there were memory leak issues that we needed to fix before pushing this! Once that's done, happy to merge!

yonigozlan · 2025-12-16T17:41:39Z

I think it will be easier to transition to a unified image processor backend (defaulting on torch/torchvision) if we first merge this PR.
Also do you have more info on memory leaks? Not sure what this alludes to

ArthurZucker

LGTM. My main comment is about functions like the one extracting the device: I think there should be a better way for us to track where the images are.

ArthurZucker · 2025-12-19T15:11:54Z

        self.assertEqual(
            generated_text,
-            "The image depicts a man ironing clothes on the back of a yellow van in the middle of a busy city street. The man is wearing a yellow shirt with a yellow tie, and he is using an ironing board attached to the back of the van. The image is unusual in that it shows a man ironing clothes on the back of a van in the middle of a busy city street. The man is using an ironing board attached to the back of a van in the middle of a busy city street. The man is using an ironing board attached to the back of a van in the middle of a busy city street. The image is unusual in that it shows a man ironing clothes on the back of a van in the middle of a busy city street. The man is using an ironing board attached to the back of a van in the middle of a busy city street.",
+            "The image depicts a man ironing clothes on the back of a yellow van in the middle of a busy city street. The man is wearing a yellow shirt with a yellow tie, and he is holding an ironing board in one hand and a laundry basket in the other. The image is unusual in that it shows a man ironing clothes on the back of a van in the middle of a busy city street.",


that's a lot of changes!

ArthurZucker · 2026-01-19T10:43:29Z

+        if not self.is_fast:
+            logger.warning_once(
+                f"Using a slow image processor (`{self.__class__.__name__}`). "
+                "As we are transitioning to fast (PyTorch-native) processors, consider using `AutoImageProcessor` "


Suggested change

-                "As we are transitioning to fast (PyTorch-native) processors, consider using `AutoImageProcessor` "
+                "As we are transitioning to PyTorch-native processors, consider using `AutoImageProcessor` "

Let's rephrase this in anticipation of the non fast/slow paradigm.

ArthurZucker · 2026-01-19T10:44:10Z

+        if not self.is_fast:
+            logger.warning_once(
+                f"Using a slow image processor (`{self.__class__.__name__}`). "
+                "As we are transitioning to fast (PyTorch-native) processors, consider using `AutoImageProcessor` "


Not sure we need to warn, as it will be fast by default.

ArthurZucker · 2026-01-19T10:47:37Z

            yield i, item


+def _get_device_from_images(images, is_nested: bool) -> "torch.device":


If the processor is the one creating the torch tensor then I would suppose that there is a way to store the device in the data structure that it creates instead of having this function

This is mainly to avoid having to pass a device argument around to all group_images_by_shape calls, when it's easy to deduce it.

ArthurZucker · 2026-01-19T10:50:12Z

+    raise ValueError("No images found in the batch.")
+
+
+def get_device_from_images(images_list: list[list["torch.Tensor"]]) -> "torch.device":


Not sure why we need this when extracting the device should be exactly the same for every single image processor, no?

Some pass nested images to group_images_by_shape, and some have structures with empty lists, so this is needed for edge cases.
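
To make the edge cases mentioned in this thread concrete, here is a small sketch of device lookup over nested lists that may contain empty sublists. It is not the helper from the PR: `find_device` and the stand-in `FakeTensor` class are hypothetical, and the sketch only assumes that each leaf exposes a `.device` attribute, as a `torch.Tensor` does.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Stand-in for a tensor; only the `.device` attribute matters here."""
    device: str


def find_device(images):
    """Return the device of the first tensor-like leaf in a nested list."""
    for item in images:
        if isinstance(item, (list, tuple)):
            try:
                return find_device(item)
            except ValueError:
                # Empty (or all-empty) sublist: keep scanning the siblings.
                continue
        else:
            return item.device
    raise ValueError("No images found in the batch.")
```

With this shape of helper, callers do not need to know whether they hold a flat list, a nested list, or a structure where some entries are empty.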

github-actions · 2026-01-21T15:04:36Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: altclip, aria, auto, aya_vision, chinese_clip, clip, clipseg, convnext, convnextv2, cvt, efficientloftr, fuyu, idefics2, idefics3, janus, lightglue

github-actions · 2026-01-21T15:43:21Z

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=41388&sha=aca384

* remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Fix most tests * Standardize mgp_str tests * fix oneformer processing tests + fix copies * fix after review * fix missing fet_images in fast image processors * fix 01 - to check * fix 02 - to check * fix 03 - to check * fix 03 - to check * fix 03 - to check * fix 04 - to check * fix 05 - to check * fix 06 - sytle * fix 07 - revert * Fix some errors * Improve BatchFeature: stack list and lists of torch tensors (huggingface#42750) * stack lists of tensors in BatchFeature, improve error messages, add tests * remove unnecessary stack in fast image processors and video processors * make style * fix tests * fix remaining tests * fix copies * Fix Lfm2_vl im proc test * nit after review --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

molbap reviewed Oct 7, 2025

zucchini-nlp reviewed Oct 7, 2025

yonigozlan added 3 commits October 15, 2025 15:47

remove attributes and add all missing sub processors to their auto cl…

f48a47b

…asses

remove all mentions of .attributes

d5d5c58

cleanup

dd505b5

yonigozlan mentioned this pull request Oct 15, 2025

[v5] 🚨Refactor subprocessors handling in processors #41633

Merged

yonigozlan and others added 22 commits October 15, 2025 17:27

fix processor tests

6a1448f

fix modular

a292900

remove last attributes

63a255d

fixup

ef73759

Merge remote-tracking branch 'upstream/main' into remove-attributes-f…

b5e8b2e

…rom-processors

fixes after merge

f14ff3c

fix wrong tokenizer in auto florence2

0306430

fix missing audio_processor + nits

01cb815

Override __init__ in NewProcessor and change hf-internal-testing-repo…

49ec906

… (temporarily)

Merge remote-tracking branch 'upstream/main' into remove-attributes-f…

7dd5682

…rom-processors

fix auto tokenizer test

946cc5c

add init to markup_lm

b0cb3e0

update CustomProcessor in custom_processing

3b9e846

remove print

53de7a4

Merge branch 'main' into remove-attributes-from-processors

93d2c4d

Merge remote-tracking branch 'upstream/main' into remove-attributes-f…

feeec28

…rom-processors

nit

4a6b080

Merge branch 'remove-attributes-from-processors' of https://github.co…

02402a0

…m/yonigozlan/transformers into remove-attributes-from-processors

refactor processor tests first part

9204b4c

refactor part 2

1ed7c56

fix test modeling owlv2

757e1f1

fix test_processing_layoutxlm

bf763b2

ydshieh added 5 commits December 6, 2025 17:28

fix 03 - to check

a48b577

fix 04 - to check

c29f9b0

fix 05 - to check

e24559c

fix 06 - sytle

f04e642

fix 07 - revert

88610fc

ydshieh added a commit that referenced this pull request Dec 7, 2025

trigger dummy CI manually for Default to fast image processors for al…

62021ca

…l models #41388

ydshieh added a commit that referenced this pull request Dec 7, 2025

trigger partial CI manually for Default to fast image processors for …

bffe8b4

…all models #41388

ArthurZucker mentioned this pull request Dec 8, 2025

Outstanding issues / PR before we can release v5 #42710

Closed

yonigozlan changed the title ~~Default to fast image processors for all models~~ 🚨Default to fast image processors for all models Dec 8, 2025

yonigozlan added 2 commits December 9, 2025 18:07

Fix some errors

6d56e56

Merge branch 'default-fast-image-proc-all-models' of https://github.c…

04d4145

…om/yonigozlan/transformers into default-fast-image-proc-all-models

yonigozlan mentioned this pull request Dec 11, 2025

Improve BatchFeature: stack list and lists of torch tensors #42750

Merged

yonigozlan and others added 2 commits December 12, 2025 17:12

Improve BatchFeature: stack list and lists of torch tensors (huggingf…

624aad6

…ace#42750) * stack lists of tensors in BatchFeature, improve error messages, add tests * remove unnecessary stack in fast image processors and video processors * make style * fix tests

fix remaining tests

587209c

ArthurZucker approved these changes Dec 16, 2025

yonigozlan added 3 commits December 18, 2025 16:19

Merge remote-tracking branch 'upstream/main' into default-fast-image-…

21d2fc4

…proc-all-models

fix copies

1602059

Fix Lfm2_vl im proc test

0bbe085

ArthurZucker approved these changes Jan 19, 2026

yonigozlan and others added 3 commits January 21, 2026 01:27

Merge remote-tracking branch 'upstream/main' into default-fast-image-…

ff4e4c7

…proc-all-models

nit after review

aade130

Merge branch 'main' into default-fast-image-proc-all-models

acdb89f

Merge branch 'main' into default-fast-image-proc-all-models

aca384b

ArthurZucker merged commit 3fec8c2 into huggingface:main Jan 21, 2026
23 of 25 checks passed
