Allow fallback to loading from Auto"SubProcessor".from_pretrained when model_type can't be inferred from config #42402
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
tomaarsen left a comment:
This PR, when combined with #42387 to fix PEFT loading, fixes my bug reproduction script from #41846. Thanks for opening this @yonigozlan. We should likely aim for a review from a core transformers maintainer, though.
cc @BenjaminBossan this is also looking good for your PR #42387 🤗
Thanks, Yoni. Yeah, let's try to get this and my PR merged for v5.
zucchini-nlp left a comment:
LGTM, only one comment about deleting the feature extractor fallback.
```python
if preprocessor_config_file is not None and processor_class is None:
    config_dict, _ = FeatureExtractionMixin.get_feature_extractor_dict(
        pretrained_model_name_or_path, **kwargs
    )
    processor_class = config_dict.get("processor_class", None)
    if "AutoProcessor" in config_dict.get("auto_map", {}):
        processor_auto_map = config_dict["auto_map"]["AutoProcessor"]
```
I guess this was deleted because the feature extractor and image processor configs have the same name when saved? There is still one difference, though, if the config was saved in the new nested format. I think we need to keep it.
transformers/src/transformers/feature_extraction_utils.py, lines 486 to 487 in f779506
I removed this because it is actually never used unless I missed something. It's already called here:
transformers/src/transformers/models/auto/processing_auto.py, lines 297 to 304 in f779506
and if `preprocessor_config_file` is None above, it will be None here as well. If it's not None, then we'll never enter the code path I removed. So it seems like dead code.
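The dead-code argument above can be sketched in a few lines. This is a hypothetical, simplified model of the control flow (function and variable names are illustrative, not the actual transformers internals):

```python
# Simplified sketch of the control flow under discussion. Names are
# illustrative only, not the real transformers implementation.

def resolve_processor_class(preprocessor_config_file, config_dict):
    processor_class = None
    # First lookup: when the preprocessor config file exists, the class
    # name is read from its parsed dict right away.
    if preprocessor_config_file is not None:
        processor_class = config_dict.get("processor_class")
    # The removed branch only triggered when the file exists AND the
    # class is still unset -- but it performs the same lookup that just
    # returned None, so it can never discover anything new.
    if preprocessor_config_file is not None and processor_class is None:
        processor_class = config_dict.get("processor_class")
    return processor_class
```

Under these assumptions, the second `if` can never change the result, which is the sense in which the branch looks dead.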
oh you're right! It has a weird dependency because of this identical naming issue 🙃
I think we have to enter this code path if `processor_class` is None, because we would need to try to load the processor from the feature extractor config if we still can't find it at this point. I think we have no tests for this specific case, and it's quite rare, which is why there have been no issues until today. I would still like to keep it for full functionality.
Hmm I still think this code path is never entered 😅. But this can be addressed in another PR anyway, so I added this back.
Thanks for the reviews! @zucchini-nlp could you approve pls ;)
[For maintainers] Suggested jobs to run (before merge): run-slow: auto
…n model_type can't be inferred from config (huggingface#42402)
* fix raise error early
* add back feature extractor saving logic
What does this PR do?
Fixes #41846
Also removes some dead code in `processing_auto.py`.
Cc @tomaarsen ;)
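A rough sketch of the fallback idea this PR enables, in the terms of the title: prefer the `model_type`-based mapping, and fall back to the processor class recorded in the preprocessor config when `model_type` can't be inferred. The mapping and function names below are hypothetical, not the real `AutoProcessor` internals:

```python
# Illustrative subset of a model_type -> processor-class mapping.
# Hypothetical names; not the actual transformers registry.
PROCESSOR_MAPPING = {"clip": "CLIPProcessor"}

def pick_processor_class(model_config, preprocessor_config):
    model_type = model_config.get("model_type")
    if model_type in PROCESSOR_MAPPING:
        # Normal path: model_type is present and registered.
        return PROCESSOR_MAPPING[model_type]
    # Fallback path: instead of raising immediately, use the class name
    # saved alongside the preprocessor config, if any.
    return preprocessor_config.get("processor_class")
```

With this shape, a checkpoint whose `config.json` lacks a usable `model_type` can still resolve a processor as long as its saved preprocessor config records one.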