
[typings] Automatically type decorator return types as tuple | X #43446

Open

tomaarsen wants to merge 23 commits into huggingface:main from tomaarsen:feat/auto_decorator_return_typing

Conversation

@tomaarsen
Member

@tomaarsen tomaarsen commented Jan 23, 2026

What does this PR do?

This PR updates the typings of can_return_tuple and check_model_inputs (since renamed to capture_outputs, see #43446 (comment)) such that:

from transformers.utils import can_return_tuple
from transformers.utils.generic import check_model_inputs
try:
    from typing import reveal_type
except ImportError:
    from typing_extensions import reveal_type


@can_return_tuple
def my_func(foo: int, bar: int, **kwargs) -> dict[str, int]:
    return {
        "sum": foo + bar,
        "product": foo * bar,
    }

result = my_func(1, 2)
reveal_type(result)  # tuple[Unknown, ...] | dict[str, int]


@check_model_inputs
def my_second_func(foo: int, bar: int, **kwargs) -> dict[str, int | float]:
    return {
        "minus": float(foo - bar),
        "div": foo / bar,
    }

result2 = my_second_func(1, 2)
reveal_type(result2)  # tuple[Unknown, ...] | dict[str, int | float]

In short: functions and methods that use can_return_tuple or check_model_inputs currently have to be typed as tuple | X, but can now be typed as just X.

I also had Copilot help me write a check_decorator_return_types.py file that ensures that functions decorated with can_return_tuple or check_model_inputs:

  1. Have an explicit, non-None return annotation.
  2. Are not annotated with a union that already includes tuple.

The latter can be auto-fixed with python utils/check_decorator_return_types.py --fix_and_overwrite; the former must be fixed manually. The checking script is included in make fix-repo and in CI, and runs in about 5 seconds for me.
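For illustration, a check like this can be sketched with the standard library's ast module. This is a hypothetical simplification, not the actual utils/check_decorator_return_types.py; the decorator names are taken from the PR, everything else (function names, message strings) is made up:

```python
import ast

# Decorators whose wrapped functions must carry a clean return annotation
DECORATORS = {"can_return_tuple", "check_model_inputs"}


def check_source(source: str) -> list[str]:
    """Flag decorated functions with a missing return annotation,
    or one that already includes `tuple` in a union."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Collect decorator names, covering both `@name` and `@module.name`
        names = {
            dec.id if isinstance(dec, ast.Name) else getattr(dec, "attr", "")
            for dec in node.decorator_list
        }
        if not names & DECORATORS:
            continue
        if node.returns is None:
            problems.append(f"{node.name}: missing return annotation")
        elif "tuple" in ast.unparse(node.returns):
            problems.append(f"{node.name}: annotation already includes tuple")
    return problems


print(check_source("@can_return_tuple\ndef f(x): ..."))
# ['f: missing return annotation']
```

The real script additionally supports the --fix_and_overwrite mode that rewrites the offending annotations in place.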

Before submitting

Who can review?

cc @zucchini-nlp @vasqu @tarekziade

…rn type

This updates the typings of these two functions, so that a wrapped function that has return type X is automatically typed as `tuple | X`.
It verifies that users no longer use e.g. `tuple | BaseModelOutputWithPooling` return typings, as they should use `BaseModelOutputWithPooling` instead. It also makes sure that a return typing is present.
This is the main actual change, the rest is just typings
And remove return_dict from altclip and clap: the can_return_tuple should take care of it fully
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment on lines +37 to +38
P = ParamSpec("P")
T = TypeVar("T")
Member Author

Reviewer note: the ParamSpec and TypeVar as used here match Python's own typing documentation exactly: https://docs.python.org/3.10/library/typing.html#typing.ParamSpec

They also use an example of a decorator, so the implementation here is quite standard/common.

Contributor

Doesn't hurt to add the doc link as comment to the wrappers we use it at below - it's like 1:1 matching the example :D

Member Author

@tomaarsen tomaarsen Jan 23, 2026

Very true! Done in dbd43d6 & 3fd9571


self.post_init()

@check_model_inputs
Member Author

@tomaarsen tomaarsen Jan 23, 2026

Reviewer note: There are only 2 actual code changes under src/transformers/models, and this is one of them. This class does not return a BaseModelOutput, and the other related private classes (BltLocalEncoder, BltGlobalTransformer) don't use this @check_model_inputs either.

Member

ah I see, so it is an issue with how the model was implemented. So many tiny inconsistencies 😅

Contributor

Actually, it seems like they shouldn't inherit from PreTrainedModel at all 😅 it's not a high-priority model, so fine for now

def set_input_embeddings(self, value):
self.embeddings.word_embeddings = value

@can_return_tuple
Member Author

@tomaarsen tomaarsen Jan 23, 2026

Reviewer note: There are only 2 actual code changes under src/transformers/models, and this is one of them. This class copies from CLAP, which uses @can_return_tuple, but this class did not. I've added it here.

I also updated this class and the CLAP variant to remove the return_dict = return_dict if return_dict is not None else self.config.use_return_dict line, as that was just dead code.

Member

nice! Ideally would be great to start using check_model_inputs for pretrained models. Though it might require more manual work than current state of PR

Member

@zucchini-nlp zucchini-nlp left a comment

Great work! I have just a few questions about the auto-checker


self.post_init()

@check_model_inputs
Member

accidentally deleted?


Contributor

@vasqu vasqu left a comment

Lgtm

Maybe we should start splitting into more than fix-repo again @Cyrilvallez 👀


@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43446&sha=3fd957

@tomaarsen
Member Author

I updated this PR with @Cyrilvallez's latest changes renaming check_model_inputs -> capture_outputs, so that methods wrapped with capture_outputs are also nicely typed:

from transformers.utils import can_return_tuple
from transformers.utils.output_capturing import capture_outputs

try:
    from typing import reveal_type
except ImportError:
    from typing_extensions import reveal_type


@can_return_tuple
def my_func(foo: int, bar: int, **kwargs) -> dict[str, int]:
    return {
        "sum": foo + bar,
        "product": foo * bar,
    }


result = my_func(1, 2)
reveal_type(result)  # tuple[Unknown, ...] | dict[str, int]


@capture_outputs
def my_second_func(foo: int, bar: int, **kwargs) -> dict[str, int | float]:
    return {
        "minus": float(foo - bar),
        "div": foo / bar,
    }


result2 = my_second_func(1, 2)
reveal_type(result2)  # tuple[Unknown, ...] | dict[str, int | float]

$ ty check demo.py
info[revealed-type]: Revealed type
  --> demo_can_return_tuple_hint.py:19:13
   |
18 | result = my_func(1, 2)
19 | reveal_type(result)  # tuple[Unknown, ...] | dict[str, int]
   |             ^^^^^^ `tuple[Unknown, ...] | dict[str, int]`
   |

info[revealed-type]: Revealed type
  --> demo_can_return_tuple_hint.py:31:13
   |
30 | result2 = my_second_func(1, 2)
31 | reveal_type(result2)  # tuple[Unknown, ...] | dict[str, int | float]
   |             ^^^^^^^ `tuple[Unknown, ...] | dict[str, int | float]`
   |

Found 2 diagnostics

This should help cut down some bloat from our modeling/modular files without considerably complicating any of the decorators. Ideally, we can get this merged somewhat quickly, as merge conflicts accumulate quickly too.

also cc @molbap as you're also working on decorator-related cleanup in #43590 right now


@github-actions
Contributor

github-actions bot commented Mar 2, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: afmoe, aimv2, albert, align, altclip, aria, audioflamingo3, aya_vision, bert, bert_generation, blip, blip_2, blt, bridgetower, bros, camembert
