[typings] Automatically type decorator return types as tuple | X#43446
[typings] Automatically type decorator return types as tuple | X#43446tomaarsen wants to merge 23 commits intohuggingface:mainfrom
typings] Automatically type decorator return types as tuple | X#43446Conversation
…rn type This updates the typings of these two functions, so that a wrapped function that has return type X is automatically typed as `tuple | X`.
It verifies that users don't use e.g. `tuple | BaseModelOutputWithPooling` return typings anymore, as they should use `BaseModelOutputWithPooling` instead then. It also makes sure that a typing is used
This is the main actual change, the rest is just typings
And remove return_dict from altclip and clap: the can_return_tuple should take care of it fully
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
| P = ParamSpec("P") | ||
| T = TypeVar("T") |
There was a problem hiding this comment.
Reviewer note: the ParamSpec and TypeVar as used here match Python's own typing documentation exactly: https://docs.python.org/3.10/library/typing.html#typing.ParamSpec
They also use an example of a decorator, so the implementation here is quite standard/common.
There was a problem hiding this comment.
Doesn't hurt to add the doc link as comment to the wrappers we use it at below - it's like 1:1 matching the example :D
|
|
||
| self.post_init() | ||
|
|
||
| @check_model_inputs |
There was a problem hiding this comment.
Reviewer note: There are only 2 actual code changes under src/transformers/models, and this is one of them. This class does not return a BaseModelOutput, and the other related private classes (BltLocalEncoder, BltGlobalTransformer) don't use this @check_model_inputs either.
There was a problem hiding this comment.
ah I see, so it is an issue with how the model was implemented. So many tiny inconsistencies 😅
There was a problem hiding this comment.
Actually, they shouldn't inherit from PreTrained model at all it seems like 😅 it's not a high priority model so fine for now
| def set_input_embeddings(self, value): | ||
| self.embeddings.word_embeddings = value | ||
|
|
||
| @can_return_tuple |
There was a problem hiding this comment.
Reviewer note: There are only 2 actual code changes under src/transformers/models, and this is one of them. This class copies from CLAP, which uses @can_return_tuple, but this class did not. I've added it here.
I also updated this class and the CLAP variant to remove the return_dict = return_dict if return_dict is not None else self.config.use_return_dict line, as that was just dead code.
There was a problem hiding this comment.
nice! Ideally would be great to start using check_model_inputs for pretrained models. Though it might require more manual work than current state of PR
zucchini-nlp
left a comment
There was a problem hiding this comment.
Great work! I have just a few questions about the auto-checker
| def set_input_embeddings(self, value): | ||
| self.embeddings.word_embeddings = value | ||
|
|
||
| @can_return_tuple |
There was a problem hiding this comment.
nice! Ideally would be great to start using check_model_inputs for pretrained models. Though it might require more manual work than current state of PR
|
|
||
| self.post_init() | ||
|
|
||
| @check_model_inputs |
|
|
||
| self.post_init() | ||
|
|
||
| @check_model_inputs |
There was a problem hiding this comment.
ah I see, so it is an issue with how the model was implemented. So many tiny inconsistencies 😅
vasqu
left a comment
There was a problem hiding this comment.
Lgtm
Maybe we should start splitting into more than fix-repo again @Cyrilvallez 👀
|
|
||
| self.post_init() | ||
|
|
||
| @check_model_inputs |
There was a problem hiding this comment.
Actually, they shouldn't inherit from PreTrained model at all it seems like 😅 it's not a high priority model so fine for now
| P = ParamSpec("P") | ||
| T = TypeVar("T") |
There was a problem hiding this comment.
Doesn't hurt to add the doc link as comment to the wrappers we use it at below - it's like 1:1 matching the example :D
This reverts commit f71a072.
Without extra code; this time
|
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43446&sha=3fd957 |
|
I updated this PR with @Cyrilvallez' latest changes regarding from transformers.utils import can_return_tuple
from transformers.utils.output_capturing import capture_outputs
try:
from typing import reveal_type
except ImportError:
from typing_extensions import reveal_type
@can_return_tuple
def my_func(foo: int, bar: int, **kwargs) -> dict[str, int]:
return {
"sum": foo + bar,
"product": foo * bar,
}
result = my_func(1, 2)
reveal_type(result) # tuple[Unknown, ...] | dict[str, int]
@capture_outputs
def my_second_func(foo: int, bar: int, **kwargs) -> dict[str, int | float]:
return {
"minus": float(foo - bar),
"div": foo / bar,
}
result2 = my_second_func(1, 2)
reveal_type(result2) # tuple[Unknown, ...] | dict[str, int | float]This should help cut down some bloat from our modeling/modular files without considerably complicating any of the decorators. Ideally, we can get this merged somewhat quickly, as the merge conflicts start quickly too. also cc @molbap as you're also working on decorator-related cleanup in #43590 right now
|
|
[For maintainers] Suggested jobs to run (before merge) run-slow: afmoe, aimv2, albert, align, altclip, aria, audioflamingo3, aya_vision, bert, bert_generation, blip, blip_2, blt, bridgetower, bros, camembert |
What does this PR do?
This PR updates the
can_return_tupleandcheck_model_inputscapture_outputs(see #43446 (comment)) typings such that:In short: functions and method that use
can_return_tupleorcheck_model_inputsmust currently be typed liketuple | X, but can now be typed likeXas well.I also had Copilot help me write a
check_decorator_return_types.pyfile that ensures that functions decorated withcan_return_tupleorcheck_model_inputs:Nonereturn annotation.tuple.The latter can be auto-fixed with
python utils/check_decorator_return_types.py --fix_and_overwrite, and the former must be manually fixed. This checking script is included inmake fix-repoand the CI. The script only runs in about ~5 seconds for me.Before submitting
Pull Request section?
to it if that's the case: [
BC] Updateget_(text|image|audio|video)_featuresmethods to returnBaseModelOutputWithPooling#42564 (comment)documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
cc @zucchini-nlp @vasqu @tarekziade