-
Notifications
You must be signed in to change notification settings - Fork 157
bugfix: make the data translator registry prefer exact cls matches over isinstance matches #2547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
cc8bc97 to
1e367c9
Compare
1e367c9 to
75f2676
Compare
| def get_handler_for(self, data_or_class): | ||
| for translator in self: | ||
| if isinstance(data_or_class, translator.target_cls) or data_or_class is translator.target_cls: | ||
| if data_or_class is translator.target_cls or type(data_or_class) is translator.target_cls: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle should we do the loop twice, once over exact matches then once with isinstance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think as long as the isinstance match also triggers a break in the same loop, it would still match if that translator happens to come ahead of the one here.
dhomeier
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been trying to make sense of how the subclass matches are handled so far, especially why e.g. LightCurve, being also a QTable subclass, would not get at least that translator rather than the Table one, but it appears that also can behave differently for a subclass and instances of the former. So this PR seems to get a long way towards more reasonable behaviour, but as noted in the comments, should try to also treat classes and their instances the same as much as possible.
Possibly still open question: should the best match always take precedence even over very different priorities? Seems priority has not been used much at all, so perhaps not that relevant.
| def get_handler_for(self, data_or_class): | ||
| for translator in self: | ||
| if isinstance(data_or_class, translator.target_cls) or data_or_class is translator.target_cls: | ||
| if data_or_class is translator.target_cls or type(data_or_class) is translator.target_cls: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think as long as the isinstance match also triggers a break in the same loop, it would still match if that translator happens to come ahead of the one here.
| handler = translator.handler | ||
| preferred = translator.target_cls | ||
| break | ||
| if isinstance(data_or_class, translator.target_cls): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if isinstance(data_or_class, translator.target_cls): | |
| if isinstance(data_or_class, translator.target_cls) or (isinstance(data_or_class, type) and issubclass(data_or_class, translator.target_cls)): |
A subclass (rather than an instance of one) of a supported target_cls currently is not caught at all, which I think can lead to quite confusing behaviour as well.
Perhaps the whole code should be cleaned up a bit by pulling the type check below to the top of the loop
if isinstance(data_or_class, type):
data_class = data_or_class
else:
data_class = type(data_or_class)and work with data_class from there on.
|
Coming back to this I don't fully understand why the suggested change would work any better since instead of checking isinstance we now check for exact match then for isinstance, but we do both these checks for the current type in the loop. So let's say the order of the iteration is to check Table then Lightcurve, and the user passed a lightcurve object, then the code would do:
And it would never get to checking lightcurve. So I guess there are two questions:
|
Description
This PR supports more precise control over data translators in the registry. In a STScI use case, we have added a translator for a specific subclass (
lightkurve.LightCurve) of a base class that already is in the registry (astropy.table.Tablevia glue-astronomy). We noticed that a glue translator for the base class (Table) is getting returned as the matching handler viadata_translator.get_handler_for, even though we have an exact class match for the subclass (LightCurve) in the registry.This PR gives preference to exact class matches in the translator registry. It now checks for exact matches via
data_or_class is expected_clsbetween data class and the expected class in the translator registry, before following up with the originalisinstancecall if no exact match is found.