Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support ArrayLike data in to_xarray #825

Merged
merged 1 commit into from
Sep 11, 2024
Merged

support ArrayLike data in to_xarray #825

merged 1 commit into from
Sep 11, 2024

Conversation

egpbos
Copy link
Member

@egpbos egpbos commented Jul 26, 2024

This commit makes dianna swallow more kinds of data items without users having to mangle them manually themselves. As long as your data is numpy ArrayLike it will go. The only blocker for this was the direct use of the .ndim attribute on data, which assumes it is already a numpy array rather than something that can trivially be converted to a numpy array. A PIL.Image is an example of such an ArrayLike type. We use this in explainable_embedding to feed items into the OpenAI CLIP model.

I sprinkled in some nice type hinting, of course, but not for the other parameters, because they would become monstrosities.

@egpbos egpbos requested a review from loostrum July 26, 2024 13:49
egpbos added a commit to dianna-ai/explainable_embedding that referenced this pull request Jul 26, 2024
Either the image_arr is wrong (load_img runs a preprocess function that is meant for use with ResNet50, perhaps that's the issue) or the uint8 conversion kills something, but either way, the "fix" introduced there completely ruined the output for image vs captioning experiments! This commit fixes things again.

Note that the np.array(input_image) could be replaced by a bare input_image if dianna PR dianna-ai/dianna#825 is accepted.
@egpbos
Copy link
Member Author

egpbos commented Jul 29, 2024

Fixed the import issue from the previous CI build (tested locally). However, now we have a different, unrelated issue that we have to wait to get fixed (see #827 and #829).

This commit makes dianna swallow more kinds of data items without users having to mangle them manually themselves. As long as your data is numpy ArrayLike it will go. The only blocker for this was the direct use of the .ndim attribute on data, which assumes it is already a numpy array rather than something that can trivially be converted to a numpy array. A PIL.Image is an example of such an ArrayLike type. We use this in explainable_embedding to feed items into the OpenAI CLIP model.
@loostrum
Copy link
Member

I rebased on main to get the linter fix in here. The changes themselves look good, nice generalization of to_xarray! Linter's happy as well, so let's merge.

@elboyran
Copy link
Contributor

elboyran commented Sep 4, 2024

@egpbos many checks are not passing.

@egpbos
Copy link
Member Author

egpbos commented Sep 4, 2024

Yep, those are mentioned above. They are unrelated to this PR.

@cwmeijer
Copy link
Member

closing and reopening for rerunning tests

@cwmeijer cwmeijer closed this Sep 11, 2024
@cwmeijer cwmeijer reopened this Sep 11, 2024
@cwmeijer cwmeijer merged commit 4f32a58 into main Sep 11, 2024
20 of 33 checks passed
@cwmeijer cwmeijer deleted the to_xarray_arraylike branch September 11, 2024 11:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants