Implement a rex backend for xarray (rexarray) #192

Merged
merged 74 commits into from
Apr 2, 2025

@ppinchuk ppinchuk commented Feb 3, 2025

Add a backend for xarray that allows users to read in rex-style NREL data.

The implementation itself closely follows the h5netcdf backend implementation:
https://github.com/pydata/xarray/blob/main/xarray/backends/h5netcdf_.py

Lazy loading is fully supported, which is a big reason why this implementation is so long.

HSDS and S3 (via fsspec) access is explicitly supported.
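
To illustrate what lazy loading means here, below is a minimal, standard-library-only sketch of the idea: a wrapper that records the shape up front but only touches the underlying store when it is indexed. This is not the rexarray code; `LazyVariable`, `read_from_disk`, and the list standing in for an HDF5 dataset are all hypothetical names made up for the example.

```python
class LazyVariable:
    """Toy stand-in for a lazily loaded backend array (hypothetical)."""

    __slots__ = ("_reader", "shape", "reads")

    def __init__(self, reader, shape):
        self._reader = reader  # callable that fetches data for a given key
        self.shape = shape     # known up front, without reading any data
        self.reads = 0         # number of times the store was touched

    def __getitem__(self, key):
        # Data is fetched only here, and only for the requested key.
        self.reads += 1
        return self._reader(key)


# Pretend on-disk dataset: a plain list standing in for an HDF5 dataset.
_on_disk = list(range(1_000))


def read_from_disk(key):
    """Hypothetical reader; a real backend would query the file instead."""
    return _on_disk[key]


var = LazyVariable(read_from_disk, shape=(len(_on_disk),))
reads_before = var.reads  # 0: constructing the wrapper reads nothing
subset = var[10:15]       # only this slice is fetched
print(subset, reads_before, var.reads)
```

A real backend additionally has to translate the different indexer types that xarray can pass in, which is likely where much of the extra length comes from.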

@ppinchuk ppinchuk self-assigned this Feb 3, 2025
@ppinchuk ppinchuk requested a review from castelao February 28, 2025 17:43
@ppinchuk ppinchuk changed the title [WIP] Implement a rex backend for xarray (rexarray) Implement a rex backend for xarray (rexarray) Feb 28, 2025

@bnb32 bnb32 left a comment

Will take a deeper dive later but here are just a couple random comments.

@ppinchuk

@bnb32, @grantbuster, @castelao Friendly bump :)

Curious to hear if anyone has tried using this and/or had success with it

@castelao castelao left a comment

@ppinchuk , some minor comments.

bnb32 commented Mar 19, 2025

> @bnb32, @grantbuster, @castelao Friendly bump :)
>
> Curious to hear if anyone has tried using this and/or had success with it

Digging back in is on the list for next week :)

@bnb32 bnb32 left a comment

Overall, this looks great to me. I messed around in a notebook a little bit and things worked well, including within calls to the sup3r Loader objects. A couple random comments added. Main question on the use of time_index vs time.

class RexArrayWrapper(BackendArray):
    """rexarray implementation of a `BackendArray`"""

    __slots__ = ("datastore", "dtype", "shape", "variable_name", "meta_index",

Collaborator

Do you find that using slots here reduces mem use significantly? This is one of those things I'm aware of but rarely use.

Collaborator Author

I used __slots__ quite heavily in my grad work, but not so much at NREL. IIRC the guidance is that it can be beneficial if you intend to have thousands of instances of the object. Not sure if that actually applies here, but this is what the xarray developers did for their array wrappers and I figured they had a good reason for doing so (so I copied them😄).
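
For reference, a minimal sketch of the trade-off being discussed, using two hypothetical classes (`WithDict` and `WithSlots` are illustrative names, not part of rex or xarray). It compares a rough per-instance footprint in CPython and shows that a slotted instance has no `__dict__` and rejects undeclared attributes. Exact byte counts vary by Python version, so the sketch only compares them.

```python
import sys


class WithDict:
    """Regular class: every instance carries its own attribute dict."""

    def __init__(self, dtype, shape):
        self.dtype = dtype
        self.shape = shape


class WithSlots:
    """Slotted class: attributes live in fixed slots, no per-instance dict."""

    __slots__ = ("dtype", "shape")

    def __init__(self, dtype, shape):
        self.dtype = dtype
        self.shape = shape


a = WithDict("float32", (8760, 100))
b = WithSlots("float32", (8760, 100))

has_dict_a = hasattr(a, "__dict__")
has_dict_b = hasattr(b, "__dict__")

# Rough per-instance footprint: the object plus its attribute dict, if any.
size_a = sys.getsizeof(a) + sys.getsizeof(a.__dict__)
size_b = sys.getsizeof(b)

# Slotted instances refuse attributes that were not declared in __slots__.
try:
    b.extra = 1
    rejected = False
except AttributeError:
    rejected = True

print(has_dict_a, has_dict_b, size_a > size_b, rejected)
```

The saving per object is small, so it mainly matters when many instances exist at once, which matches the guidance recalled above.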

@grantbuster

> Dare I say this is ready for review? :P
>
> One question for all, but especially @grantbuster : should we make fsspec a base rex dependency? I think this library is used all over the place, including in our other dependencies like pandas. This would mean users can immediately access S3 files without having to specify [s3] during installation. Did we make it an optional dependency intentionally? I know we had to make h5pyd optional because of compatibility issues with the wtk code

Fine by me to add fsspec as a default req!

ppinchuk and others added 9 commits April 1, 2025 13:59

ensure that rexarray can open mf hsds with list input
@ppinchuk ppinchuk merged commit 194c212 into main Apr 2, 2025
22 checks passed
@ppinchuk ppinchuk deleted the pp/rexarray branch April 2, 2025 22:31
github-actions bot pushed a commit that referenced this pull request Apr 2, 2025
Implement a `rex` backend for xarray (rexarray)
github-actions bot pushed a commit to MRE-Code-Hub/rex that referenced this pull request Apr 2, 2025
Implement a `rex` backend for xarray (rexarray)
@ppinchuk ppinchuk mentioned this pull request Apr 4, 2025
Labels: enhancement (Update to logic or general code improvements), p-medium (Priority: medium)

4 participants