-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatically chunk other
in GroupBy binary ops.
#7684
Conversation
if obj.__dask_graph__() is not None and other.__dask_graph__() is None: | ||
# a chunk size of 1 seems reasonable since we expect it to be repeated | ||
# TODO: what about dims other than `name`` | ||
# TODO: What about datasets with some dask vars, and others not? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know which variables will be operated on at this point? (I'm not very familiar with this part of the code). Surely we only need to chunk the variables that will be operated on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm.... I guess technically we could have different data vars in obj
and other
…lazy-array * upstream/main: (153 commits) Add HDF5 Section to read/write docs page (pydata#8012) [pre-commit.ci] pre-commit autoupdate (pydata#8014) Update interpolate_na in dataset.py (pydata#7974) improved docstring of to_netcdf (issue pydata#7127) (pydata#7947) Expose "Coordinates" as part of Xarray's public API (pydata#7368) Core team member guide (pydata#7999) join together duplicate entries in the text `repr` (pydata#7225) Update copyright year in README (pydata#8007) Allow opening datasets with nD dimenson coordinate variables. (pydata#7989) Move whats-new entry [pre-commit.ci] pre-commit autoupdate (pydata#7997) Add documentation on custom indexes (pydata#6975) Use variable name in all exceptions raised in `as_variable` (pydata#7995) Bump pypa/gh-action-pypi-publish from 1.8.7 to 1.8.8 (pydata#7994) New whatsnew section Remove future release notes before this release Update whats-new.rst for new release (pydata#7993) Remove hue_style from plot1d docstring (pydata#7925) Add new what's new section (pydata#7986) Release summary for v2023.07.0 (pydata#7979) ...
Well this was a big regression for the tiny benchmark problem. I still think it's a good idea given https://discourse.pangeo.io/t/xarray-unable-to-allocate-memory-how-to-size-up-problem/3233/1 |
Hmm yeah I agree that scalability seems more important. Also I missed that this PR doesn't have a whatsnew entry - we should probably add one to help flag the regression. |
whats-new.rst