-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
support dask arrays in rolling computations using bottleneck functions #1568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
xarray/core/ops.py
Outdated
| return result | ||
|
|
||
|
|
||
| def dask_rolling_wrapper_without_min_count(moving_func, a, window, axis=-1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we use the default parameter value rather than writing this twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we update the minimum support bottleneck version to 1.1, we can do away with this special case (also present for the non-dask case) all together.
xarray/core/ops.py
Outdated
| has_bottleneck = False | ||
|
|
||
| try: | ||
| import dask.array as da |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved he dask array related code to duck_array_ops. That's probably a more natural place for this.
Actually: maybe we should have another module for dask array operations, and then put the duck type functions in dask_array_ops.
|
That works for me. But note that you can also use **kwargs to insert
conditional keyword arguments.
…On Mon, Sep 11, 2017 at 10:01 PM Joe Hamman ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In xarray/core/ops.py
<#1568 (comment)>:
> + if axis < 0:
+ axis = a.ndim + axis
+ depth = {d: 0 for d in range(a.ndim)}
+ depth[axis] = window - 1
+ boundary = {d: np.nan for d in range(a.ndim)}
+ # create ghosted arrays
+ ag = da.ghost.ghost(a, depth=depth, boundary=boundary)
+ # apply rolling func
+ out = ag.map_blocks(moving_func, window, min_count=min_count,
+ axis=axis, dtype=a.dtype)
+ # trim array
+ result = da.ghost.trim_internal(out, depth)
+ return result
+
+
+def dask_rolling_wrapper_without_min_count(moving_func, a, window, axis=-1):
If we update the minimum support bottleneck version to 1.1, we can do away
with this special case (also present for the non-dask case) all together.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1568 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ABKS1gpgrbD_70fLL77G1RJClRZQbXSUks5shhANgaJpZM4PUFEa>
.
|
|
I went ahead and changed the bottleneck version. It really cleans up the all the rolling methods and the testing so I think this is an overall nice change. |
xarray/core/ops.py
Outdated
| try: | ||
| import bottleneck as bn | ||
| if LooseVersion(bn.__version__) < LooseVersion('1.1'): | ||
| raise ImportError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a descriptive error message here?
xarray/tests/test_dataarray.py
Outdated
| rolling_obj = da_dask.load().rolling(time=7, min_periods=min_periods) | ||
| expected = getattr(rolling_obj, name)() | ||
|
|
||
| # using all-close becuase rolling over ghost cells introduces some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because
xarray/tests/test_dataarray.py
Outdated
| @pytest.mark.parametrize('min_periods', (1, None)) | ||
| def test_rolling_wrapped_bottleneck_dask(da_dask, name, center, min_periods): | ||
| pytest.importorskip('dask.array') | ||
| pytest.importorskip('bottleneck', minversion="1.1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't these tests now work regardless of whether bottleneck is installed? (obviously performance will be better with bottleneck)
xarray/core/ops.py
Outdated
| try: | ||
| import bottleneck as bn | ||
| if LooseVersion(bn.__version__) < LooseVersion('1.1'): | ||
| warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we trigger this warning lazily (e.g., when calling actually resample)? If we do this at import time we are going to get a lot of complaints from users using bottleneck < 1.1 about irrelevant warnings.
@shoyer - I'm not exactly sure how to do this consistently throughout the package. In d532a1f
I'm actually leaning toward 2 right now. We don't enforce strict minimum versions on other dependencies so it seems like what we're trying to do may be a bit overkill. |
Sounds good to me. |
|
any final comments here? |
git diff upstream/master | flake8 --diffwhats-new.rstfor all changes andapi.rstfor new APIcc @shoyer, @darothen, @arbennett, @vnoel