Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust initialization in maximum and minimum #27845

Merged
merged 5 commits into from
Jul 13, 2018
Merged

Conversation

andreasnoack
Copy link
Member

Fixes #27836

@andreasnoack andreasnoack force-pushed the anj/reducediminit branch 7 times, most recently from 31c67c2 to cae4be0 Compare June 28, 2018 15:40
reducedim_init(f, op::typeof(min), A::AbstractArray{T}, region) where {T} = reducedim_initarray0(A, region, f, minimum)
# initialization when computing minima and maxima requires a little care
for (f1, f2, initval) in ((min, max, Inf), (max, min, -Inf))
function reducedim_init(f, op::typeof(f1), A::AbstractArray, region)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs an @eval with interpolation — I don't think this will work as it stands.

function reducedim_init(f, op::typeof(f1), A::AbstractArray, region)
# First compute the reduce indices. This will throw an ArgumentError
# if any region is invalid
ri = Base.reduced_indices(A, region)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base.

@mbauman
Copy link
Member

mbauman commented Jun 28, 2018

This seems like a good incremental improvement, but boy does this need a bigger revamp here. It's somewhat nice that only a very limited set of functions are supported right now, because when we go to support arbitrary reduction functions we'll need to change the requirements — we'll need to require two slices along the dimension.

@andreasnoack
Copy link
Member Author

andreasnoack commented Jun 28, 2018

I'd hope that the arrays initialization code could be cleaned up more generally. It is quite convoluted. Regarding your comment about supporting more reducers then an alternative to using two slices might be that users who happen to have a new op are asked to provide a function that returns a y such that for all x then op(x, y) == x. We currently utilize that these are zero, one, and identity for +, * and min respectively so it might work more generally.

Back to the current PR. There is some funny business going on with OffsetArrays. In the test, there is this array

julia> A = OffsetArray(rand(4,4), (-3,5));

What seems wrong to me is how reduced_indices work here since

julia> axes(A)
(Base.Slice(-2:1), Base.Slice(6:9))

julia> Base.reduced_indices(A, 1:2)
(Base.Slice(1:1), Base.Slice(1:1))

Notice that the last slice is not included in Slice(6:9). Is this expected?

Finally, while trying to call collection functions on a Slice it is also clear how much we still rely on size in Base.

Update: Maybe the OffsetArray is a bit too picky about size. Indeed many uses of size should be axes but size is still meaningful for the number of elements in each direction which is sometimes what you want.

@mbauman
Copy link
Member

mbauman commented Jun 28, 2018

Is this expected?

As is often the case with OffsetArrays, we need to choose which aspect of the Array behavior is more important when considering the resulting axis of a reduced dimension: Do we collapse to 1:1 or do we collapse to the first index in the reduced dimension? We've chosen the former, which has the unfortunate side-effect that the reduced_indices are not a slice into the original object as you're trying to use it.

@andreasnoack
Copy link
Member Author

It was a bit more tricky to finish up than expected but I think it works now. I've changed the non-OneTo case to return Slices with the first index instead of OneTo(1). I think it makes more sense. If we stick with the current behavior when we'd need a different fix for the initialization.

return copy(A1)
else
# otherwise use the min/max of the first slice as initial value
v0 = mapreduce(f, $f2, A1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem to be correct to use the min/max of the first slice as the initial value for all slices. I'm probably missing something. (I wish I could have used something simpler than what I did for this at #28027).

BTW, typo below: "intial"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be fine. If you are computing the maximum along a dimension then it should be fine to initialize with any value less than or equal to the first element in the slice. The minimum over the first slice will satisfy this, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I misunderstood "first slice". I thought it referred to the first slice along which reduction is performed. Carry on.

v0 = mapreduce(f, $f2, A1)

# but NaNs need to be avoided as intial values
v0 = v0 != v0 ? typeof(v0)($initval) : v0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT the mapreduce call above will give NaN whenever one of the values is NaN. So this code is going to use the fallback initval even if there are non-NaN values which could be used. Maybe it would be better to use a custom loop to go over the first slice and skip NaN values?

Actually, maybe something similar to the mapfirst! method I had to write at #27845 to skip missing values is needed.

Anyway, a few tests for these corner cases would be good. ;-)

Copy link
Member Author

@andreasnoack andreasnoack Jul 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically a check for floats but tries to be a little more generic. Using the initval is fine for floats and only floats will probably ever be NaN satisfy the test so I don't think there is an issue here.

@andreasnoack
Copy link
Member Author

@timholy Are you okay with the reductions-on-offset-arrays behavior here?

@timholy
Copy link
Member

timholy commented Jul 12, 2018

Yes, it's quite elegant! Sorry I didn't see this earlier, thanks for tackling it.

@andreasnoack andreasnoack merged commit 1859a91 into master Jul 13, 2018
@andreasnoack andreasnoack deleted the anj/reducediminit branch July 13, 2018 06:36
andreasnoack added a commit that referenced this pull request Jul 13, 2018
andreasnoack added a commit that referenced this pull request Jul 13, 2018
@@ -3,28 +3,41 @@
## Functions to compute the reduced shape

# for reductions that expand 0 dims to 1
reduced_index(i::OneTo) = OneTo(1)
reduced_index(i::Slice) = Slice(first(i):first(i))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like simply making this a UnitRange instead of a Slice will do the trick — make -C test reduce reduced offsetarray passes for me locally with that change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants