add count(itr) and throw an error in count if non-boolean values are encountered #20421

Merged (2 commits), Feb 3, 2017
4 changes: 4 additions & 0 deletions NEWS.md
@@ -161,6 +161,8 @@ This section lists changes that do not have deprecation warnings.
that internally uses twice-precision arithmetic. These two
outcomes exhibit differences in both precision and speed.

* The `count` function no longer sums non-boolean values ([#20404])

Library improvements
--------------------

@@ -231,6 +233,8 @@ Library improvements

* Additional methods for `ones` and `zeros` functions to support the same signature as the `similar` function ([#19635]).

* `count` now has a `count(itr)` method equivalent to `count(identity, itr)` ([#20403]).

* Methods for `map` and `filter` with `Nullable` arguments have been
implemented; the semantics are as if the `Nullable` were a container with
zero or one elements ([#16961]).
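The two `count` entries in this NEWS diff can be illustrated with a minimal sketch. `strict_count` is a hypothetical stand-in written for this example, not the code in Base; it assumes the behavior the entries describe, where a non-`Bool` result raises a `TypeError` instead of being summed.

```julia
# Hypothetical stand-in for the generic method; the name is an assumption.
function strict_count(pred, itr)
    n = 0
    for x in itr
        # The type assertion throws a TypeError if pred(x) is not a Bool.
        n += pred(x)::Bool
    end
    return n
end

# One-argument form: count the `true` elements of a boolean collection.
strict_count(itr) = strict_count(identity, itr)

evens = strict_count(iseven, 1:6)            # 3
trues = strict_count([true, false, true])    # 2

# Integer elements are rejected rather than added up.
rejected = try
    strict_count([1, 0, 1])
    false
catch err
    isa(err, TypeError)
end
```

Here `rejected` ends up `true`, because the first element `1` fails the `::Bool` assertion.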
1 change: 1 addition & 0 deletions base/bitarray.jl
@@ -1575,6 +1575,7 @@ function countnz(B::BitArray)
end
return n
end
count(B::BitArray) = countnz(B)

# returns the index of the next non-zero element, or 0 if all zeros
function findnext(B::BitArray, start::Integer)
16 changes: 15 additions & 1 deletion base/reduce.jl
@@ -642,21 +642,35 @@ end

"""
count(p, itr) -> Integer
count(itr) -> Integer

Count the number of elements in `itr` for which predicate `p` returns `true`.
If `p` is omitted, counts the number of `true` elements in `itr` (which
should be a collection of boolean values).

```jldoctest
julia> count(i->(4<=i<=6), [2,3,4,5,6])
3

julia> count([true, false, true, true])
3
```
"""
function count(pred, itr)
    n = 0
    for x in itr
-       n += pred(x)
+       n += pred(x)::Bool
    end
    return n
end
function count(pred, a::AbstractArray)
    n = 0
    for i in eachindex(a)
        @inbounds n += pred(a[i])::Bool
Member:
Is this safe for arbitrary `pred`s and `AbstractArray`s? Could, for example, `pred` resize `a::Vector` as a side effect, causing issues downstream in the iteration? Best!

stevengj (Member, Author), Feb 3, 2017:
We seem to use `@inbounds` for all of the other mapreduce functions, so they are all assuming that the mapped function does not resize the array. Maybe we should reconsider that elsewhere, but I don't think that `count` should be the exception here.

Member:
Hm, interesting. On the one hand, in `base/reduce.jl`, `_mapreduce`, `mapreduce_impl`, `mapfoldl_impl`, and `sum_kbn` use `@inbounds`. On the other hand, `mapfoldr_impl`, `any`, `all`, `contains`, `extrema`, and the existing implementation of `count` do not.

Member:
Sounds like a reasonable assumption to me. Better turn on `@inbounds` everywhere possible.

Member:
Ref. related discussion #19925 (review). Perhaps consistency of `@inbounds` decoration in reductions and similar functions deserves a dedicated issue? Best!

Member:
@JeffBezanson seemed to imply that `@inbounds` was only OK for `Array`, for which we know the index is valid. Indeed, for a custom `AbstractArray`, a buggy implementation could lead to crashes with `@inbounds`. We should definitely have a policy about this.

Member:
We have a policy: only use `@inbounds` when you can be certain, from local information, that all accesses are in bounds.

Member:
So that means the signature needs to be changed to `Array`? We cannot be certain that `eachindex(a)` is correct for any custom type.

Member (Author):
Maybe another PR should go through and fix occurrences of `@inbounds` in Base for `AbstractArray`?

It would be nice to have a way to turn this on for `Array` without requiring two methods. See also #15291

Member:
I've opened #20469 so that we don't forget about this.
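The concern raised at the start of this thread can be made concrete with a small sketch. It assumes a hypothetical `checked_count` that mirrors the array method above but keeps bounds checking on, and a hypothetical predicate `shrinking_pred` that shortens the vector as a side effect. With the checks left in, the stale index is reported as a `BoundsError`; under `@inbounds` the same access would go unchecked.

```julia
# Hypothetical bounds-checked variant of the array method (no @inbounds).
function checked_count(pred, a::AbstractArray)
    n = 0
    for i in eachindex(a)     # the index range is fixed when the loop starts
        n += pred(a[i])::Bool # bounds-checked access
    end
    return n
end

v = collect(1:10)

# Hypothetical predicate with a side effect: it shrinks `v` down to 5 elements.
function shrinking_pred(x)
    if length(v) > 5
        pop!(v)
    end
    return iseven(x)
end

# Once `v` has 5 elements, the loop still asks for index 6.
caught = try
    checked_count(shrinking_pred, v)
    false
catch err
    isa(err, BoundsError)
end
```

In this sketch `caught` is `true` and `v` is left with five elements, which shows why the thread treats a non-resizing predicate as an assumption rather than a guarantee.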

    end
    return n
end
count(itr) = count(identity, itr)

"""
countnz(A) -> Integer
1 change: 1 addition & 0 deletions base/sparse/sparsematrix.jl
@@ -44,6 +44,7 @@ julia> nnz(A)
"""
nnz(S::SparseMatrixCSC) = Int(S.colptr[end]-1)
countnz(S::SparseMatrixCSC) = countnz(S.nzval)
count(S::SparseMatrixCSC) = count(S.nzval)

"""
nonzeros(A)
1 change: 1 addition & 0 deletions base/sparse/sparsevector.jl
@@ -31,6 +31,7 @@ length(x::SparseVector) = x.n
size(x::SparseVector) = (x.n,)
nnz(x::SparseVector) = length(x.nzval)
countnz(x::SparseVector) = countnz(x.nzval)
count(x::SparseVector) = count(x.nzval)
nonzeros(x::SparseVector) = x.nzval
nonzeroinds(x::SparseVector) = x.nzind

14 changes: 12 additions & 2 deletions test/reduce.jl
@@ -260,8 +260,18 @@ immutable SomeFunctor end

# count & countnz

-@test count(x->x>0, Int[]) == 0
-@test count(x->x>0, -3:5) == 5
+@test count(x->x>0, Int[]) == count(Bool[]) == 0
+@test count(x->x>0, -3:5) == count((-3:5) .> 0) == 5
@test count([true, true, false, true]) == count(BitVector([true, true, false, true])) == 3
@test_throws TypeError count(sqrt, [1])
@test_throws TypeError count([1])
let itr = (x for x in 1:10 if x < 7)
    @test count(iseven, itr) == 3
    @test_throws TypeError count(itr)
    @test_throws TypeError count(sqrt, itr)
end
@test count(iseven(x) for x in 1:10 if x < 7) == 3
@test count(iseven(x) for x in 1:10 if x < -7) == 0

@test countnz(Int[]) == 0
@test countnz(Int[0]) == 0
1 change: 1 addition & 0 deletions test/sparse/sparse.jl
@@ -806,6 +806,7 @@ end
FI = Array(I)
@test sparse(FS[FI]) == S[I] == S[FI]
@test sum(S[FI]) + sum(S[!FI]) == sum(S)
@test countnz(I) == count(I)

sumS1 = sum(S)
sumFI = sum(S[FI])
2 changes: 2 additions & 0 deletions test/sparse/sparsevector.jl
@@ -26,6 +26,8 @@ let x = spv_x1
@test nonzeros(x) == [1.25, -0.75, 3.5]
end

@test count(SparseVector(8, [2, 5, 6], [true,false,true])) == 2

# full

for (x, xf) in [(spv_x1, x1_full)]