Add vectorized "in" (.∈) and "notin" (.∉) #12406
When you benchmark it is better to put code in a function. It avoids spurious results from evaluations in global scope. Compare:

x = collect(1:1000);
@time for i in 1:1000 Bool[(xx ∈ 1:100) for xx in x] end
# 0.306228 seconds (5.96 M allocations: 152.985 MB, 6.32% gc time)

function f()
    x = collect(1:1000)
    for i in 1:1000 Bool[(xx ∈ 1:100) for xx in x] end
end
@time f()
# 0.002487 seconds (1.01 k allocations: 1.045 MB)
@KristofferC Wow, that made a huge difference!

function bench_comprehension(n::Int)
    x = collect(1:1000);
    for i in 1:n Bool[(xx ∈ 1:100) for xx in x] end
end
@time bench_comprehension(10^6)
# 2.858 seconds (1000 k allocations: 1038 MB, 7.97% gc time)

function bench_functor(n::Int)
    x = collect(1:1000);
    for i in 1:n x .∈ 1:100 end
end
@time bench_functor(10^6)
# 3.460 seconds (1000 k allocations: 1038 MB, 6.39% gc time)
@@ -380,6 +380,10 @@ const ∈ = in
∋(itr, x)= ∈(x, itr)
∌(itr, x)=!∋(itr, x)

# vectorized ∈ and ∉
(.∈){T}(x::AbstractArray{T}, set) = [xx ∈ set for xx in x]
(.∉){T}(x::AbstractArray{T}, set) = [xx ∉ set for xx in x]
These should produce BitArrays; this should also do broadcasting the way other vectorized ops do.
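For illustration, here is a minimal sketch of the behaviour this review comment asks for. It assumes Julia 1.x dot-broadcast syntax, which postdates this PR, and is not the code under review. Broadcasting a Bool-valued function yields a `BitArray`, and wrapping the set in `Ref` makes broadcasting treat it as a single value.

```julia
# Assumes Julia 1.x; this is not the implementation proposed in this PR.
x = collect(1:5)
set = 1:3

# Ref(set) keeps the set from being iterated element by element during broadcasting.
mask = x .∈ Ref(set)       # same as in.(x, Ref(set))
notmask = x .∉ Ref(set)

# The result is a BitVector rather than an Array{Bool}.
@assert mask isa BitVector
```

A mask like this can be used directly for indexing, e.g. `x[mask]`.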
@StefanKarpinski It definitely makes sense, but I need some guidance to make it right. Unlike the common broadcasting case (e.g. Ideally, one would like to have Actually, even the proposed PR cannot discriminate "vector of elements vs single set" and "set vs single collection of sets" cases.
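One reading of the two cases named above can be sketched as follows. The variable names are hypothetical, and the code assumes Julia 1.x broadcasting with `Ref`, which is later syntax and not part of this PR. The same left-hand vector gives different answers depending on whether it is treated as a list of items or as one set.

```julia
# Assumes Julia 1.x; `Ref` marks an argument as a scalar for broadcasting.
elements = [1, 2]
set = 1:5
collection = [[1, 2], [3, 4]]    # a collection whose members are themselves sets

# Reading 1: `elements` is a vector of items, each tested against one set.
per_element = in.(elements, Ref(set))

# Reading 2: the same vector is a single set, tested as a member of the collection.
as_whole = in(elements, collection)

# Going elementwise against the collection asks a different question:
# is the number 1 (or 2) itself a member of `collection`?
per_element_in_collection = in.(elements, Ref(collection))
```

A method that dispatches only on `x::AbstractArray` has no way to tell reading 1 from reading 2, so the caller has to state the intent explicitly.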
I'm not a big fan of this. It's a bit confusing to vectorize operations that are already collection operations, and I'd rather not encourage more special-case vector notation (#8450).
@JeffBezanson If #8450 would lead to some new syntax for an easy vectorization, it would be fantastic.
So the reaction to
See old issue filed at #5212.
Add vectorized "in" (.∈) and "not in" (.∉) operators to be on par with R. This is quite handy in combination with DataFrames for dataset filtering.