Performance regression from findall depwarn suggestion
#26183
Comments
The difference here is that findall(testf::Function, A) = collect(first(p) for p in pairs(A) if testf(last(p))). We may want to add some heuristics here, perhaps a [...] |
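To make the definition quoted above easier to experiment with, here is a minimal sketch that reproduces it under a hypothetical name and compares it with a mask-based variant. Both `findall_generic` and `findall_mask` are names made up for this illustration and are not part of Base. The mask variant is only one possible alternative and is not necessarily what the maintainers had in mind.

```julia
# The generator-based definition quoted above, under a hypothetical name.
# The output size is not known up front, so the result grows as matches arrive.
findall_generic(testf, A) = collect(first(p) for p in pairs(A) if testf(last(p)))

# Hypothetical alternative: materialize a Bool mask first, then call the
# findall method for Bool arrays. This costs an extra array the size of A.
findall_mask(testf, A) = findall(map(testf, A))

A = [0.0 1.5; 2.0 0.0]

# Both variants should agree with Base's findall on the same input.
findall_generic(!iszero, A) == findall(!iszero, A)
findall_mask(!iszero, A) == findall(!iszero, A)
```

The trade-off is roughly the one discussed in this thread: the mask trades extra memory for a result whose size can be determined before it is allocated.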
It's very hard to decide in full generality what the density of true values is going to be. It could make sense to add a [...]
Adding a counting pass could be efficient for functions known to be cheap, like [...]
There is also a chance that the generic implementation could be made faster by improving [...]
I'm surprised by the timings you posted. On 0.7 the first two approaches take about the same time (though the number of allocations is lower with [...] |
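The counting-pass idea mentioned above could look roughly like the sketch below. `findall_twopass` is a hypothetical name, and the sketch assumes `testf` is pure and cheap, since it gets evaluated twice per element.

```julia
# Hypothetical two-pass variant: count matches, allocate once, then fill.
# Assumes testf is deterministic and side-effect free (it runs twice per element).
function findall_twopass(testf, A::AbstractArray)
    n = count(testf, A)                      # first pass: number of matches
    out = Vector{eltype(keys(A))}(undef, n)  # exact-size output, no regrowth
    i = 0
    for (k, v) in pairs(A)                   # second pass: record matching keys
        if testf(v)
            out[i += 1] = k
        end
    end
    return out
end

findall_twopass(!iszero, [0, 3, 0, 5]) == findall(!iszero, [0, 3, 0, 5])
```

Whether this wins depends on how expensive `testf` is, which is why it would only make sense for predicates known to be cheap.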
If you sample a random set of points in the matrix then you do actually get an accurate estimate regardless of matrix structure. Doing that would require evaluating points out of order, however. |
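A rough sketch of that sampling idea, for illustration only: `estimate_density` and its `nsamples` keyword are hypothetical and not an existing Base function. The estimate is statistical, so its error depends on the number of samples rather than on how the matches are laid out in the array.

```julia
using Random

# Hypothetical helper: estimate the fraction of elements satisfying testf
# by evaluating it at nsamples randomly chosen positions (with replacement).
function estimate_density(testf, A::AbstractArray;
                          nsamples::Int = 1_000, rng = Random.default_rng())
    isempty(A) && return 0.0
    hits = 0
    for _ in 1:nsamples
        idx = rand(rng, eachindex(A))  # random position, i.e. out-of-order access
        hits += testf(A[idx])          # a Bool adds as 0 or 1
    end
    return hits / nsamples
end

density = estimate_density(!iszero, rand(100, 100))
```

As noted above, the random access pattern is the cost: the predicate is no longer evaluated in iteration order, which matters for predicates with side effects and for cache behaviour.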
I don't see the reported performance problem anymore, but there seems to be a regression from 1.7 to master:
# on Julia 1.7.1
julia> using BenchmarkTools  # provides @btime
julia> A = rand(10_000, 10_000);
julia> @btime findall(x->x != 0, A);
476.016 ms (8 allocations: 1.50 GiB)
julia> @btime findall(!iszero, A);
470.884 ms (8 allocations: 1.50 GiB)
julia> @btime findall((!iszero).(A));
469.952 ms (9 allocations: 1.50 GiB)
and
# on master, Version 1.8.0-DEV.1526 (2022-02-13)
julia> A = rand(10_000, 10_000);
julia> @btime findall(x->x != 0, A);
505.472 ms (8 allocations: 1.50 GiB)
julia> @btime findall(!iszero, A);
499.844 ms (8 allocations: 1.50 GiB)
julia> @btime findall((!iszero).(A));
497.787 ms (9 allocations: 1.50 GiB) |
Duplicate of #42187 |
I realize this is the older issue, but this function has changed several times since, so I think the newer issue is the more accurate one.
Prompted by this deprecation warning:
I decided to do an (admittedly worst-case) test:
A = rand(10_000, 10_000);
Had I taken the recommended approach, I would've run into a pretty bad performance regression. As it stands, the fastest approach is among the worst in memory usage; the recommended approach with the anonymous function is slow but uses the least memory; and the negated function is worst in both.
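The original measurements are not preserved here, so the following is only a sketch of how such a comparison could be rerun with Base's `@elapsed` and `@allocated`. The helper `compare_findall` is hypothetical, and the three variants are taken from the benchmarks posted in the comments, which may not be exactly the ones originally measured.

```julia
# Hypothetical helper: time and measure allocations for three findall variants.
function compare_findall(A)
    variants = (
        "anonymous function" => () -> findall(x -> x != 0, A),
        "negated function"   => () -> findall(!iszero, A),
        "broadcast mask"     => () -> findall((!iszero).(A)),
    )
    return map(variants) do (name, f)
        f()                        # warm up so compilation is not measured
        seconds = @elapsed f()     # wall-clock time of one run
        bytes = @allocated f()     # bytes allocated by one run
        (name = name, seconds = seconds, bytes = bytes)
    end
end

# Smaller than the 10_000 x 10_000 worst case, to keep the sketch quick.
results = compare_findall(rand(1_000, 1_000))
for r in results
    println(rpad(r.name, 20), round(r.seconds * 1e3; digits = 2), " ms, ", r.bytes, " bytes")
end
```

A single `@elapsed` run is noisy; for numbers worth reporting, BenchmarkTools' `@btime` as used in the comments is the better tool.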