`vcor` along a dimension seems significantly slower than `cor` for a `Float64` matrix #30

jishnub · 2023-05-24T09:43:33Z

julia> using Statistics, VectorizedStatistics, BenchmarkTools

julia> a = rand(50, 50);

julia> @btime cor($a, dims=1);
  16.877 μs (12 allocations: 40.72 KiB)

julia> @btime vcor($a, dims=1);
  2.495 ms (11734 allocations: 285.98 KiB)

This seems to be due to type-instabilities in _vcor (snippet from Cthulhu):

_vcor(X, dims, corrected, multithreaded::Static.StaticBool) @ VectorizedStatistics ~/.julia/packages/VectorizedStatistics/4lfYf/src/vcov.jl:143
143 function _vcor(X, dims, corrected, multithreaded::StaticBool)
144     Tₒ::Core.Const(Float64) = Base.promote_op(/, eltype(X), Int)
145     n = size(X, dims)
146     m = size(X, mod(dims,2)+1)
147     Ρ = similar(X, Tₒ, (m, m))
148     # Diagonal must be unity
149     @inbounds for i = 1:m::Union{Nothing, Tuple{Int64, Int64}}
150         Ρ[i,i] = one(Tₒ)
151     end
152     # Only two dimensions are possible, so handle each manually
153     if dims == 1
154         # Precalculate means and standard deviations
155         μ = ntuple(m)::Tuple{Vararg{Float64}} do d
156             _vmean(view(X,:,d), :, multithreaded)
157         end
158         σ::Tuple = ntuple(m)::Tuple do d
159             _vstd(μ[d], corrected, view(X,:,d), :, multithreaded)
160         end
161         # Fill off-diagonals symmetrically
162         @inbounds for i = 1:m::Union{Nothing, Tuple{Int64, Int64}}
163             for j = 1:i::Union{Nothing, Tuple{Int64, Int64}}
164                 σᵢⱼ::Any = _vcov(view(X,:,i), view(X,:,j), corrected, μ[i], μ[j], multithreaded)::Any
165                 Ρ[i,j] = Ρ[j,i] = (σᵢⱼ::Any / ((σ::Tuple[i]::Any * σ::Tuple[j]::Any)::Any))::Any
166             end
167         end
168     elseif dims == 2
169         # Precalculate means and standard deviations
170         μ = ntuple(m)::Tuple{Vararg{Float64}} do d
171             _vmean(view(X,d,:), :, multithreaded)
172         end
173         σ::Tuple = ntuple(m)::Tuple do d
174             _vstd(μ[d], corrected, view(X,d,:), :, multithreaded)
175         end
176         @inbounds for i = 1:m::Union{Nothing, Tuple{Int64, Int64}}
177             for j = 1:i-1::Union{Nothing, Tuple{Int64, Int64}}
178                 σᵢⱼ::Any = _vcov(view(X,i,:), view(X,j,:), corrected, μ[i], μ[j], multithreaded)::Any
179                 Ρ[i,j] = Ρ[j,i] = (σᵢⱼ::Any / ((σ::Tuple[i]::Any * σ::Tuple[j]::Any)::Any))::Any
180             end
181         end
182     else
183         throw("Dimension not in range")
184     end
185     return Ρ
186 end

The text was updated successfully, but these errors were encountered:

chriselrod · 2023-05-24T15:42:19Z

I don't really have the time to look at this example to comment, but would discourage a solution that involves unnecessary allocations.
Replacing tuples with vectors does not sound like a good idea.

jishnub · 2023-05-24T15:49:44Z

I'm not certain that I agree with this in this specific scenario, as the Tuples allocated here may be excessively large (as large as one dimension of the matrix, so may be 1000s of elements long).

chriselrod · 2023-05-24T21:06:11Z

Creating tuples that large definitely sounds wrong. I don't know the code -- why do we need a vector or tuple of axes like this?

brenhinkeller · 2023-05-24T22:13:32Z

It took me a while to remember what I was originally doing here -- it's not the dims tuple, it's tuples of intermediate results (the means and standard deviations of each row or column of the matrix).. I was trying to avoid an allocation but that was perhaps a mistake

jishnub mentioned this issue May 24, 2023

Improve type-stability in _vcor #31

Merged

brenhinkeller closed this as completed in #31 May 25, 2023

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`vcor` along a dimension seems significantly slower than `cor` for a `Float64` matrix #30

`vcor` along a dimension seems significantly slower than `cor` for a `Float64` matrix #30

jishnub commented May 24, 2023

chriselrod commented May 24, 2023 •

edited

Loading

jishnub commented May 24, 2023

chriselrod commented May 24, 2023 •

edited

Loading

brenhinkeller commented May 24, 2023 •

edited

Loading

vcor along a dimension seems significantly slower than cor for a Float64 matrix #30

vcor along a dimension seems significantly slower than cor for a Float64 matrix #30

Comments

jishnub commented May 24, 2023

chriselrod commented May 24, 2023 • edited Loading

jishnub commented May 24, 2023

chriselrod commented May 24, 2023 • edited Loading

brenhinkeller commented May 24, 2023 • edited Loading

`vcor` along a dimension seems significantly slower than `cor` for a `Float64` matrix #30

`vcor` along a dimension seems significantly slower than `cor` for a `Float64` matrix #30

chriselrod commented May 24, 2023 •

edited

Loading

chriselrod commented May 24, 2023 •

edited

Loading

brenhinkeller commented May 24, 2023 •

edited

Loading