-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
type-stable inner loop for sqrtm #20214
Changes from 3 commits
abe1129
a18f8da
a538250
5718a1c
60fcb28
ce830fe
a7e7ce8
075fd38
03ef1b4
7b990f1
e3cf0a9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1867,7 +1867,16 @@ function logm{T<:Union{Float64,Complex{Float64}}}(A0::UpperTriangular{T}) | |
end | ||
logm(A::LowerTriangular) = logm(A.').' | ||
|
||
function sqrtm{T}(A::UpperTriangular{T}) | ||
function floop(x,R,i::Int,j::Int) | ||
r = x | ||
@inbounds begin | ||
@simd for k = i+1:j-1 | ||
r -= R[i,k]*R[k,j] | ||
end | ||
end | ||
r | ||
end | ||
function sqrtm(A::UpperTriangular) | ||
n = checksquare(A) | ||
realmatrix = false | ||
if isreal(A) | ||
|
@@ -1879,19 +1888,20 @@ function sqrtm{T}(A::UpperTriangular{T}) | |
end | ||
end | ||
end | ||
sqrtm(A,Val{realmatrix}) | ||
end | ||
function sqrtm{T,realmatrix}(A::UpperTriangular{T},::Type{Val{realmatrix}}) | ||
if realmatrix | ||
TT = typeof(sqrt(zero(T))) | ||
else | ||
TT = typeof(sqrt(complex(-one(T)))) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. just use |
||
end | ||
n = checksquare(A) | ||
R = zeros(TT, n, n) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ~~My inclination would be to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, nevermind, I forgot that this is for the upper-triangular case. Then I guess initializing it to zero makes sense. |
||
for j = 1:n | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems like we should just have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can put There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done both 👍 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can just put the @inbounds for j = 1:n
# loop body
end |
||
R[j,j] = realmatrix?sqrt(A[j,j]):sqrt(complex(A[j,j])) | ||
R[j,j] = realmatrix ? sqrt(A[j,j]) : sqrt(complex(A[j,j])) | ||
for i = j-1:-1:1 | ||
r = A[i,j] | ||
for k = i+1:j-1 | ||
r -= R[i,k]*R[k,j] | ||
end | ||
r = floop(A[i,j],R,i,j) | ||
r==0 || (R[i,j] = r / (R[i,i] + R[j,j])) | ||
end | ||
end | ||
|
@@ -1900,14 +1910,10 @@ end | |
function sqrtm{T}(A::UnitUpperTriangular{T}) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems like we could replace this with sqrtm{T}(A::UnitUpperTriangular{T}) = sqrtm(A, Val{true}) in terms of the abovementioned method. Given a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will look at this tomorrow There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since
For some reason this is faster than the present method, which confuses me. This version involves two additional conversions and some unnecessary arithmetic operations in the loop (of order O(n) as you mention)! Are some ops on Will make this change later unless someone can enlighten me There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (Julia doesn't support inheritance of concrete types.) Indexing expressions Since by construction you only access the upper triangle, I would do: B = A.data at the beginning of the function (for both the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (There might be other methods operating on triangular matrix types that might benefit from a similar change.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The Timing such a small operation is tricky; I would normally recommend BenchmarkTools instead of julia> using BenchmarkTools
julia> A = rand(10,10); T = UpperTriangular(A);
julia> @btime $A[3,5]; @btime $T[3,5];
1.377 ns (0 allocations: 0 bytes)
1.650 ns (0 allocations: 0 bytes) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (It may be that the effect on the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After improving the type stability, the specialised implementation for
(what does the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How did you unwrap the type? I hope you didn't do I can't reproduce your benchmark results. Maybe the benchmarks aren't reliable for such a short operation, and one needs to put a bunch of indexing operations in a loop? (Probably you also want to time them with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For example: julia> function sumupper(A)
s = zero(eltype(A))
for j = 1:size(A,2), i = 1:j
s += A[i,j]
end
return s
end
sumupper (generic function with 1 method)
julia> A = rand(100,100); T = UpperTriangular(A);
julia> using BenchmarkTools
julia> sumupper(A) == sumupper(T)
true
julia> @btime sumupper($A); @btime sumupper($T);
4.122 μs (0 allocations: 0 bytes)
5.102 μs (0 allocations: 0 bytes) |
||
n = checksquare(A) | ||
TT = typeof(sqrt(zero(T))) | ||
R = zeros(TT, n, n) | ||
R = eye(TT, n, n) | ||
for j = 1:n | ||
R[j,j] = one(T) | ||
for i = j-1:-1:1 | ||
r = A[i,j] | ||
for k = i+1:j-1 | ||
r -= R[i,k]*R[k,j] | ||
end | ||
r = floop(A[i,j],R,i,j) | ||
r==0 || (R[i,j] = r / (R[i,i] + R[j,j])) | ||
end | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not type stable if
A[i,j]
andR[i,j]
are not the same type. A simple fix isr = x + zero(eltype(R))