New algorithms for computing Givens rotations #631
Codecov Report
@@ Coverage Diff @@
## master #631 +/- ##
=======================================
Coverage 0.00% 0.00%
=======================================
Files 1894 1894
Lines 184021 184035 +14
=======================================
- Misses 184021 184035 +14
SRC/clartg.f90
Outdated
   c = (1 / sqrt( one + g2/f2 )) * w
else
   c = ( f2*p )*w
end if
c = ( f2*p )*w
@weslleyspereira: I think you want to remove line 236.
Yes. I forgot it. Thanks!
From Jim: Do we know if the new clartg is consistent with what we proposed in section 2.3.5 of our exception handling document? Might be nice to avoid having yet another version in the future (or at least know what we need to change).
Let us wait on feedback from Sergey though.
…eira/lapack into fix-precision-in-clartgf90-2
I have just updated the description (first comment) of this PR. @langou and I are preparing a report with the numerical analysis and experiments comparing the different algorithms for computing Givens rotations. We will share the document here when it is ready to review.
And here is the report: https://arxiv.org/abs/2211.04010.
This incorporates modifications introduced in LAPACK 3.11.0 in Reference-LAPACK/lapack#631
Closes #629
New algorithms for computing Givens rotations
@sergey-v-kuznetsov highlighted in #629 that the new Givens rotation routines may be less accurate than the ones that were in LAPACK up to release 3.9. This was verified by applying several rotations to an initial unitary matrix. This PR proposes:
Both modifications target the improvement of the output's accuracy.
@langou and I are preparing a report with the numerical analysis and experiments comparing the different algorithms for computing Givens rotations. We will share the document here when it is ready to review.
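The accuracy check mentioned in this description (apply many rotations to an initial unitary matrix and see how far the product drifts from unitary) can be sketched as follows. This is a minimal Python illustration, not the test used in #629: the helper names `givens`, `apply_left` and `unitarity_error`, the 2x2 size, the random inputs, and the textbook unscaled formula for the rotation are all assumptions made for the example.

```python
import math
import random

def givens(f, g):
    """Textbook complex Givens rotation without scaling (assumes f != 0).

    Returns real c and complex s, r such that
        [  c        s ] [f]   [r]
        [ -conj(s)  c ] [g] = [0]
    """
    f2 = f.real * f.real + f.imag * f.imag
    g2 = g.real * g.real + g.imag * g.imag
    h2 = f2 + g2
    c = math.sqrt(f2 / h2)
    r = f / c
    s = g.conjugate() * (r / h2)
    return c, s, r

def apply_left(q, c, s):
    """Return G @ q for the 2x2 rotation G = [[c, s], [-conj(s), c]]."""
    top = [c * a + s * b for a, b in zip(q[0], q[1])]
    bottom = [-s.conjugate() * a + c * b for a, b in zip(q[0], q[1])]
    return [top, bottom]

def unitarity_error(q):
    """Largest entry of |Q^H Q - I| for a 2x2 complex matrix Q."""
    err = 0.0
    for i in range(2):
        for j in range(2):
            dot = sum(q[k][i].conjugate() * q[k][j] for k in range(2))
            err = max(err, abs(dot - (1.0 if i == j else 0.0)))
    return err

random.seed(0)
q = [[1 + 0j, 0j], [0j, 1 + 0j]]  # start from the identity, which is unitary
for _ in range(10_000):
    f = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    g = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    if f == 0:
        continue  # the unscaled formula above does not cover f = 0
    c, s, _ = givens(f, g)
    q = apply_left(q, c, s)

drift = unitarity_error(q)
print(f"unitarity error after the rotations: {drift:.3e}")
```

Swapping a different formula into `givens` and comparing the printed drift is one way to compare the accuracy of two rotation algorithms on the same input sequence.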
Minor modifications
- Use `rtmin = sqrt( safmin )` instead of `rtmin = sqrt( safmin / epsilon )`. The first condition is sufficient to guarantee that all real variables used in the intermediate steps of the new algorithm belong to the interval `[safmin,safmax]`.
- Set `rtmax` to either `sqrt( safmax/4 )`, `sqrt( safmax/2 )` or `sqrt( safmax )`, depending on where it is used in the algorithm. Each value is the largest one for which all real variables used in the intermediate steps of the new algorithm belong to the interval `[safmin,safmax]`.
- Remove the intermediate operations `p = one / d`, `uu = one / u` and `vv = one / v`. These operations reduce the number of divisions in the code at the cost of possibly increasing the accumulated rounding error. I am trying to improve accuracy, so I removed them at the cost of additional floating-point divisions.
- When `f = 0`, check if `real(g) == 0` or `aimag(g) == 0` to avoid an unnecessary `ABSSQ( g ); sqrt( g2 )`. This change reduces the accumulated error when `real(g) == 0` (analogously `aimag(g) == 0`) and `aimag(g)**2` (analogously `real(g)**2`) cannot be stored in the respective finite precision. We choose not to use the intrinsic complex `abs` because its implementation is compiler-dependent.
Major changes
The algorithm for computing complex Givens rotations was revisited. This is the new code in `(c,z)ROTG` and `(c,z)LARTG` for the unscaled part:
`[safmin,safmax]`
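The minor modifications listed earlier can be sanity-checked in double precision. The Python sketch below is not the Fortran code from this PR. It assumes that `safmin` is the smallest normal double and that `safmax = 1/safmin`, and the helper names `abssq` and `rel_error` are made up for the illustration. It checks three things: that inputs between `sqrt( safmin )` and `sqrt( safmax/4 )` keep the squared moduli inside `[safmin,safmax]`, that dividing directly is never less accurate than multiplying by a precomputed reciprocal, and that the `real(g) == 0` shortcut avoids squaring a component whose square does not fit in the format (the naive formula here is used without any scaling).

```python
import math
import random
import sys
from fractions import Fraction

# Assumed constants (double precision): smallest normal number and its reciprocal.
safmin = sys.float_info.min
safmax = 1.0 / safmin
eps = sys.float_info.epsilon

def abssq(z):
    """Squared modulus of a complex number, computed without complex abs."""
    return z.real * z.real + z.imag * z.imag

# 1) Thresholds: components strictly between rtmin and rtmax keep
#    f2, g2 and h2 = f2 + g2 inside [safmin, safmax].
rtmin = math.sqrt(safmin)
rtmax = math.sqrt(safmax / 4)
big = rtmax * (1.0 - eps)      # just below rtmax
small = rtmin * (1.0 + eps)    # just above rtmin
h2_big = abssq(complex(big, big)) + abssq(complex(big, big))
f2_small = abssq(complex(small, 0.0))

# 2) Division versus multiplication by a stored reciprocal.
def rel_error(approx, exact):
    """Relative error of a float against an exact rational value."""
    return abs(Fraction(approx) - exact) / abs(exact)

random.seed(1)
worse = 0   # cases where x * (1/d) is strictly less accurate than x / d
better = 0  # cases where x * (1/d) is strictly more accurate than x / d
for _ in range(1000):
    x = random.uniform(1.0, 2.0)
    d = random.uniform(1.0, 2.0)
    exact = Fraction(x) / Fraction(d)
    p = 1.0 / d  # the intermediate reciprocal
    err_recip = rel_error(x * p, exact)
    err_div = rel_error(x / d, exact)
    if err_recip > err_div:
        worse += 1
    elif err_recip < err_div:
        better += 1

# 3) f = 0 and real(g) == 0: take |aimag(g)| directly instead of
#    squaring it and taking the square root.
g = complex(0.0, 1e200)
naive = math.sqrt(abssq(g))   # 1e200 * 1e200 overflows before the sqrt
direct = abs(g.imag)          # shortcut: r = |aimag(g)|

print(f"h2 at the upper threshold: {h2_big:.3e} (safmax = {safmax:.3e})")
print(f"reciprocal less accurate in {worse} cases, more accurate in {better}")
print(f"naive r = {naive}, direct r = {direct}")
```

The second check relies on the fact that a correctly rounded division returns the float nearest to the exact quotient, so the two-step reciprocal form can match it but cannot beat it.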
Acknowledgements
Thanks to the people who contributed to the discussions about this code:
Checklist