Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

note on getf2 (unblocked LU factorization) #4174

Closed
angsch opened this issue Jul 31, 2023 · 5 comments
Closed

note on getf2 (unblocked LU factorization) #4174

angsch opened this issue Jul 31, 2023 · 5 comments

Comments

@angsch
Copy link
Contributor

angsch commented Jul 31, 2023

https://github.com/xianyi/OpenBLAS/blob/09131f79a6801b9e7ff9dfacdbcbb5acd0a7c86d/lapack/getf2/getf2_k.c#L97-L101
I find the if (jp>m) statement confusing and think that line 98 should be removed.

  • IAMAX_K returns a value between 0 and m-j-1 (inclusive). Hence, jp = j + IAMAX_K(m - j, b + j, 1); evaluates to jp = m-1 as the largest index. In other words, the condition of the if is not reachable.
  • If if (jp>m) was ever executed, then setting jp = m would result in accessing an illegal address in line 101, temp1 = b[m]. b is the column pointer with m rows, ie b[m-1] is the largest valid index.
@martin-frbg
Copy link
Collaborator

git blame provides the context - #723 , papering over a bug in the iamax kernel that produced NaNs and other nasties (at the time only ?), and xianyi possibly taking my suggestion too literally.

@martin-frbg
Copy link
Collaborator

Obviously you're right about setting jp=m not being much saner, I guess it is just a bit safer to go one element beyond than to grab unrelated memory further away. My impulse reaction would be to "correct" that to j=m-1 and push the skeleton back into the closet for now. (In the 0.2.1x timeframe, most x86_64 assembly kernels were not saving and restoring registers correctly, among other things... well worth revisiting, but I'd rather get 0.3.24 released first)

@angsch
Copy link
Contributor Author

angsch commented Jul 31, 2023

Thanks, Martin, for digging out the old issue. As far as I am aware, the current behaviour of IAMAX in reference-LAPACK is flawed in the presence of NaN (if the vector contains NaNs, the index of the first NaN should be returned, which is not the case right now). This will likely be fixed as part of the ongoing Inf and NaN propagation effort.

Perhaps an assert/hard-coded check (jp >= j && jp < m) for a valid index + a comment why this line is there is a better choice to catch future bugs.

@martin-frbg
Copy link
Collaborator

assert is a bit drastic for a low-level library, and I won't promise that OpenBLAS will comply with the reference implementation's present or future handling of NaNs. I'm pretty sure every implementation handles this differently, so IAMAX with NaNs is essentially invoking undefined behaviour.

@martin-frbg
Copy link
Collaborator

revisiting this, I come to the conclusion that (1) jp can indeed legally be m (start of loop on j, j=0, IAMAX(m-0,...) ). As jp is decremented before the temp1 assignment, the index of b in that line stays within limits. (2) a complete fix (that survives even the pathological case of an all-NaN array) would require all IAMAX kernels to return the index of the first NaN in the file, and even then it would seem sensible to keep the test in case a newly added kernel had the same flaw

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants