Skip to content

Fix some incorrect SLASET/DLASET calls (includes backport of openblas#2778)#438

Merged
julielangou merged 6 commits into
Reference-LAPACK:masterfrom
h-vetinari:backport_openblas_2778
Sep 8, 2020
Merged

Fix some incorrect SLASET/DLASET calls (includes backport of openblas#2778)#438
julielangou merged 6 commits into
Reference-LAPACK:masterfrom
h-vetinari:backport_openblas_2778

Conversation

@h-vetinari

@h-vetinari h-vetinari commented Sep 4, 2020

Copy link
Copy Markdown
Contributor

@serguei-patchkovskii observed some errors and suggested some fixes in #425 & #429 that were picked up by @martin-frbg in OpenMathLib/OpenBLAS#2778. Serguei said he won't have time to prepare a backport-PR, and it seems (after ~3 weeks) neither did Martin.

I have left the commits from OpenMathLib/OpenBLAS#2778 intact; just added Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com> to reflect Martin's comment

Relates to #318
Closes #425
Closes #429

Edit: after some more explanation by @serguei-patchkovskii & tips by @martin-frbg, I investigated a bit more and I believe this PR now additionally:
Closes #318

Reference-LAPACK issue 429

Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com>
Reference-LAPACK issue 425 (and 318)

Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com>
Reference-LAPACK issue 425 (and 318)

Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com>

@h-vetinari h-vetinari left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stumbled over a difference in the suggestion from @serguei-patchkovskii and the implementation from @martin-frbg.

PTAL

Comment thread TESTING/EIG/cchkhb2stg.f Outdated
Comment on lines +683 to +684
CALL SLASET( 'Full', N, 1, ZERO, ZERO, SD, 1 )
CALL SLASET( 'Full', N, 1, ZERO, ZERO, SE, 1 )

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I compare the diff with the OP from #429, @serguei-patchkovskii actually suggested the following

Suggested change
CALL SLASET( 'Full', N, 1, ZERO, ZERO, SD, 1 )
CALL SLASET( 'Full', N, 1, ZERO, ZERO, SE, 1 )
CALL SLASET( 'Full', N, 1, ZERO, ZERO, SD, N )
CALL SLASET( 'Full', N, 1, ZERO, ZERO, SE, N )

Was it intentional to implement this differently @martin-frbg?

Comment thread TESTING/EIG/cchkhb2stg.f Outdated
@martin-frbg

martin-frbg commented Sep 4, 2020

Copy link
Copy Markdown
Collaborator

Intentional, and also the reason why I did not commit the PR immediately, and/or copy it here - as far as I (and valgrind) can tell, specifying "1" instead of "N" is consistent with current usage in the test code, while for a general case serguei-patchkovskii's assessment seems correct.

@serguei-patchkovskii

serguei-patchkovskii commented Sep 5, 2020

Copy link
Copy Markdown

Intentional, and also the reason why I did not commit the PR immediately, and/or copy it here - as far as I (and valgrind) can tell, specifying "1" instead of "N" is consistent with current usage in the test code, while for a general case serguei-patchkovskii's assessment seems correct.

@martin-frbg is absolutely correct: this is the current usage in the code. This usage is contrary to the letter of the standard, and should be corrected. Let me explain in detail.

The relevant fragments of lapack/SRC/slaset.f are:

  SUBROUTINE SLASET( UPLO, M, N, ALPHA, BETA, A, LDA )

  REAL               A( LDA, * )
  •    Set the leading m-by-n submatrix to ALPHA.
    
  •    DO 60 J = 1, N
          DO 50 I = 1, M
             A( I, J ) = ALPHA
    
    50 CONTINUE
    60 CONTINUE

In the calling routine, we have the call: CALL SLASET( 'Full', N, 1, ZERO, ZERO, SE, 1 ), where SE is declared as a 1-dimensional array, with the at least SE(1:N) index range. In SLASET, we are trying to set A(1:N,1:1) = ZERO. This is not pretty, but as far as I can tell is allowed under the Fortran sequence association rules for Fortran77-style calls.

Unfortunately, the matrix A in this call is declared as A(1:1,*) - so accessing the first index beyond 1 is a bounds violation, and explicitly invokes undefined behaviour. Therefore, I would claim that all calls to SLASET/DLASET with the argument LDA<M are incorrect - even if they are the current usage. While it might seem to work, it is not hard to imagine a standard-conforming implementation, where it will generate incorrect code (e.g. an implementation which relies on the callee performing copy-in/copy-out for array arguments will produce unexpected results).

If I had to guess why it wasn't spotted and corrected so far, I would speculate that this is an archaism in many compilers, which goes back to FORTRAN-IV. Before Fortran77 (which as far as I recall introduced the A(*) array bound declaration), the commonly-used extension was to declare arrays with an unknown upper bound was to give them with the upper bound of 1. For this reason, many Fortran compilers still treat the upper bound of "1" same as "*" in bounds checking, and do not flag a bounds violation.

So this code might have been semi-OK under the 60-year-old language definition - and has been grandfathered by compilers ever since. Quite possibly because every self-respecting Fortran compiler must be able to compile LAPACK :P

@h-vetinari

Copy link
Copy Markdown
Contributor Author

@serguei-patchkovskii Thanks for the explanation!

These have been grandfathered in by compilers based on an ancient
version of the language definition, but would invoke undefined
behaviour per the standard. For more details, see the following:
Reference-LAPACK#438 (comment)

Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com>
@codecov

codecov Bot commented Sep 6, 2020

Copy link
Copy Markdown

Codecov Report

Merging #438 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #438   +/-   ##
=======================================
  Coverage   83.24%   83.24%           
=======================================
  Files        1808     1808           
  Lines      170197   170197           
=======================================
  Hits       141681   141681           
  Misses      28516    28516           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9b136e8...bffac66. Read the comment docs.

The last parameter must satisfy `LDA >= max(1,M)`, where M is the
number of rows of the matrix, see
https://github.com/Reference-LAPACK/lapack/blob/v3.9.0/SRC/slaset.f#L95

Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com>
The last parameter must satisfy `LDA >= max(1,M)`, where M is the
number of rows of the matrix, see
https://github.com/Reference-LAPACK/lapack/blob/v3.9.0/SRC/dlaset.f#L95

Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com>
@h-vetinari h-vetinari changed the title Backport openblas#2778 Fix some incorrect SLASET/DLASET calls (includes backport of openblas#2778) Sep 6, 2020
@h-vetinari

Copy link
Copy Markdown
Contributor Author

@serguei-patchkovskii
The requirement that you bring up is also explicitly noted in the docstrings of SLASET & DLASET.

I grepped around a bit and found some more occurrences, which I fixed in the last two commits. However, I make no claims of exhaustiveness (in particular, I didn't check all cases where the LDA argument didn't show up due to a linebreak)

@h-vetinari

Copy link
Copy Markdown
Contributor Author

Looking also at #318 in particular, one would be tempted to actually implement the check for LDA >= max(1,M) in SLASET/DLASET (and throw with XERBLA if violated). Would that make sense?

Also CC @langou, since #318 is marked "Priority: High"

@martin-frbg

Copy link
Copy Markdown
Collaborator

@h-vetinari From the original report in #318 it seems running a testsuite build with -fcheck=all would be sufficient (and more effective than grep) for catching all current offenders.

@h-vetinari

Copy link
Copy Markdown
Contributor Author

@martin-frbg, thanks for the tip (although I needed to find 33258cb to understand where exactly to add -fcheck=all).

In any case, I ran a build in conda-forge/lapack-feedstock#32 with the current patch & -fcheck=all, and while there were some errors, none of them come from SLASET/DLASET anymore (I had first run -fcheck=all without the patch, and the bounds violations for SLASET/DLASET were raising errors).

For the other errors that were uncovered, I opened #440.

@julielangou

Copy link
Copy Markdown
Contributor

Thanks again Martin!

@martin-frbg

martin-frbg commented Sep 8, 2020

Copy link
Copy Markdown
Collaborator

all glory to h-vetinari and serguei-patchkovskii 😄

@h-vetinari h-vetinari deleted the backport_openblas_2778 branch September 8, 2020 10:10
@h-vetinari

Copy link
Copy Markdown
Contributor Author

It was a team effort! 💪 🙃

christoph-conrads pushed a commit to christoph-conrads/lapack that referenced this pull request May 23, 2021
These have been grandfathered in by compilers based on an ancient
version of the language definition, but would invoke undefined
behaviour per the standard. For more details, see the following:
Reference-LAPACK#438 (comment)

Suggested-By: Serguei Patchkovskii <Serguei.Patchkovskii@gmail.com>
christoph-conrads pushed a commit to christoph-conrads/lapack that referenced this pull request May 23, 2021
…blas_2778

Fix some incorrect SLASET/DLASET calls (includes backport of openblas#2778)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect calls to DLASET in TESTING/EIG/cchkhb2stg.f Incorrect call to SLASET in schkst2stg.f out of the array bound in SRC/slaset.f

4 participants