Description
This PR implements the AXPBY operation

$$y \leftarrow \alpha x + \beta y,$$

which extends the `axpy` operation by a second scaling factor, just like in `gemm` or `gemv`. This is required to reduce memory transfers in algorithms like the CG algorithm, where one step is

$$p_{k+1} = r_{k+1} + \beta_k p_k.$$
Until now, this has to be implemented as one `scal` and one `axpy` step. The introduction of the `axpby` routine allows $p_{k+1}$ to be read from and written to memory only once. In other iterative algorithms, like BiCGStab, the routine can be used as well.

The routine already exists, for example, in
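To illustrate the memory-traffic argument made in the description, here is a minimal pure-Python sketch of the CG search-direction update done both ways. The function names `scal`, `axpy`, and `axpby` are illustrative stand-ins that mirror the BLAS naming; they are not the interfaces added by this PR, and the vectors and the value of `beta` are made up for the example.

```python
def scal(alpha, x):
    """In-place x <- alpha * x (one read and one write of x)."""
    for i in range(len(x)):
        x[i] = alpha * x[i]


def axpy(alpha, x, y):
    """In-place y <- alpha * x + y (reads x, reads and writes y)."""
    for i in range(len(y)):
        y[i] = alpha * x[i] + y[i]


def axpby(alpha, x, beta, y):
    """In-place y <- alpha * x + beta * y in a single pass over y."""
    for i in range(len(y)):
        y[i] = alpha * x[i] + beta * y[i]


# Hypothetical data for the update p <- r + beta * p.
r = [1.0, 2.0, 3.0]
beta = 0.5

# Two-step variant: p is read and written twice.
p_two_pass = [4.0, 6.0, 8.0]
scal(beta, p_two_pass)        # first pass over p
axpy(1.0, r, p_two_pass)      # second pass over p

# Fused variant: p is read and written once.
p_fused = [4.0, 6.0, 8.0]
axpby(1.0, r, beta, p_fused)

print(p_two_pass)
print(p_fused)
```

Both variants produce the same vector. The difference is only in how often `p` travels through memory, which is what matters for a bandwidth-bound level-1 operation.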
Checklist