
Implement xGEMMTR and cblas_xGEMMTR #887

Merged: 23 commits, Jun 28, 2024

Conversation

@grisuthedragon (Contributor) commented Jul 24, 2023

Description

The xGEMMT routine implements the GEMM operation, but only the lower or the upper triangle of C is accessed and updated. The routine is provided by many optimized BLAS libraries but not by the reference implementation. Since projects like MUMPS already use it, it should be provided by the reference implementation as well. This enables projects that use this function to run in environments that allow exchanging the BLAS library, such as update-alternatives on Debian/Ubuntu or FlexiBLAS.

The xGEMMT function is provided, for example, by MKL, OpenBLAS, ArmPL, ESSL, and BLIS.

The PR includes subroutines, the tests, and the CBLAS interface.

EDIT: After the discussion, the routine is now called xGEMMTR.
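
As a quick illustration, here is a minimal sketch of a call to the double-precision routine. The argument order (UPLO, TRANSA, TRANSB, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC, i.e., GEMM with an extra UPLO argument and with M = N) is my reading of this PR; the sources in BLAS/SRC are authoritative.

  program demo_dgemmtr
     implicit none
     integer, parameter :: n = 4, k = 2
     double precision :: a(n,k), b(k,n), c(n,n)
     integer :: i, j
     external dgemmtr

     call random_number(a)
     b = transpose(a)      ! makes A*B symmetric by construction
     c = 0.0d0
     ! Lower triangle of C := 1.0*A*B + 0.0*C; the strictly upper
     ! triangle of C is neither referenced nor updated.
     call dgemmtr('L', 'N', 'N', n, k, 1.0d0, a, n, b, k, 0.0d0, c, n)
     do i = 1, n
        print '(*(f9.4))', (c(i,j), j = 1, i)
     end do
  end program demo_dgemmtr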

Checklist

  • The documentation has been updated.
  • CBLAS interface added.
  • Tests for BLAS and CBLAS added.

@codecov (bot) commented Jul 24, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (69992ad) to head (f90c7ae).
Report is 8 commits behind head on master.

Current head f90c7ae differs from pull request most recent head c57c156

Please upload reports for the commit c57c156 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff            @@
##           master     #887     +/-   ##
=========================================
  Coverage    0.00%    0.00%             
=========================================
  Files        1937     1918     -19     
  Lines      190566   188614   -1952     
=========================================
+ Misses     190566   188614   -1952     

☔ View full report in Codecov by Sentry.

@langou (Contributor) commented Jul 24, 2023

Thanks Martin.

Two parts: (1) overall in favor; (2) a rant against this specific GEMMT routine. To help me: who knows of usages of GEMMT?

(1) Oh my. Adding a routine to the reference Level 3 BLAS? I am overall in favor, since GEMMT already exists in many BLAS implementations. And I am happy to see the Level 3 BLAS evolve and adapt.

(2-1) I must say that overall I am not a big fan of this GEMMT routine. I wonder why people need it. It seems popular enough. In many libraries. (Thanks for the survey.) But why do users need it? What are some examples of usage?

(2-2) I would rather see a routine like C <- alpha * A * D * A^T + beta * C where C is n-by-n symmetric in input and D is m-by-m symmetric in input too. (And so A is n-by-m.) That would be useful. In the BLAS Technical Forum Standard, they had BLAS_xSY_TRIDIAG_RK where, in addition of D being symmetric, the matrix D was symmetric tridiagonal. This BLAS_xSY_TRIDIAG_RK routine would be useful too.

(2-3) I wonder whether users need C <- alpha * A * D * A^T + beta * C and then they first perform GEMM with B <= 1. * A * D + 0. * B and then GEMMT with C <- alpha * A * B + beta * C. Well, if that's the case, it would be really, in my opinion, much better to use C <- alpha * A * D * A^T + beta * C.

(2-4) If we use C <- alpha * A * D * A^T + beta * C, we can reduce data movement from A and A^T by "combining" the data movement of A and A^T as shown in
https://doi.org/10.1145/3490148.3538587
https://doi.org/10.1109/SC41404.2022.00034
https://doi.org/10.1145/3558481.3591072
as opposed to using a GEMM followed by a GEMMT.

(2-5) Also I do not like the fact that the symmetry of A*B^T relies on what A and B are. Argh. I do not like this. When we look at the interface C <- alpha * A * D * A^T + beta * C, C in input is 100% symmetric because we assume the upper part from the lower part, D in input is 100% symmetric because we assume the upper part from the lower part. So the output C is 100% symmetric. There is no arguing. C in output is symmetric. When we look at C <- alpha * A * B + beta * C. Argh. The symmetry of output C relies on the "quality" of A and the "quality" of B and the "quality" of A*B. I think there is more uncertainty on whether A*B^T or B^T * A is the same. It should be the same. Is it? With the GEMMT interface, it is far from clear which one you should use. Argh. I think this is error prone, and not good for stability. I do not have good examples. But I do not like it.

(2-6) Speaking of C <- alpha * A * D * A^T + beta * C, I am not sure what a reference algorithm would be. Mmm. Is there a "more stable way" to perform A * D * A^T?

(2-7) One argument that I can see for GEMMT is that the user has a fast way to right-multiply the symmetric matrix D by A. So, the user wants to do C <- alpha * A * D * A^T + beta * C, but they have a fast way to compute B <- A * D. So then yes, GEMMT can be useful.
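
To make the two-call pattern in (2-3)/(2-7) concrete, here is a hedged sketch; the helper name sy_update is hypothetical, and the dgemmtr argument order is assumed to be GEMM-like with an extra UPLO.

  ! Lower triangle of C := alpha*A*D*A^T + beta*C, with A n-by-m and
  ! D m-by-m symmetric (lower triangle stored). WORK is an n-by-m buffer.
  subroutine sy_update(n, m, alpha, a, d, beta, c, work)
     implicit none
     integer, intent(in) :: n, m
     double precision, intent(in) :: alpha, beta, a(n,m), d(m,m)
     double precision, intent(inout) :: c(n,n)
     double precision, intent(out) :: work(n,m)
     external dsymm, dgemmtr

     ! B = A*D; DSYMM exploits the symmetry of D (a plain DGEMM works too)
     call dsymm('R', 'L', n, m, 1.0d0, d, m, a, n, 0.0d0, work, n)
     ! Lower triangle of C := alpha*B*A^T + beta*C
     call dgemmtr('L', 'N', 'T', n, m, alpha, work, n, a, n, beta, c, n)
  end subroutine sy_update

The WORK buffer is exactly the allocation issue raised below for a fused C <- alpha * A * D * A^T + beta * C routine with a general D.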

@langou (Contributor) commented Jul 24, 2023

And let me add: the name, GEMMT. I understand that it is institutionalized in several BLAS libraries. Fine with me. But I find GEMMT a very cryptic name. I would not have guessed the functionality from the GEMMT name.

langou previously approved these changes Jul 24, 2023

@martin-frbg (Collaborator)

I share the doubts about adding industry-standard extensions to what is (or used to be?) seen as the reference implementation of something that was largely agreed on by working groups decades ago. Though to some extent that would hold for Reference-LAPACK as well: is it the classical reference if old algorithms get upgraded and new ones added, or is it the new and improved UCDenver LAPACK?

@grisuthedragon (Contributor, Author)

> (2-1) I must say that overall I am not a big fan of this GEMMT routine. I wonder why people need it. It seems popular enough. In many libraries. (Thanks for the survey.) But why do users need it? What are some examples of usage?

At least one library using it is MUMPS 5.6.1, in src/{s,c,d,z}fac_front_aux.F. A short look shows that they use it in an LDL^T factorization to update the L matrices. Thus, this seems to be the C <- alpha * A * D * A^T + beta * C approach.

> (2-2) I would rather see a routine like C <- alpha * A * D * A^T + beta * C where C is n-by-n symmetric in input and D is m-by-m symmetric in input too. (And so A is n-by-m.) That would be useful. In the BLAS Technical Forum Standard, they had BLAS_xSY_TRIDIAG_RK where, in addition of D being symmetric, the matrix D was symmetric tridiagonal. This BLAS_xSY_TRIDIAG_RK routine would be useful too.

IMHO, only allowing D to be symmetric tridiagonal is too large a restriction.

> (2-3) I wonder whether users need C <- alpha * A * D * A^T + beta * C and then they first perform GEMM with B <= 1. * A * D + 0. * B and then GEMMT with C <- alpha * A * B + beta * C. Well, if that's the case, it would be really, in my opinion, much better to use C <- alpha * A * D * A^T + beta * C.

For most application cases I can think of, the combined variant would be better. For example, when solving a Lyapunov equation with the Bartels-Stewart algorithm and the Schur decomposition, one ends up with this type of update.

> (2-4) If we use C <- alpha * A * D * A^T + beta * C, we can reduce data movement from A and A^T by "combining" the data movement of A and A^T as shown in https://doi.org/10.1145/3490148.3538587 https://doi.org/10.1109/SC41404.2022.00034 https://doi.org/10.1145/3558481.3591072 as opposed to using a GEMM followed by a GEMMT.

There is also an implementation of this functionality in https://github.com/SLICOT/SLICOT-Reference/blob/main/src/MB01RU.f, which is used in the case I mentioned in (2-3). But that approach is horribly slow.

> (2-5) Also I do not like the fact that the symmetry of A*B^T relies on what A and B are. Argh. I do not like this. When we look at the interface C <- alpha * A * D * A^T + beta * C, C in input is 100% symmetric because we assume the upper part from the lower part, D in input is 100% symmetric because we assume the upper part from the lower part. So the output C is 100% symmetric. There is no arguing. C in output is symmetric. When we look at C <- alpha * A * B + beta * C. Argh. The symmetry of output C relies on the "quality" of A and the "quality" of B and the "quality" of A*B. I think there is more uncertainty on whether A*B^T or B^T * A is the same. It should be the same. Is it? With the GEMMT interface, it is far from clear which one you should use. Argh. I think this is error prone, and not good for stability. I do not have good examples. But I do not like it.

> (2-6) Speaking of C <- alpha * A * D * A^T + beta * C, I am not sure what a reference algorithm would be. Mmm. Is there a "more stable way" to perform A * D * A^T?

Again, I could mention https://github.com/SLICOT/SLICOT-Reference/blob/main/src/MB01RU.f at this point.

> (2-7) One argument that I can see for GEMMT is that the user has a fast way to right-multiply the symmetric matrix D by A. So, the user wants to do C <- alpha * A * D * A^T + beta * C, but they have a fast way to compute B <- A * D. So then yes, GEMMT can be useful.

I implemented the routine for the sake of completeness, since many of the optimized libraries provide it, not because of a particular meaningful usage, which is truly a medium-to-big problem as you pointed out. On the other hand, ignoring the BLAS-like extensions that are common to almost all the optimized libraries is not a good way either. First, if the reference implementation provides a routine, there is a bigger chance that the optimized versions behave the same way. Second, while developing on top of BLAS and LAPACK, many developers (at least the ones I know) switch back to the reference implementation for debugging purposes. Once a BLAS-like extension is used, this requires either code changes or a build system that checks for the availability of the extension. Third, if applications/libraries use BLAS-like extensions, they can easily break under library-exchange systems like update-alternatives.

@grisuthedragon (Contributor, Author)

> I share the doubts about adding industry-standard extensions to what is (or used to be?) seen as the reference implementation of something that was largely agreed on by working groups decades ago. Though to some extent that would hold for Reference-LAPACK as well: is it the classical reference if old algorithms get upgraded and new ones added, or is it the new and improved UCDenver LAPACK?

Even though the BLAS standard is now very old, one can slowly adjust it a bit to match the users' changed requirements (as long as no old functionality gets disturbed). Furthermore, there are some bad designs in the reference interface as well that can only be fixed by adding new routines, like the missing second scalar in daxpy, which is covered by daxpby in many BLAS implementations and is also not yet in the reference.

@vladimir-ch (Contributor)

Would it be thinkable that the reference implements C <- alpha * A * D * A^T + beta * C under a non-cryptic name? It would apparently cover a majority of the use cases of GEMMT, and by being the reference it is in the privileged position to show how things should be done properly.

@langou (Contributor) commented Jul 25, 2023

One major issue with C <- alpha * A * D * A^T + beta * C is that we would want A to be input only.

And if D is tridiagonal, that's easy enough to do. If D is a general m-by-m matrix, that's not going to work out without allocation. So that's probably the reason for the two alternatives: (1) GEMMT, and (2) BLAS_xSY_TRIDIAG_RK.

I had forgotten this "detail".

@robert-mijakovic

Off the top of my head, I can say that GEMMT is used by ABINIT and possibly other computational chemistry, molecular modelling, and materials science packages. I can have a look into other packages that use GEMMT if necessary.

@robert-mijakovic

Two more packages: PETSc and VASP.

@robert-mijakovic

After some searching, I came up with the following packages, which depend on MUMPS, PETSc, or ABINIT and could possibly benefit from GEMMT.

  1. MUMPS dependent packages: Trilinos, Goma, Palace, xSDK, MFEM, CEED, FrontISTR, Ipopt, HPDDM, FEniCS, z-PARES, Akantu, Tandem, Elmer FEM, Camelia, Clp (Coin-or linear programming), FreeFEM++, Cbc (Coin-or branch and cut), Kwant, OpenSees, TELEMAC-MASCARET, TOY2DAC

  2. PETSc dependent packages: PISM, DAMASK-grid, Sundials, Ratel, SLEPc, AMReX, ExaGO, Sowing, Akantu, libMesh, FreeFEM, openCARP, Alquimia, DAMASK-mesh, PHIST, DFT-FE, ELSI, AMP, libpressio, STEPS (STochastic Engine for Pathway Simulation), FEniCS-DOLFINx, SfePy, Gmsh

  3. ABINIT dependent packages: Yambo, BigDFT

@ilayn (Contributor) commented Jul 28, 2023

> Would it be thinkable that the reference implements C <- alpha * A * D * A^T + beta * C under a non-cryptic name?

I was about to type this a few days ago but got distracted by work. xGEMMT indeed reads like matmul(m, transpose(m)). I am also from the control theory side, and I know why SLICOT uses such routines extensively: control algorithms typically love working with quadratic forms.

However, if this function is going to be implemented in the reference, it should actually bring some significant benefit beyond implicit 2x DGEMM calls, if possible, in my opinion.

@langou (Contributor) commented Jul 28, 2023

> I was about to type this a few days ago but got distracted by work. xGEMMT indeed reads like matmul(m, transpose(m)).

(1) I find that the fact that the input/output matrix C is symmetric is a significant omission in the GEMMT interface name; I was not expecting any symmetry with a name such as GEMMT. (2) "GEMMT indeed reads like matmul(m, transpose(m))": I think I understand what you mean, but (2a) if it were matmul(m, transpose(m)), that would be a SYRK; (2b) if it were matmul(m, transpose(m)), we would have only one argument M in the interface and not an A and a B; (2c) GEMMT enables four variants: matmul(a,b), matmul(a, transpose(b)), matmul(transpose(a), b), and matmul(transpose(a), transpose(b)). All in all, I understand that the intent is that it is something like matmul(m, transpose(m)).

In any case, I think at this point the name of the subroutine is set. The reason to include the routine is that it is widely used. Changing the name now would defeat the reason for including it.

@langou (Contributor) commented Jul 28, 2023

> However, if this function is going to be implemented in the reference, it should actually bring some significant benefit beyond implicit 2x DGEMM calls, if possible, in my opinion.

I think by "this function" you mean C <- alpha * A * D * A^T + beta * C for a general D, in which case I strongly agree with you.

Advantage of BLAS_xSY_TRIDIAG_RK over GEMM + GEMMT: If D is tridiagonal, I think this C <- alpha * A * D * A^T + beta * C can bring some benefit over GEMM + GEMMT by (1) reducing data movement of A, (2) avoiding a B buffer to store A * D.

Advantage of GEMMT over GEMM: GEMMT is half the FLOPs of GEMM (roughly n^2*k flops to update one triangle of the n-by-n result instead of 2*n^2*k for the full product).

@ilayn (Contributor) commented Jul 29, 2023

> I think by "this function" you mean C <- alpha * A * D * A^T + beta * C for a general D, in which case I strongly agree with you.

Sorry for my terse style; I was typing quicker than I could make sense. Indeed, that's what I meant.

> I find that the fact that the input/output matrix C is symmetric is a significant omission in the GEMMT interface name,

Indeed. And I fully agree with your following points.

> Changing the name now would defeat the reason for including it.

I think we can live with a name change, because there is no precedent for it and, like you mentioned, the naming is already not optimal. LAPACK names are already cryptic enough, so in my seriously humble opinion there is no reason to carry on the legacy of unofficial sources or suboptimal decisions made elsewhere.

There is enough legacy within LAPACK to live with as it is. Anyone who needs to switch to the reference implementation of this "GEMMT" will need to modify code anyway. Thus I think this gives us the freedom to correct the naming. But then again, that's just my opinion.

Functionality-wise, similar routines might be requested, such as the triangular-triangular multiplication that comes up in Sylvester/Schur-related algorithms. Hence I'm not sure halving the matmul FLOPs is a killer feature. But again, these are all personal takes; if folks are using it, apparently they need it.

@langou (Contributor) commented Jul 31, 2023

> Functionality-wise, similar routines might be requested, such as the triangular-triangular multiplication that comes up in Sylvester/Schur-related algorithms. Hence I'm not sure halving the matmul FLOPs is a killer feature. But again, these are all personal takes; if folks are using it, apparently they need it.

I agree that "similar routines might be requested, such as triangular-triangular multiplication". We also had needs for similar "weird" kernels in <T>LAPACK. We were thinking of giving names like matmult_AtrBtrCtr, where the shape of each matrix is encoded in the interface name (here tr stands for triangular, for example). "I'm not sure halving the matmul flops is a killer feature": FLOPs is one argument, but there are also arguments for (1) less storage, (2) in-place vs. out-of-place, and (3) data access.

@rileyjmurray

Joining the thread because I'm interested in this functionality.

I'm working on an iterative linear system solver with @PTNobel based on a block Krylov subspace method. This method (and many like it) needs to compute small square matrices of the form P' G P, where G is PSD. The context is that we'll have already computed G P for other reasons in the algorithm. So we'd benefit from a function to compute P' Q, where Q is a general matrix.

I have a feature request for the hypothetical function that exploits the known symmetry of P' Q. In the uplo argument, it would be great to allow a "general" flag so that only one triangle is computed in terms of dot products, and the result is then copied to the remaining triangle. This ensures explicit symmetry of the output matrix. We find in our applications that computing M = P' Q with GEMM results in a matrix that is symmetric only up to a normwise relative error of 5e-16 in double precision, but this difference is apparently enough for one of chol(M, "lower") and chol(M, "upper") to fail when the other succeeds.

(Please excuse the lazy notation above w.r.t. chol.)
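
For what it's worth, a hypothetical sketch of the requested "general" behavior, computing one triangle and then mirroring it (the helper name symm_crossprod is made up, and the dgemmtr argument order is assumed to be GEMM-like with an extra UPLO):

  ! C := P^T * Q with only the lower triangle computed via dot
  ! products, then mirrored so the output is exactly symmetric.
  subroutine symm_crossprod(n, k, p, q, c)
     implicit none
     integer, intent(in) :: n, k
     double precision, intent(in) :: p(k,n), q(k,n)
     double precision, intent(out) :: c(n,n)
     integer :: i, j
     external dgemmtr

     call dgemmtr('L', 'T', 'N', n, k, 1.0d0, p, k, q, k, 0.0d0, c, n)
     do j = 1, n - 1
        do i = j + 1, n
           c(j,i) = c(i,j)    ! copy the lower triangle to the upper
        end do
     end do
  end subroutine symm_crossprod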

[Inline review comment on BLAS/SRC/zgemmt.f (outdated, resolved)]
@mgates3 (Contributor) commented Aug 10, 2023

gemmt was introduced in Intel MKL, circa 2015.

In cuBLAS, similar functionality is called syrkx and herkx, and it predated the MKL gemmt function (I am not sure exactly which version of CUDA introduced it).

Likewise in rocBLAS / hipBLAS, they are syrkx and herkx.

herk does

C = A A^H + C

for a Hermitian matrix C, computing only the lower or upper triangle.
herkx is an extended version of herk that does

C = A B^H + C

but again computes only the lower or upper triangle. In general, A B^H would be non-Hermitian, but herkx is useful when the user can guarantee that the result is Hermitian. As discussed above, a typical use case is

C = B D B^H, thus A = B D

where D is Hermitian. A specific instance of that is when D is diagonal, e.g., generating a Hermitian matrix from its EVD or computing the backward error of an eigenvalue problem,

A - Q Lambda Q^H.

As a completely different application, it is also useful to compute a triangular matrix, instead of a Hermitian matrix, when the user knows that one of the triangles will be zero. This could be used in larft to generate the T matrix for a block Householder reflector. (The input matrices are still rectangular, not triangular, we just have extra knowledge that the output is triangular.)

I agree the t in gemmt doesn't fit into the BLAS naming scheme at all. Does it mean transpose? Does it mean triangular? I personally like the herkx name, as it signifies that the output matrix is Hermitian, although that doesn't fit the rather odd use case above where the output matrix is triangular, not Hermitian. gemmt is also a bit more general than herkx in that it can specify both transA and transB, whereas herkx has only a single trans.

Also consider if we had this routine in other formats like packed or RFP, obviously the gemmt name doesn't generalize well, whereas hprkx (packed) or hfrkx (RFP) would make sense.
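
As a concrete instance of the diagonal-D case, here is a sketch in real arithmetic (the helper name evd_reconstruct is made up, and the dgemmtr argument order is assumed as elsewhere in this thread). Since B = Q * diag(lambda) is just a column scaling, no extra GEMM is needed before the triangular update.

  ! Lower triangle of A := Q * diag(lambda) * Q^T from an eigendecomposition.
  subroutine evd_reconstruct(n, q, lambda, a, work)
     implicit none
     integer, intent(in) :: n
     double precision, intent(in) :: q(n,n), lambda(n)
     double precision, intent(out) :: a(n,n), work(n,n)
     integer :: j
     external dgemmtr

     do j = 1, n
        work(:,j) = q(:,j) * lambda(j)   ! B = Q * diag(lambda)
     end do
     call dgemmtr('L', 'N', 'T', n, n, 1.0d0, work, n, q, n, 0.0d0, a, n)
  end subroutine evd_reconstruct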

@mgates3 (Contributor) commented Aug 11, 2023

Another use case is in the polar decomposition. The QDWH iteration converges to a unitary matrix U, and then computes the Hermitian factor H = U^H A. Again, we have special knowledge that U^H A is Hermitian. This is distinct from the use cases of A B A^H with B Hermitian.
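
In real arithmetic this final step is a single triangular-update call; a sketch under the same assumed dgemmtr interface as above (the wrapper name polar_h_factor is made up):

  ! Symmetric polar factor after QDWH convergence: lower triangle of
  ! H := U^T * A, where U is the orthogonal polar factor of A.
  subroutine polar_h_factor(n, u, a, h)
     implicit none
     integer, intent(in) :: n
     double precision, intent(in) :: u(n,n), a(n,n)
     double precision, intent(out) :: h(n,n)
     external dgemmtr

     call dgemmtr('L', 'T', 'N', n, n, 1.0d0, u, n, a, n, 0.0d0, h, n)
  end subroutine polar_h_factor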

@luszczek suggests herkmm instead of herkx.

@langou (Contributor) commented Aug 12, 2023

@mgates3 very nice application! Yes, indeed, if for a given matrix A you know U, the unitary polar factor of A, then you can recover the Hermitian polar factor by computing U^H * A. And indeed the QDWH iteration needs GEMMT: it converges to U, the unitary polar factor of A, and "after convergence" it computes the Hermitian factor as U^H * A, which is exactly GEMMT. This is a very nice application; thanks for bringing it up.

@langou (Contributor) commented Aug 12, 2023

We need to speak about names.

The PR named the routine GEMMT.

The names HERKX and GEMMT are already used for this functionality in some BLAS libraries.

Then HERKMM is kind of nice and was suggested by @luszczek.

Then I suggested something like HEGEGEMM.

But @mgates3 mentioned that this can be used for triangular matrix too, so maybe we could call this TRGEGEMM.

If you have a strong opinion, write it below.

@mgates3 (Contributor) commented Aug 14, 2023

The triangular application is rather weird and probably rare. How often is A*B a triangular matrix? So I wouldn't name it based on triangular. Interpreting the output as Hermitian (or symmetric) is much more common.

I'm not sure why you're proposing repeating the ge in hegegemm. Is it to specify the type of each matrix A, B, C? We've never done that before, and if we did, it's unclear what the order should be: hegege or gegehe (since C is Hermitian).

hegemm would be okay.

I prefer some variant of herk because that's what it most looks like: the output matrix C is Hermitian, whereas in hemm the input matrix A or B is Hermitian.

@weslleyspereira (Collaborator)

Hi @grisuthedragon. Would you be able to merge to/rebase with the current master? The script for the GitHub Actions needed to be updated. Thanks!

@grisuthedragon (Contributor, Author)

> Hi @grisuthedragon. Would you be able to merge to/rebase with the current master? The script for the GitHub Actions needed to be updated. Thanks!

Done.

@grisuthedragon (Contributor, Author)

So finally, it seems that only the naming is the problem. I see that, from the point of view of the typical BLAS naming scheme, the GEMMT name is confusing, so I have no problem renaming it.

As far as I gather from the discussion, there are the following options:

Already in use:

  • xGEMMT, bad from the naming-scheme point of view, but already in MKL, OpenBLAS, ArmPL, ESSL, BLIS
  • xSYRKX/xHERKX, a good name from the BLAS point of view, already in cuBLAS and rocBLAS

Potential names by functionality:

  • xSYGEMM/xHEGEMM
  • xSYRKMM/xHERKMM
  • xSYGEGEMM/xHEGEGEMM

Since the functionality already exists in some implementations, I would go for one of the existing names.

@grisuthedragon changed the title from "Implement xGEMMT and cblas_xGEMMT" to "Implement xGEMMTR and cblas_xGEMMTR" on Jun 24, 2024
langou previously approved these changes Jun 24, 2024

@langou (Contributor) commented Jun 24, 2024

Everyone, please review this commit. Comments welcome. Please comment by Thursday June 27th.

[Inline review comments on BLAS/SRC/cgemmtr.f, dgemmtr.f, sgemmtr.f, zgemmtr.f (outdated, resolved)]
@mgates3 (Contributor) left a comment


Just the minor Doxygen group fix. Otherwise it seems fine.

updated:
 * BLAS/SRC/sgemmtr.f
 * BLAS/SRC/zgemmtr.f
 * BLAS/SRC/cgemmtr.f
 * BLAS/SRC/dgemmtr.f
@grisuthedragon (Contributor, Author)

I changed and added the doxygen group.

@langou merged commit 694e337 into Reference-LAPACK:master on Jun 28, 2024
12 checks passed