I have a setup with dense matrices $A, B$ and a 2:4 sparsity mask $M$. Is there an API in torchao where I can perform $A (B \odot M)^T$ and get the speedups from 2:4 GEMMs? That is, instead of having a pre-sparsified matrix $B$, I want to apply $M$ to $B$ online and then do the sparse GEMM.
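For concreteness, here is a small CPU-only dense reference of the target computation $A (B \odot M)^T$, with $M$ derived from $B$ by keeping the two largest-magnitude entries in every group of four along the reduction dim (one common way to build a 2:4 mask; this is just an illustration, not torchao code):

```python
import torch

torch.manual_seed(0)
m, k, n = 4, 8, 4
A = torch.randn(m, k)
B = torch.randn(n, k)

# Build a 2:4 mask M: in each group of 4 along the last dim of B,
# keep the 2 largest-magnitude entries.
groups = B.abs().view(n, k // 4, 4)
idx = groups.topk(2, dim=-1).indices
M = torch.zeros_like(groups, dtype=torch.bool)
M.scatter_(-1, idx, True)
M = M.view(n, k)

# The target computation, here in plain dense form.
y = A @ (B * M).T

# Sanity check: every 4-wide group keeps exactly 2 entries.
assert M.view(-1, 4).sum(-1).eq(2).all()
```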
sm = torch.sparse.to_sparse_semi_structured(B * M)
y = torch.mm(sm, A.T).T
This sometimes works, but sometimes I get NotImplementedError: `SparseSemiStructuredTensorCUSPARSELT` matmul: operation is not supported
raised from
[rank7]: File "/home/alberttseng/miniconda3/lib/python3.12/site-packages/torch/sparse/_semi_structured_ops.py", line 122, in semi_sparse_mm
[rank7]: res = A._mm(B_padded)
Do you have a script to repro @tsengalb99? I wouldn't expect a transient error here; I wonder what it could be.
In any case, I doubt your approach would be faster except on very large matrices, since it pays the cost of packing into the 2:4 format on every call. To be faster, you'd probably have to do something like outlined here. Note that this doesn't use `to_sparse_semi_structured` and uses `torch._sparse_semi_structured_apply` instead.
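For reference, the online sparsify-then-mm pattern from the snippet above can be wrapped with a dense fallback. This is only a sketch, not a torchao API: `sparse_24_mm` is a hypothetical helper, and the CUDA branch still pays the per-call packing overhead discussed above:

```python
import torch

def sparse_24_mm(A, B, M):
    # Compute A @ (B * M).T. On CUDA with fp16/bf16 inputs, pack (B * M)
    # into the 2:4 semi-structured format so the matmul can use sparse
    # tensor cores; note the packing happens online on every call.
    Bm = B * M
    if Bm.is_cuda and Bm.dtype in (torch.float16, torch.bfloat16):
        from torch.sparse import to_sparse_semi_structured
        sm = to_sparse_semi_structured(Bm)
        # The sparse operand goes first, so compute (sm @ A.T).T.
        return torch.mm(sm, A.T).T
    # Dense fallback (CPU, or unsupported dtype).
    return A @ Bm.T
```

On small matrices the dense fallback will usually win, which matches the point above that online packing only pays off at scale.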