Lack of implementation in current CUDA driver for sparse matrix multiplication (NYI CUDA sspaddmm Runtime Error)

I’m trying to multiply a sparse matrix with a dense matrix, but I’ve hit an error that I can’t find mentioned anywhere on the forums or GitHub. I suspect the function is simply not implemented in the CUDA toolkit or the torch version I’m using, but I’m not sure:

def ptr_loss(real: torch.Tensor, fake: torch.Tensor, laplacian_real: torch.Tensor, laplacian_fake: torch.Tensor):
    return real.smm(laplacian_fake).mm(real.T) + fake.smm(laplacian_real).mm(fake.T)

Here, real and fake are both dense tensors of size [1,65536], and laplacian_fake and laplacian_real are sparse (COO) tensors of size [65536,65536]. I’m attempting to use the torch.Tensor.smm function to multiply the dense and sparse matrices, but when execution reaches this point, I get a RuntimeError:

RuntimeError: NYI: CUDA sspaddmm is not implemented

I am using Torch 1.12.1 with CUDA 11.7 and driver version 516.94 on an RTX 3090. Torch was installed from Conda and is running on Windows 10.

Any ideas on how to fix this issue, or what the actual cause is?

Hi Tibor!

I can reproduce your issue on both pytorch 1.12.0 / cuda 11.6 and pytorch
1.14.0.dev20221014 / cuda 11.7.

This github issue suggests using torch.sparse.mm() as a replacement.
Note that torch.sparse.mm() returns a dense tensor (on both the cpu and
gpu), while torch.smm() returns a sparse tensor (on the cpu and fails on the
gpu). (For your use case I doubt that returning a dense tensor matters, as
you’ve already contracted over the large dimension.)
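As a quick check of the return type, here is a minimal sketch using a small stand-in matrix (the size 8 and the variable names are illustrative assumptions, not the tensors from the original post). It only shows that torch.sparse.mm() with a sparse COO first argument and a dense second argument yields a dense result:

```python
import torch

# Small stand-in size; the tensors in the post are [1, 65536] and [65536, 65536].
n = 8

# Build a tiny sparse COO matrix with four nonzero entries (illustrative values).
indices = torch.tensor([[0, 1, 2, 3],
                        [1, 2, 3, 0]])
values = torch.tensor([1.0, 2.0, 3.0, 4.0])
laplacian = torch.sparse_coo_tensor(indices, values, (n, n))

# Dense column vector as the second operand.
dense_col = torch.ones(n, 1)

# Sparse matrix first, dense matrix second.
out = torch.sparse.mm(laplacian, dense_col)

print(out.layout)  # layout of the result
print(out.shape)   # shape of the result
```

The same call should also work after moving both tensors to the GPU with .to("cuda"), which is the case that fails with .smm().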

Not that it changes the core issue, but your attempted use of .smm() is
incorrect in that real.smm (laplacian_fake) is attempting to left-multiply
a dense matrix onto a sparse matrix while .smm() is intended to left-multiply
a sparse matrix onto a dense matrix.

Why pytorch doesn’t have cuda support for .smm() I don’t know. However,
pytorch’s support for sparse tensors is in general incomplete and buggy.

Best.

K. Frank

Hello Frank, and thank you for the insight on the issue!

I have tried using torch.sparse.mm and I can confirm it has fixed the issue.

I want to point out, for others who encounter this, that torch.sparse.mm requires the first input to be the sparse matrix, and that the output is a dense matrix. You may need to use the associativity of matrix multiplication to regroup your operations so that the sparse matrix comes first: for example, (real · L) · realᵀ can be computed as real · (L · realᵀ).
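For anyone who wants something concrete, here is a rough sketch of how the loss from the original post could be regrouped so that the sparse matrix is always the first argument of torch.sparse.mm. This is not necessarily the exact code I ended up with: the helper random_sparse, the size n = 16, and the random test data are assumptions made up for the example, and the result is checked against a plain dense computation.

```python
import torch


def ptr_loss(real: torch.Tensor, fake: torch.Tensor,
             laplacian_real: torch.Tensor, laplacian_fake: torch.Tensor) -> torch.Tensor:
    # (real @ L) @ real.T is regrouped as real @ (L @ real.T), so the
    # sparse matrix is the first argument of torch.sparse.mm.
    term_real = real.mm(torch.sparse.mm(laplacian_fake, real.T))
    term_fake = fake.mm(torch.sparse.mm(laplacian_real, fake.T))
    return term_real + term_fake


def random_sparse(n: int, device: torch.device) -> torch.Tensor:
    # Hypothetical helper: a random matrix with roughly 80% zeros, stored as COO.
    dense = torch.rand(n, n, device=device)
    dense = dense * (dense > 0.8)
    return dense.to_sparse()


torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

n = 16  # small stand-in for 65536
real = torch.rand(1, n, device=device)
fake = torch.rand(1, n, device=device)
laplacian_real = random_sparse(n, device)
laplacian_fake = random_sparse(n, device)

loss = ptr_loss(real, fake, laplacian_real, laplacian_fake)

# Dense reference: the same expression with ordinary matrix products.
reference = (real @ laplacian_fake.to_dense() @ real.T
             + fake @ laplacian_real.to_dense() @ fake.T)

print(loss.shape)
print(torch.allclose(loss, reference, atol=1e-5))
```

Regrouping this way also keeps the intermediate result small: L · realᵀ is a dense [n, 1] column, so nothing of size [n, n] is ever materialized densely.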