As the title says, I have reached a point where I specifically need to use CSR/CSC sparse matrices and perform multiplication with them. It will be part of the forward step of a neural network, so I need it to run in parallel on the GPU and perhaps also on the CPU, but the GPU is my main focus.
While I have been using PyTorch for a couple of months, I have never implemented a custom C++ extension before, and I am wondering if anyone could give me some guidance on what I should do and how involved this will probably end up being. I presume ATen does not support CSR/CSC matrices at this point; otherwise PyTorch itself would support them (my guess).
Would I need to implement a CSR/CSC class in PyTorch separately and then implement the operation with something like cuSPARSE, or would it all be done in one piece?
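To make the operation concrete, here is a minimal pure-Python sketch of what I mean by a CSR matrix times a dense matrix. It is only a reference for the data layout and the arithmetic, not the PyTorch or cuSPARSE implementation; the function name `csr_matmul` and the argument names `indptr`, `indices`, and `data` are my own illustrative choices (they follow the usual CSR convention of row pointers, column indices, and nonzero values).

```python
def csr_matmul(indptr, indices, data, dense):
    """Multiply a CSR matrix by a dense matrix given as a list of rows.

    indptr  : row pointers, length n_rows + 1 (hypothetical name)
    indices : column index of each stored nonzero
    data    : value of each stored nonzero
    dense   : dense right-hand matrix, one list per row
    """
    n_rows = len(indptr) - 1
    n_cols = len(dense[0]) if dense else 0
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Nonzeros of sparse row i live in data[indptr[i]:indptr[i + 1]].
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]      # column of this nonzero
            v = data[k]         # value of this nonzero
            row = dense[j]      # matching row of the dense matrix
            for c in range(n_cols):
                out[i][c] += v * row[c]
    return out


# Example sparse matrix A = [[1, 0, 2],
#                            [0, 0, 3],
#                            [4, 5, 0]] stored in CSR form.
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1.0, 2.0, 3.0, 4.0, 5.0]

B = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

C = csr_matmul(indptr, indices, data, B)
print(C)
```

Each output row depends only on one sparse row, so my assumption is that a GPU version would distribute the rows across threads or warps, which is the part I would hope a library such as cuSPARSE handles for me.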
I do not need autograd functionality or anything to do with learning; this will be a post-training operation.
In any case, thanks for the help and for making PyTorch a great platform to work with!