I’m about to start adding batched sparse QR factorizations with cusolver to PyTorch. What’s the best way to handle a 3D sparse batched tensor (a batch of 2D matrices)? The batched QR factorization and solves also require that the tensors have the same sparsity pattern across the batch. I’d like to propose adding support for 3D sparse batch tensors with the same sparsity pattern using an interface like this.

nBatch = 2; nx = 4
i = torch.LongTensor([range(nx), range(nx)])   # one sparsity pattern shared across the batch
v = torch.randn(nBatch, nx)                    # one value per batch element for each nonzero
sz = torch.Size([nBatch, nx, nx])
A = torch.sparse.FloatTensor(i, v, sz)         # proposed: batch dim first, then sparse dims

What do you all think? For the sake of getting the batched QR factorizations and solves working, I’m going to pass dense tensors around without using the sparse wrapper. Eventually (ideally soon), I’d like for these functions to take sparse tensors as input.

Here’s the github issue for the factorizations for reference:

A note about my experience using cusolver on sparse matrices.
I implemented a wrapper around cusolver to solve sparse linear systems from PyTorch, and I have to say I was disappointed with the performance on very large, very sparse matrices. Using the sparse linear solver from scipy led to faster runtimes in my case.
I’m not sure about the performance of the QR decomposition alone, but the linear-system solver used QR factorization internally, if I remember correctly.

I’m not sure what the best course of action here is, but let me explain how the current sparse tensor representation works.

The current sparse representation (http://pytorch.org/docs/sparse.html) supports hybrid sparse tensors, where you can say that the first n dimensions are sparse, and the rest are dense; e.g., if you have a 3D tensor which only specifies a few 2D matrices in the stack. But this isn’t really what you are looking for: if I understand correctly, what you want is the first dimension to be dense, and then the second and third dimensions sparse (but with the same sparsity pattern.)
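To make the existing hybrid layout concrete, here is a minimal sketch of a 3D tensor where the first dimension is sparse and the remaining two are dense, so only a few 2D matrices in the stack are actually stored. (This uses the current `torch.sparse_coo_tensor` constructor; the thread’s era used `torch.sparse.FloatTensor`, but the layout is the same.)

```python
import torch

# Hybrid sparse tensor: first dim sparse, last two dims dense.
# Only matrices 0 and 3 in a stack of 5 are specified.
i = torch.tensor([[0, 3]])                    # indices over the one sparse dim
v = torch.randn(2, 4, 4)                      # each stored value is a dense 4x4 matrix
A = torch.sparse_coo_tensor(i, v, (5, 4, 4))  # sparse_dim == 1, dense_dim == 2
```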

One thing you could do is transpose your input so that the sparse dimensions are at the beginning (so you have a 3D tensor which is a sparse 2D “matrix” whose entries are vectors, each entry corresponding to one of the entries from the batched 2D matrix.) This doesn’t seem completely awful to me, but I don’t have a well-developed aesthetic sense for this sort of thing yet.
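A sketch of that transposed layout, again with the current constructor: the two sparse dimensions come first, so the tensor is a sparse nx-by-nx “matrix” whose entries are length-nBatch vectors, with the same sparsity pattern shared by every matrix in the batch.

```python
import torch

# Batch of matrices with a shared sparsity pattern, stored with the
# batch dimension last: a sparse 2D "matrix" of length-nBatch vectors.
nBatch, nx = 2, 4
i = torch.tensor([list(range(nx)), list(range(nx))])  # shared diagonal pattern
v = torch.randn(nx, nBatch)                           # one batch vector per nonzero
A = torch.sparse_coo_tensor(i, v, (nx, nx, nBatch))   # sparse_dim == 2, dense_dim == 1
```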

Hi,
I am curious whether it is still the case that the scipy sparse solver is the better option.
Does that mean that if I want to work with a sparse matrix in my own Function, I should first convert the tensor to a scipy sparse matrix, and then convert the result back to a torch tensor?

The main reason I am asking is that I am looking for a way to do something like the dense-matrix case with btrifact() and btrisolve().
I am writing a layer that solves several linear systems, with different sparsity patterns across them.
Is there any suggestion, or anything I can do to help build up the relevant functions?
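For reference, the dense batched pattern mentioned above looks like this in the current API (btrifact/btrisolve were later renamed; newer releases expose torch.linalg.lu_factor / lu_solve). This is a sketch of the dense case only, not the sparse variant being requested.

```python
import torch

# Batched dense LU: factor each matrix once, then reuse the
# factorization for solves (the btrifact/btrisolve workflow).
A = torch.randn(3, 4, 4, dtype=torch.float64) + 4 * torch.eye(4, dtype=torch.float64)
b = torch.randn(3, 4, 1, dtype=torch.float64)
LU, pivots = torch.linalg.lu_factor(A)    # factor the whole batch
x = torch.linalg.lu_solve(LU, pivots, b)  # batched solve
```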

It might depend on your sparsity factor and the size of your matrices; for my use case, scipy was better.

I’d say the easiest is definitely to start with scipy. Converting dense Tensors to numpy arrays is free, so depending on how you store your data, you could do the torch -> numpy sparse conversion very efficiently.
Then, if you find it’s a bottleneck, I’d look for alternatives, but my experience solving linear systems with cusolver was not the best.
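A minimal sketch of that round trip: for dense CPU tensors, .numpy() shares memory, so the main cost is building the scipy sparse structure. (The tiny diagonal system here is just a stand-in for a large sparse one.)

```python
import torch
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve

# torch -> scipy sparse -> solve -> torch round trip.
A = torch.eye(4, dtype=torch.float64) * 2.0     # stand-in for a large sparse system
b = torch.ones(4, dtype=torch.float64)
A_sp = csr_matrix(A.numpy())                    # torch -> scipy sparse
x = torch.from_numpy(spsolve(A_sp, b.numpy()))  # solve, then numpy -> torch
```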