Sparse tensor use cases

We are working on increasing support for sparse tensors. So far we have summarized the current state of sparse tensors and listed the sparse ops we plan to support. We would like to collect sparse tensor use cases to inform the design decisions. It would be very helpful if you could post your use cases and desired sparse ops here. Thanks!

I find these questions useful when writing use cases:

Where do I need sparse tensors? While training a deep learning model?
Do I need autograd support for the sparse ops?

A possible example would be:

I am training a model that has mul(Sparse, Dense) ops, and I would like both its forward and its backward. I know the backward of mul produces a dense gradient, so I am asking for a special kind of mul op (call it sparse_mul) that returns a sparse grad tensor and keeps only the nnz’s gradients.
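To make the request concrete, here is a minimal sketch of what such an op could compute. It is not an existing PyTorch API: the name sparse_mul comes from the example above, and keeping the sparse operand as an explicit (indices, values) pair is an assumption made so that autograd only ever sees the nnz values.

```python
import torch

def sparse_mul(indices, values, dense):
    """Hypothetical sparse_mul: multiply a 2-D COO operand, given as
    (indices, values), by a dense tensor at the stored positions only.
    Returns the values of the sparse result, aligned with `indices`."""
    rows, cols = indices                 # indices has shape (2, nnz)
    return values * dense[rows, cols]    # gather dense entries at the nnz positions

indices = torch.tensor([[0, 1, 1],
                        [2, 0, 2]])
values = torch.tensor([3.0, 4.0, 5.0], requires_grad=True)
dense = torch.arange(6, dtype=torch.float32).reshape(2, 3)

out_values = sparse_mul(indices, values, dense)
out_values.sum().backward()

# The gradient w.r.t. the sparse operand has one entry per nnz, so it can be
# wrapped back into a sparse tensor with the same sparsity pattern.
grad_sparse = torch.sparse_coo_tensor(indices, values.grad, dense.shape)
```

Here values.grad has shape (nnz,), so the dense gradient over the full shape is never materialized.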

I previously had a use case where I was training an auto-encoder that learned rank 4 tensors modeling the weights between words in a large graph. Most pairs of words shared no weight, so most entries were 0. I also needed to normalize the columns of each matrix (at the rank 2 level) of these tensors.

I found that very few of the basic tensor operations for dense vectors were implemented for sparse vectors (mm products, etc), and there was no easy way to normalize. I ended up needing to do a ton of hacky things to reformat my problem with dense vectors that were rank 3 in order to be able to feasibly run all of the computations.

Idk if this is too vague to be helpful…

I have a very sparse dataset that is organized as a scipy sparse csr_matrix, and it is too large to convert to a single dense numpy array. For now, I can only extract part of it, convert that part to a numpy array, then to a tensor, and forward the tensor. But the csr_matrix to numpy array step is still awfully time-consuming.

Right now I have a solution as below, which is quite fast:

def spy_sparse2torch_sparse(data):
    """
    :param data: a scipy sparse csr matrix
    :return: a sparse torch tensor
    """
    # ... (construction of the sparse tensor t is elided here)
    return t

But this still does not get me all the way there. I need to sample a mini-batch out of the whole dataset, feed that mini-batch to a classifier, and update the classifier's weights. It would be great if mini-batch sampling were supported.
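Until something like that is built in, one possible workaround is to slice rows out of the csr_matrix and convert only that slice. The sketch below is an assumption about how this could look, not a built-in sampler: csr_rows_to_torch and iterate_minibatches are hypothetical helper names, and the float32 cast is a choice made for illustration.

```python
import numpy as np
import scipy.sparse as sp
import torch

def csr_rows_to_torch(csr, row_ids):
    """Slice the given rows out of a scipy CSR matrix and convert only
    that slice to a torch sparse COO tensor (hypothetical helper)."""
    batch = csr[row_ids].tocoo()
    indices = torch.from_numpy(np.vstack((batch.row, batch.col)).astype(np.int64))
    values = torch.from_numpy(batch.data.astype(np.float32))
    return torch.sparse_coo_tensor(indices, values, batch.shape)

def iterate_minibatches(csr, batch_size, seed=0):
    """Yield (row_ids, sparse_batch) pairs covering the rows in random order."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(csr.shape[0])
    for start in range(0, len(order), batch_size):
        row_ids = order[start:start + batch_size]
        yield row_ids, csr_rows_to_torch(csr, row_ids)

# Toy data: 10 samples, 5 features, about 30% nonzeros.
data = sp.random(10, 5, density=0.3, format="csr", random_state=0)
weight = torch.zeros(5, 2, requires_grad=True)   # stand-in for a linear classifier

for row_ids, batch in iterate_minibatches(data, batch_size=4):
    logits = torch.sparse.mm(batch, weight)      # sparse (B, 5) @ dense (5, 2)
```

The row ids are returned alongside each batch so that the matching labels can be looked up; the dense array for the whole dataset is never built.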

This makes sense. norm() is already supported for sparse tensors, but only for computing the global norm over all values with exponent=2. I guess what you need is the standard form, as in dense: torch.norm(input, p, dim, keepdim=False, out=None). I will add this to the TODO list. btw, do you also need backward for this?
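As a rough illustration of the per-dimension behaviour being asked for, the sketch below reduces a 2-D sparse COO tensor along one dimension using only its stored values. The helper names sparse_norm and normalize_columns are hypothetical, and the code assumes a finite p > 0 and a 2-D input.

```python
import torch

def sparse_norm(sparse, p=2.0, dim=0):
    """p-norm of a 2-D sparse COO tensor reduced over `dim`,
    returned as a dense 1-D tensor (hypothetical helper)."""
    sparse = sparse.coalesce()
    indices, values = sparse.indices(), sparse.values()
    keep = 1 - dim                      # the dimension that survives the reduction
    sums = torch.zeros(sparse.shape[keep], dtype=values.dtype, device=values.device)
    sums.index_add_(0, indices[keep], values.abs() ** p)
    return sums ** (1.0 / p)

def normalize_columns(sparse, p=2.0):
    """Divide every stored value by the p-norm of its column.
    Assumes no explicitly stored zeros, which would give 0 / 0."""
    sparse = sparse.coalesce()
    norms = sparse_norm(sparse, p=p, dim=0)
    cols = sparse.indices()[1]
    return torch.sparse_coo_tensor(sparse.indices(), sparse.values() / norms[cols], sparse.shape)

# The matrix [[3, 0], [4, 2]] stored in COO form.
m = torch.sparse_coo_tensor(torch.tensor([[0, 1, 1], [0, 0, 1]]),
                            torch.tensor([3.0, 4.0, 2.0]), (2, 2))
normalized = normalize_columns(m)
```

Because only the nnz values are touched, the cost scales with the number of stored entries rather than with the full shape.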

I’m actually not working on it anymore and found workarounds. But in that instance, I did not need backward on the norm. Of course I can see that being useful…

I guess what you need is a sampler that can sort the sparse dataset by batch dim and efficiently return mini-batches along that dim. But there is a harder problem, which is supporting batch ops (e.g. bmm). I will take a look at this.
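For reference, the semantics a batched sparse op would need can be emulated with a plain loop. This is only a sketch under the assumption that no native batched sparse op is available; sparse_bmm is a hypothetical name, and the loop gives up whatever speed a fused kernel would offer.

```python
import torch

def sparse_bmm(sparse_list, dense_batch):
    """Emulate bmm for a list of B sparse (M, K) matrices and a dense
    (B, K, N) tensor by looping over the batch (hypothetical helper)."""
    return torch.stack([torch.sparse.mm(s, d) for s, d in zip(sparse_list, dense_batch)])

torch.manual_seed(0)
# Build a batch of mostly-zero matrices and keep a dense copy for comparison.
dense_a = torch.rand(3, 4, 5) * (torch.rand(3, 4, 5) > 0.7)
sparse_list = [a.to_sparse() for a in dense_a]
dense_b = torch.rand(3, 5, 2)

out = sparse_bmm(sparse_list, dense_b)   # dense result of shape (3, 4, 2)
```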