Is PyTorch fully equipped to support network training using sparse tensors?

Do PyTorch's nn.Module layers and operations fully support sparse tensors by default?

I am aiming to train a CNN on both 2D and 3D images. External libraries such as Nvidia’s Minkowski Engine, SPConv, Numenta, and PyTorch Sparse handle sparse data efficiently and can speed up training. However, most of these projects are around 4 years old, and their support for Windows, and particularly for Ubuntu 22.04, is limited or non-existent. Compiling these libraries and integrating them into existing projects is a considerable challenge.

I have come across several issues on the PyTorch GitHub repository that request features related to sparse data. A few of them are listed below in chronological order:

I would greatly appreciate any comments or suggestions you may have.

Hi Stark!

To the best of my knowledge, no.

PyTorch’s support for sparse tensors remains quite incomplete.

For example, in PyTorch version 2.0.1 it is possible (with some work and a lot
of caveats) to pass a sparse tensor through a Linear, but I don’t believe that
the analogue is possible with a Conv2d.
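
As a rough illustration of the Linear case, here is a minimal sketch that sidesteps nn.Linear and multiplies a sparse COO input by a dense weight with torch.sparse.mm. The class name SparseInputLinear and the (in_features, out_features) weight layout are assumptions made for this example, not part of PyTorch, and this may not be the approach meant above.

```python
import torch
import torch.nn as nn


class SparseInputLinear(nn.Module):
    """Hypothetical helper: a linear layer that accepts a 2D sparse COO input."""

    def __init__(self, in_features: int, out_features: int) -> None:
        super().__init__()
        # Stored as (in_features, out_features) so no transpose is needed.
        self.weight = nn.Parameter(torch.randn(in_features, out_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x_sparse: torch.Tensor) -> torch.Tensor:
        # torch.sparse.mm multiplies a sparse matrix by a dense matrix
        # and returns a dense result.
        return torch.sparse.mm(x_sparse, self.weight) + self.bias


torch.manual_seed(0)

# A mostly-zero batch of 4 samples with 6 features each.
dense_x = torch.zeros(4, 6)
dense_x[0, 1] = 1.0
dense_x[2, 4] = -2.0
x_sparse = dense_x.to_sparse()  # sparse COO storage

layer = SparseInputLinear(6, 3)
out = layer(x_sparse)

# Gradients flow back to the dense parameters.
out.sum().backward()

print(out.shape)                      # torch.Size([4, 3])
print(layer.weight.grad is not None)  # True
```

Note that the output is already a dense tensor, so any saving from sparse storage applies only to the input of this one layer.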

Also, regardless of what PyTorch supports, as a tensor with lots of zeros (that
is, a tensor that is sparse mathematically, rather than in terms of how it is
stored) passes through the layers of a network, it typically becomes less
and less sparse. So the bulk of mainstream PyTorch use cases, complicated
networks for machine learning, will likely not get any benefit from sparse-tensor
support.

K. Frank

Thank you @KFrank for providing valuable insight.

I wonder if those working on 3D scene reconstruction (or similar problems dealing with sparse point clouds) spend as much time training models as those working with dense data.

Can replacing a standard convolution layer with a sparse convolution (or submanifold sparse convolution) layer speed up the whole training process? I read that such a technique is friendlier to TPUs than to GPUs. Does your reply point in the same direction?