Sparse matrix computation in PyTorch

Hi All,

I have a trained and pruned neural network in PyTorch. After pruning it, I expected the inference time to be reduced, but it doesn't seem to be. Can someone please suggest a way to achieve that?

Basically, what I want is for the zeroed-out weights to be skipped in the forward pass of the graph, so that inference time is reduced.
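To make the issue concrete, here is a plain-Python sketch (not PyTorch, and not the forum author's code) of why zeroing weights alone doesn't speed anything up: a dense kernel still multiplies every entry, zeros included, while a sparse (CSR-style) kernel skips them by storing and iterating only the nonzeros.

```python
def dense_matvec(w, x):
    # Dense kernel: multiplies every entry, even the pruned zeros.
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def csr_matvec(values, col_idx, row_ptr, x, n_rows):
    # CSR kernel: iterates only over the stored nonzeros.
    out = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            out[i] += values[k] * x[col_idx[k]]
    return out

# A 3x3 weight matrix after pruning: most entries are zero.
w = [[0.0, 2.0, 0.0],
     [0.0, 0.0, 0.0],
     [1.0, 0.0, 3.0]]

# The same matrix in CSR form: 3 stored values instead of 9.
values  = [2.0, 1.0, 3.0]
col_idx = [1, 0, 2]
row_ptr = [0, 1, 1, 3]  # row i's nonzeros live in values[row_ptr[i]:row_ptr[i+1]]

x = [1.0, 2.0, 3.0]
assert dense_matvec(w, x) == csr_matvec(values, col_idx, row_ptr, x, 3)  # both give [4.0, 0.0, 10.0]
```

The dense loop does 9 multiplications regardless of sparsity; the CSR loop does 3. That gap is exactly the speedup a mask-based pruner cannot deliver on its own.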

This is a thread to follow for the above.

TLDR:

As of now, PyTorch's prune function just applies a weight mask, that's all. There are no memory or compute savings associated with using torch.nn.utils.prune.
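One way to actually exploit the zeros is to make the pruning permanent and then convert the weight to a sparse tensor. This is a minimal sketch (assuming a recent PyTorch with COO sparse support; the layer sizes and pruning amount are arbitrary), not an endorsed recipe, and whether it's actually faster depends heavily on the sparsity level and hardware:

```python
import torch
import torch.nn.utils.prune as prune

layer = torch.nn.Linear(512, 512, bias=False)
prune.l1_unstructured(layer, name="weight", amount=0.9)  # zero out 90% of weights
prune.remove(layer, "weight")  # bake the mask into the weight permanently

dense_w = layer.weight.detach()       # still a dense 512x512 tensor full of zeros
sparse_w = dense_w.to_sparse()        # COO tensor storing only the ~10% nonzeros

x = torch.randn(512, 32)              # a batch of 32 column vectors
dense_out = dense_w @ x               # dense matmul: computes over all entries
sparse_out = torch.sparse.mm(sparse_w, x)  # sparse matmul: computes over nonzeros
assert torch.allclose(dense_out, sparse_out, atol=1e-4)
```

Note the caveat: at moderate sparsity, dense GEMM kernels are so well optimized that `torch.sparse.mm` can easily be slower; the sparse path tends to pay off only at high sparsity (often 90%+) or with structured-sparsity support in the backend.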

Thanks @Soumya_Kundu, that post is about the same problem I mentioned here, but it doesn't contain a clear solution for how I can achieve this in PyTorch.

Here’s a gist explaining this