Error while using Sparse Convolution Function (Conv2d with sparse weights)

DrImpossible · June 2, 2019, 2:53am

Hi,

I implemented a SparseConv2d (with sparse weights and dense inputs) to reimplement my paper. However, when I try to train it, I get this error:

Traceback (most recent call last):
  File "train_test.py", line 169, in <module>
    optimizer.step()
  File "/home/drimpossible/installs/3/lib/python3.6/site-packages/torch/optim/sgd.py", line 106, in step
    p.data.add_(-group['lr'], d_p)
RuntimeError: set_indices_and_values_unsafe is not allowed on Tensor created from .data or .detach()

The SparseConv2d function is given below:

import torch

class SparseConv2d(torch.nn.Module):
    def __init__(self, inWCin, inWCout, kernel_size, stride=1, padding=0, dilation=1, sparse_size=0, sparse_mode='Expander'):
        super(SparseConv2d, self).__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.out_channels = inWCout

        mask = make_sparse(in_dim=inWCin, out_dim=inWCout, size=sparse_size, mode=sparse_mode)
        weight = torch.zeros((inWCout, inWCin))
        weight = 0.01*torch.nn.init.kaiming_normal_(weight)
        weight = torch.mul(weight, mask)
        weight = weight.unsqueeze(2).unsqueeze(3).repeat(1, 1, kernel_size, kernel_size)
        weight = weight.view(weight.size(0), -1)
        weight = weight.to_sparse().cuda()
        self.weight = torch.nn.Parameter(weight, requires_grad=True)
    
    def forward(self, x):
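        # Output spatial size (assumes square inputs and kernels)
        out = (x.size(2)+2*self.padding-self.dilation*(self.kernel_size-1)-1)//self.stride+1
        # Unfold with the same dilation/padding/stride that were used to compute `out`
        x_unf = torch.nn.functional.unfold(x, (self.kernel_size, self.kernel_size), dilation=self.dilation, padding=self.padding, stride=self.stride).transpose(1,2)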
        x_unf = torch.sparse.mm(self.weight, x_unf.reshape(-1, x_unf.size(2)).t()).t().reshape(x.size(0),-1,self.out_channels).transpose(1,2)
        x_unf = x_unf.view(x_unf.size(0), x_unf.size(1), out, out)
        return x_unf

The make_sparse function just returns an Erdős-Rényi random expander mask. The model is LeNet, trained on CIFAR10 with SGD as per the tutorial.
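
For reference, the forward pass above follows the usual unfold-plus-matrix-product formulation of a convolution. Below is a minimal dense sketch of that formulation, checked against F.conv2d. The helper name unfold_conv2d is made up for illustration, and it uses a dense weight of shape (C_out, C_in, k, k) instead of the sparse one, so it only illustrates the reshaping logic and does not reproduce the error.

```python
import torch
import torch.nn.functional as F


def unfold_conv2d(x, weight, stride=1, padding=0, dilation=1):
    """Dense convolution via unfold + matmul (illustrative helper, not from the thread)."""
    n, _, h, w = x.shape
    c_out, _, k, _ = weight.shape
    # Output spatial size, same formula as in forward() above.
    out_h = (h + 2 * padding - dilation * (k - 1) - 1) // stride + 1
    out_w = (w + 2 * padding - dilation * (k - 1) - 1) // stride + 1
    # cols has shape (N, C_in*k*k, L) with L = out_h * out_w.
    cols = F.unfold(x, (k, k), dilation=dilation, padding=padding, stride=stride)
    # (C_out, C_in*k*k) @ (N, C_in*k*k, L) broadcasts to (N, C_out, L).
    out = weight.reshape(c_out, -1) @ cols
    return out.reshape(n, c_out, out_h, out_w)


torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
ref = F.conv2d(x, w, stride=2, padding=1)
got = unfold_conv2d(x, w, stride=2, padding=1)
max_err = (ref - got).abs().max().item()  # should be at float32 round-off level
```

Note that unfold has to receive the same stride, padding and dilation that go into the output-size formula, otherwise the final reshape does not line up.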

The full code for reproduction is available here: https://pastebin.com/0PKN0EbJ. I searched around, and SGD seems to support sparse ops. I traced the issue to here. How do I fix this?
Thanks!

AtenaNguyen · June 27, 2019, 8:20am

Hello,

Did you get any update on this issue?

Best,
Atena

DrImpossible · June 27, 2019, 3:28pm

Hi,

I haven’t gotten any yet. If you find a workaround, let me know. Getting SGD to work seems to be essentially the last step.

Regards,
Ameya

littlebirdoflargehea · May 15, 2020, 5:31am

This guy has solved the problem.
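
For anyone landing here without the linked solution: one common workaround, which is not necessarily the one referred to above, is to keep the weight as a dense Parameter and apply a fixed 0/1 mask in the forward pass. The optimizer then only ever sees a dense tensor, and the masked entries receive zero gradient. A minimal sketch, assuming the mask has shape (out_channels, in_channels) like the one in the question; the class name MaskedConv2d is hypothetical and the random mask is only a placeholder for make_sparse, which is not shown in the thread:

```python
import torch
import torch.nn.functional as F


class MaskedConv2d(torch.nn.Module):
    """Hypothetical stand-in for SparseConv2d: dense weight plus a fixed 0/1 mask."""

    def __init__(self, in_channels, out_channels, kernel_size, mask,
                 stride=1, padding=0, dilation=1):
        super().__init__()
        self.stride, self.padding, self.dilation = stride, padding, dilation
        weight = torch.empty(out_channels, in_channels, kernel_size, kernel_size)
        torch.nn.init.kaiming_normal_(weight)
        # mask is (out_channels, in_channels); add two axes to broadcast over the kernel.
        self.register_buffer("mask", mask.float()[:, :, None, None])
        self.weight = torch.nn.Parameter(weight * self.mask)

    def forward(self, x):
        # Multiplying by the mask zeroes the gradient of the pruned connections.
        return F.conv2d(x, self.weight * self.mask, stride=self.stride,
                        padding=self.padding, dilation=self.dilation)


torch.manual_seed(0)
# Random 0/1 mask, a placeholder for whatever make_sparse would return.
mask = torch.rand(8, 4) < 0.5
layer = MaskedConv2d(4, 8, kernel_size=3, mask=mask, padding=1)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)

x = torch.randn(2, 4, 16, 16)
loss = layer(x).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()  # the parameter is dense, so no sparse tensor is updated in place

# Weights of pruned (out, in) pairs, shape (num_pruned, 3, 3); they stay zero.
pruned = layer.weight.detach()[~mask]
```

This gives up the memory savings of a truly sparse weight, but with plain SGD the sparsity pattern stays fixed, since the pruned entries start at zero and never receive a nonzero gradient.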