Non-uniformly changing convolution weights. Is it possible?

csbalint · June 17, 2021, 12:38am

Let’s suppose we have a convolutional module. Right now it is boring:

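import torch.nn as nn

class My_Conv(nn.Module):
    def __init__(self, kwargs):
        super(My_Conv, self).__init__()
        # kwargs is a dict of nn.Conv2d arguments, e.g. in_channels, out_channels, kernel_size
        self.conv = nn.Conv2d(**kwargs)

    def forward(self, x):
        return self.conv(x)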

But I want to do something like this:

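class My_Conv(nn.Module):
    def __init__(self, kwargs):
        super(My_Conv, self).__init__()
        # Something is a placeholder for whatever module produces the multipliers
        self.something = Something()
        self.conv = nn.Conv2d(**kwargs)

    def forward(self, x):
        alpha = self.something(x)
        # weight_multipliers is the argument I wish existed; nn.Conv2d does not accept it
        return self.conv(x, weight_multipliers=alpha)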

Let’s suppose I have an alpha tensor that contains multipliers for the convolution weights. These are not global multipliers for the whole convolution: alpha holds a different set of multipliers for each position the convolution window visits as it slides over the input. Think of it as some kind of attention.
Now I want to get the convolution outputs just as before, but at each window position the weights should first be multiplied by the relevant entries of alpha.
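
Written out, the operation described above could look like the following. This is only a sketch under assumptions the post does not state: alpha is taken to have one entry per kernel element and per window position, shared across output channels, and padding and dilation are ignored. The index names and the stride symbol s are chosen here for illustration.

```latex
y_{n,o,p_h,p_w} = b_o + \sum_{c,i,j} \alpha_{n,c,i,j,p_h,p_w} \, W_{o,c,i,j} \, x_{n,c,\; s\,p_h + i,\; s\,p_w + j}
```

Here (p_h, p_w) indexes the window position, and setting every entry of alpha to 1 recovers the ordinary convolution.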

Is this possible to do efficiently in PyTorch? Is there a hack for this that avoids reimplementing the convolution operation? I looked at the source, and I don’t feel like messing around at the C++ level.

ptrblck · June 17, 2021, 5:34am

I think the easiest approach would be to unfold the inputs, apply the weights to each patch, and compute the convolution as a matrix multiplication. This will most likely use a lot of memory and be slow.
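
A minimal sketch of the unfold approach, under several assumptions that are not from the thread: `PositionScaledConv2d` is a hypothetical name, alpha is passed in directly and must broadcast against the unfolded patches of shape `(N, C_in * kH * kW, L)`, the multipliers are shared across output channels, and the wrapped layer uses `groups=1`, zero padding, and integer (not string) padding. Because the multipliers do not depend on the output channel, scaling the weights per window position gives the same result as scaling the patches, which is what the code does.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionScaledConv2d(nn.Module):
    """Hypothetical wrapper: Conv2d whose weights are rescaled per window position."""

    def __init__(self, **conv_kwargs):
        super().__init__()
        self.conv = nn.Conv2d(**conv_kwargs)

    def forward(self, x, alpha):
        c = self.conv
        n, _, h, w = x.shape
        # Extract every sliding window as a column: (N, C_in * kH * kW, L)
        patches = F.unfold(x, kernel_size=c.kernel_size, dilation=c.dilation,
                           padding=c.padding, stride=c.stride)
        # Assumption: alpha broadcasts to the shape of patches
        patches = patches * alpha
        # Flatten the kernel to (C_out, C_in * kH * kW) and multiply: (N, C_out, L)
        weight = c.weight.reshape(c.out_channels, -1)
        out = weight @ patches
        if c.bias is not None:
            out = out + c.bias.view(1, -1, 1)
        # Standard Conv2d output-size formula, applied to height and width
        h_out, w_out = (
            (size + 2 * p - d * (k - 1) - 1) // s + 1
            for size, p, d, k, s in zip((h, w), c.padding, c.dilation,
                                        c.kernel_size, c.stride)
        )
        return out.reshape(n, c.out_channels, h_out, w_out)


layer = PositionScaledConv2d(in_channels=3, out_channels=4, kernel_size=3,
                             padding=1, stride=2)
x = torch.randn(2, 3, 8, 8)
alpha = torch.rand(2, 3 * 3 * 3, 16)  # 16 = 4 * 4 window positions for this setup
y = layer(x, alpha)                   # shape (2, 4, 4, 4)
```

With alpha set to all ones this should reproduce `layer.conv(x)` up to floating-point error. The unfolded tensor holds every input value once per window that covers it, which is where the extra memory goes.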

Adapting the native conv implementation in C++ could yield a speedup, but I understand that this isn’t something you want to get into.

csbalint · June 17, 2021, 9:34am

Thank you for the suggestion! I will try it and share my results. If it works well in Python (apart from the performance), maybe it will motivate me to seek a more efficient solution, such as hacking away in C++.