How to create my own novel convolution kernel?

We can use torch.nn.Conv2d to create a standard convolution layer, but what if I want to create a convolution layer with a kernel of a novel shape, such as a 'T' shape (i.e. with kernel weights [[w1, 0], [0, 0]]), where the parameter w1 can be learned? What should I do?
I use this code:

class Redchannel_Conv(nn.Module):
    def __init__(self):
        super(Redchannel_Conv, self).__init__()
        w1 = 0.1
        self.W1 = nn.Parameter(torch.Tensor([[w1, 0], [0, 0]]))

    def forward(self, x):
        print(self.W1)
        return F.conv2d(x, self.W1, stride=2)
but I get this error: RuntimeError: weight should have at least two dimensions

Could somebody help me? Thanks very much!

The kernels for nn.Conv2d have the shape [out_channels, in_channels, height, width].
It seems the channel dimensions are missing from your weight.
Could you explain your use case a bit, i.e. what kind of input you have (number of channels etc.) and how the kernel should be applied to it?
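
For reference, here's a minimal sketch of a weight with the expected 4D shape (the single input/output channel and the random input are just placeholders):

import torch
import torch.nn.functional as F

# the weight must be 4D: [out_channels=1, in_channels=1, height=2, width=2]
W = torch.randn(1, 1, 2, 2)
x = torch.randn(4, 1, 8, 8)  # [batch, channels, height, width]
out = F.conv2d(x, W, stride=2)
print(out.shape)  # torch.Size([4, 1, 4, 4])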

Thanks for your reply. nn.Conv2d can't take a custom kernel, so I use F.conv2d. My input is hyperspectral images of size (4, 31, 256, 256): 4 is the batch size, 31 is the number of channels, and 256 is the height and width.

Sure, but nn.Conv2d and F.conv2d expect the same kernel shapes, so I just used it as a bad example. :wink:

I’m still not sure how the kernel should be applied, i.e. on each input channel separately or as a vanilla convolution using all input channels.
However, here is a small example for both approaches:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 31, 256, 256)  # [batch, channels, height, width]

w1 = 0.1
# vanilla convolution: one output channel, kernel spans all 31 input channels
W1 = nn.Parameter(torch.tensor([[[[w1, 0.], [0., 0.]]]]).expand(-1, 31, -1, -1))
output = F.conv2d(x, W1, stride=2)

# or using groups=31 so that the single kernel is applied to each channel separately
W1 = nn.Parameter(torch.tensor([[[[w1, 0.], [0., 0.]]]]).expand(31, -1, -1, -1))
output = F.conv2d(x, W1, stride=2, groups=31)
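
The first approach returns an output of shape [4, 1, 128, 128], since the single kernel mixes all 31 input channels into one output channel, while the grouped version returns [4, 31, 128, 128] with each channel filtered separately.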

Wow, thank you very much! :smiley: And I want to ask: is using 'w1 = 0.1' right? I don't know whether 'w1' will be updated during backpropagation.

W1 will be updated as it's wrapped in nn.Parameter, but w1 will not, as it's a plain Python float.
Also, the initialization should of course only happen once.
Once W1 is created, you can use it in your model like a normal convolution.
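
Putting it together, here's a minimal sketch of how the module from your first post could look (the groups variant is just one of the two options above; the .contiguous() call materializes the expanded view so the tensor can later be updated in place):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Redchannel_Conv(nn.Module):
    def __init__(self, in_channels=31):
        super().__init__()
        w1 = 0.1  # initial value, only used at construction time
        self.in_channels = in_channels
        self.W1 = nn.Parameter(
            torch.tensor([[[[w1, 0.], [0., 0.]]]]).expand(in_channels, -1, -1, -1).contiguous()
        )

    def forward(self, x):
        # the single 2x2 kernel is applied to each channel separately
        return F.conv2d(x, self.W1, stride=2, groups=self.in_channels)

conv = Redchannel_Conv()
out = conv(torch.randn(4, 31, 256, 256))
print(out.shape)  # torch.Size([4, 31, 128, 128])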

So I wrote it wrong. My purpose is to design a 2×2 kernel where only the parameter w1 is updated; the other three entries have to stay fixed at 0. How can I do that?

You could zero out the gradients of all other kernel entries before calling optimizer.step().

Thank you, but I don't know how to zero out the other gradients. I'm not sure what you mean; can you give an example?

One way would be to register a hook on W1 and multiply its gradient with a mask to zero out the unwanted entries:

def zero_w1(mask):
    # returns a gradient hook that multiplies the incoming gradient
    # with the mask, so only the w1 position keeps its gradient
    def hook(grad):
        return grad * mask
    return hook

w1 = 0.1
W1 = nn.Parameter(torch.tensor([[[[w1, 0.], [0., 0.]]]]).expand(-1, 31, -1, -1))

mask = (W1 != 0.).float()  # 1. at the w1 position, 0. elsewhere
W1.register_hook(zero_w1(mask))
output = F.conv2d(x, W1, stride=2)
output.mean().backward()
print(W1.grad)  # only the w1 entries have a non-zero gradient
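
For completeness, here's a minimal sketch of how this could fit into an update step (the SGD optimizer and the mean "loss" are just placeholders; the .contiguous() call materializes the expanded view so that the in-place optimizer update is allowed):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 31, 256, 256)

W1 = nn.Parameter(torch.tensor([[[[0.1, 0.], [0., 0.]]]]).expand(-1, 31, -1, -1).contiguous())
mask = (W1 != 0.).float()
W1.register_hook(lambda grad: grad * mask)

optimizer = torch.optim.SGD([W1], lr=0.1)
optimizer.zero_grad()
F.conv2d(x, W1, stride=2).mean().backward()  # the hook zeroes the masked gradients
optimizer.step()                              # only the w1 position changes
print(W1[0, 0])                               # the other three entries are still 0.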

Thank you so much!!! I will try it at once. :grinning::grinning: