How to create my own novel convolution kernel?

We can use torch.nn.Conv2d to create a standard convolution layer, but what if I want to create a convolution layer with a kernel of a novel shape, such as a 'T' shape (meaning with kernel weights [[w1, 0], [0, 0]]), where the parameter w1 can be learned? What should I do?
I tried the following:

    class Redchannel_Conv(nn.Module):
        def __init__(self):
            super(Redchannel_Conv, self).__init__()
            w1 = 0.1
            self.W1 = nn.Parameter(torch.Tensor([[w1, 0], [0, 0]]))

        def forward(self, x):
            return F.conv2d(x, self.W1, stride=2)
but I get the error: RuntimeError: weight should at least have at least two dimensions

Could somebody help me? Thanks very much!


The kernels for nn.Conv2d have the shape [out_channels, in_channels, height, width].
It seems some channel dimensions are missing.
Could you explain your use case a bit, i.e. what kind of input do you have (number of channels etc.) and how should the kernel be applied on it?
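For reference, here is a minimal sketch of the expected weight layout (the shapes are chosen arbitrarily for illustration):

```python
import torch
import torch.nn.functional as F

# A conv2d weight must be 4D: [out_channels, in_channels, height, width].
x = torch.randn(1, 3, 8, 8)   # batch=1, 3 input channels, 8x8 spatial size
w = torch.randn(5, 3, 2, 2)   # 5 output channels, 3 input channels, 2x2 kernel
out = F.conv2d(x, w, stride=2)
print(out.shape)              # torch.Size([1, 5, 4, 4])
```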

Thanks for your reply. nn.Conv2d doesn't let me design my own kernel. My input is hyperspectral images of size (4, 31, 256, 256): 4 is the batch size, 31 is the number of channels, and 256 is the height and width.

Sure, but nn.Conv2d and F.conv2d expect the same kernel shapes, so I just used it as a bad example. :wink:

I’m still not sure how the kernel should be applied, i.e. on each input channel separately or as a vanilla convolution using all input channels.
However, here is a small example for both approaches:

    w1 = 0.1

    # vanilla convolution: weight shape [1, 31, 2, 2], one output channel
    W1 = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(-1, 31, -1, -1))
    output = F.conv2d(x, W1, stride=2)

    # or using groups=31, weight shape [31, 1, 2, 2], so that the single
    # kernel is applied on each channel separately
    W1 = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(31, -1, -1, -1))
    output = F.conv2d(x, W1, stride=2, groups=31)
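Applied to the (4, 31, 256, 256) input described above, the two variants produce different output shapes, which makes the distinction concrete (a quick check, with hypothetical variable names):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 31, 256, 256)
w1 = 0.1

# vanilla: all 31 input channels are mixed into a single output channel
W_full = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(-1, 31, -1, -1))
out_full = F.conv2d(x, W_full, stride=2)
print(out_full.shape)    # torch.Size([4, 1, 128, 128])

# grouped: each channel is filtered separately, so all 31 channels are kept
W_group = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(31, -1, -1, -1))
out_group = F.conv2d(x, W_group, stride=2, groups=31)
print(out_group.shape)   # torch.Size([4, 31, 128, 128])
```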

Wow, thank you very much :smiley:. I also want to ask whether using w1 = 0.1 is right; I don't know whether w1 will be updated during back propagation.

W1 will be updated, as it's wrapped in nn.Parameter, but w1 will not, as it's a plain Python float.
Also, the initialization should of course only happen once.
Once W1 is created, you can use it in your model like a normal convolution.

So I wrote it wrong. My purpose is to design a 2x2 kernel in which only the parameter w1 is updated; the other three parameters have to be fixed at 0. How can I do that?

You could zero out the gradients of all other parameters before calling optimizer.step().
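A minimal sketch of that idea, with the shapes reduced to a single channel for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

W1 = nn.Parameter(torch.tensor([[[[0.1, 0.], [0., 0.]]]]))
mask = (W1 != 0.).float()          # 1 where w1 sits, 0 for the fixed entries
opt = torch.optim.SGD([W1], lr=0.1)

x = torch.randn(2, 1, 6, 6)
out = F.conv2d(x, W1, stride=2)
out.mean().backward()

with torch.no_grad():
    W1.grad.mul_(mask)             # zero the gradients of the fixed entries

opt.step()
print(W1.detach())                 # only the top-left entry can change
```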

Thank you, but I don't know how to zero out the other gradients. I'm not sure what you mean; how can I do this? Can you give an example?

One way would be to register a hook on W1 and multiply its gradient with a mask to zero out the unwanted gradients:

    def zero_w1(mask):
        def hook(grad):
            return grad * mask
        return hook

    w1 = 0.1
    W1 = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(-1, 31, -1, -1))

    # 1. where w1 sits, 0. for the fixed entries
    mask = (W1 != 0.).float()
    W1.register_hook(zero_w1(mask))

    output = F.conv2d(x, W1, stride=2)
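Here is a small self-contained check of the hook approach (shapes reduced to one channel, names hypothetical) showing that the zero entries stay fixed across several optimizer steps while w1 is updated:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def zero_w1(mask):
    def hook(grad):
        return grad * mask
    return hook

W1 = nn.Parameter(torch.tensor([[[[0.1, 0.], [0., 0.]]]]))
mask = (W1 != 0.).float()
W1.register_hook(zero_w1(mask))   # masks the gradient on every backward pass

opt = torch.optim.SGD([W1], lr=0.1)
x = torch.randn(2, 1, 6, 6)

for _ in range(3):
    opt.zero_grad()
    F.conv2d(x, W1, stride=2).mean().backward()
    opt.step()

print(W1.detach())   # the other three entries are still exactly 0
```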

Thank you so much!!! I will try it at once :grinning::grinning: