We can use `torch.nn.Conv2d` to create a standard convolution layer, but what if I want to create a convolution layer with a kernel of a novel shape, such as a 'T' shape (i.e. a kernel weight of `[[w1, 0], [0, 0]]`), where the parameter `w1` can be learned? What should I do?

I wrote the following module:

```
class Redchannel_Conv(nn.Module):
    def __init__(self):
        super(Redchannel_Conv, self).__init__()
        w1 = 0.1
        self.W1 = nn.Parameter(torch.Tensor([[w1, 0], [0, 0]]))

    def forward(self, x):
        print(self.W1)
        return F.conv2d(x, self.W1, stride=2)
```

but I get the error: `RuntimeError: weight should at least have at least two dimensions`

Could somebody help me? Thanks very much!


The kernels for `nn.Conv2d` have the shape `[out_channels, in_channels, height, width]`.

It seems some channel dimensions are missing.
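As a quick illustration (a sketch, not from the thread; the sizes here are made up), giving the 2×2 kernel the full 4D layout avoids the error:

```
import torch
import torch.nn.functional as F

# the weight needs the full 4D layout: [out_channels, in_channels, height, width]
w = torch.tensor([[0.1, 0.0],
                  [0.0, 0.0]])
weight = w.view(1, 1, 2, 2)      # 1 output channel, 1 input channel

x = torch.randn(4, 1, 8, 8)      # [batch, channels, height, width]
out = F.conv2d(x, weight, stride=2)
print(out.shape)                 # torch.Size([4, 1, 4, 4])
```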

Could you explain your use case a bit, i.e. what kind of input do you have (number of channels etc.) and how should the kernel be applied on it?

Thanks for your reply. `nn.Conv2d` can't use a custom kernel, so I use `F.conv2d`. My input is hyperspectral images with size `(4, 31, 256, 256)`: 4 is the batch size, 31 the number of channels, and 256 the height and width.

Sure, but `nn.Conv2d` and `F.conv2d` expect the same kernel shapes, so I just used it as an example.

I’m still not sure how the kernel should be applied, i.e. on each input channel separately or as a vanilla convolution using all input channels.

However, here is a small example for both approaches:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

# x is the input from above, with shape [4, 31, 256, 256]
w1 = 0.1
# vanilla convolution: one kernel using all 31 input channels
W1 = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(-1, 31, -1, -1))
output = F.conv2d(x, W1, stride=2)
# or groups=31, so that the single kernel is applied to each channel separately
W1 = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(31, -1, -1, -1))
output = F.conv2d(x, W1, stride=2, groups=31)
```
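For what it's worth, the two variants also differ in output shape; a quick self-contained check using the `(4, 31, 256, 256)` input size mentioned above (variable names here are my own):

```
import torch
import torch.nn as nn
import torch.nn.functional as F

w1 = 0.1
x = torch.randn(4, 31, 256, 256)

# vanilla convolution: one kernel spanning all 31 channels -> 1 output channel
W_full = nn.Parameter(torch.tensor([[[[w1, 0.0], [0.0, 0.0]]]]).expand(-1, 31, -1, -1))
print(F.conv2d(x, W_full, stride=2).shape)            # torch.Size([4, 1, 128, 128])

# depthwise: the kernel applied to each channel separately -> 31 output channels
W_dw = nn.Parameter(torch.tensor([[[[w1, 0.0], [0.0, 0.0]]]]).expand(31, -1, -1, -1))
print(F.conv2d(x, W_dw, stride=2, groups=31).shape)   # torch.Size([4, 31, 128, 128])
```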

Wow, thank you very much :smiley:. I also want to ask whether using `w1 = 0.1` is right; I don't know whether `w1` will be updated by backpropagation.

`W1` will be updated, as it's wrapped in `nn.Parameter`, but `w1` will not, as it's a plain Python float.

Also, the initialization should of course only happen once.

Once `W1` is created, you can use it in your model like a normal convolution.
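For example, the fixed-up module could look like this (a sketch: the name `TShapedConv` and the default sizes are my own, assuming the 31-channel input from above):

```
import torch
import torch.nn as nn
import torch.nn.functional as F

class TShapedConv(nn.Module):
    def __init__(self, in_channels=31, w1=0.1):
        super().__init__()
        # initialization happens only once, in __init__
        init = torch.tensor([[[[w1, 0.0], [0.0, 0.0]]]]).expand(-1, in_channels, -1, -1)
        self.W1 = nn.Parameter(init.clone())

    def forward(self, x):
        return F.conv2d(x, self.W1, stride=2)

m = TShapedConv()
x = torch.randn(4, 31, 256, 256)
print(m(x).shape)        # torch.Size([4, 1, 128, 128])
```

Since `W1` is registered as a parameter of the module, it shows up in `m.parameters()` and is picked up by any optimizer as usual.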

So I wrote it wrong. My purpose is to design a 2×2 kernel in which only the parameter `w1` is updated; the other three entries have to stay fixed at 0. How can I do that?

You could zero out the gradients of all other parameters before calling `optimizer.step()`.
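A minimal sketch of that idea, masking `W1.grad` in place between `backward()` and `step()` (the mask construction and the toy sizes are my assumptions, not from the thread):

```
import torch
import torch.nn as nn
import torch.nn.functional as F

W1 = nn.Parameter(torch.tensor([[[[0.1, 0.0], [0.0, 0.0]]]]))
mask = (W1 != 0.).float()                # 1 where w1 lives, 0 for the fixed zeros
opt = torch.optim.SGD([W1], lr=0.1)

x = torch.randn(4, 1, 8, 8)
F.conv2d(x, W1, stride=2).mean().backward()

with torch.no_grad():
    W1.grad *= mask                      # zero out the unwanted gradients
opt.step()

print(W1.detach())                       # the three fixed entries are still 0
```

Note this relies on plain SGD; an optimizer with weight decay would still move the zero entries even though their gradients are zero.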

Thank you, but I don't know how to zero out the other gradients. I'm not sure what you mean; how can I do this? Can you give an example?

One way would be to register a hook and multiply it with a mask to zero out the unwanted gradients:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

def zero_w1(mask):
    def hook(grad):
        # multiply the gradient with the mask to zero out the fixed entries
        return grad * mask
    return hook

# x is the input from above, with shape [4, 31, 256, 256]
w1 = 0.1
W1 = nn.Parameter(torch.tensor([[[[w1, 0], [0, 0]]]]).expand(-1, 31, -1, -1))
mask = (W1 != 0.).float()
W1.register_hook(zero_w1(mask))
output = F.conv2d(x, W1, stride=2)
output.mean().backward()
print(W1.grad)
```
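With the hook in place, an optimizer step should leave the fixed entries at exactly zero; here is a quick self-contained check (plain SGD and a single-channel toy input of my own choosing):

```
import torch
import torch.nn as nn
import torch.nn.functional as F

W1 = nn.Parameter(torch.tensor([[[[0.1, 0.0], [0.0, 0.0]]]]))
mask = (W1 != 0.).float()
W1.register_hook(lambda grad: grad * mask)   # runs automatically during backward()

opt = torch.optim.SGD([W1], lr=0.1)
x = torch.randn(4, 1, 8, 8)
F.conv2d(x, W1, stride=2).mean().backward()
opt.step()

print(W1.detach())   # only the top-left entry can have changed
```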

Thank you so much!!! I will try it right away.