I have a 3D tensor where each channel needs to be convolved with the same single kernel. From a quick search, the fastest way to do this seems to be a grouped convolution with the number of groups equal to the number of channels.
Here is a small reproducible example:
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.rand(1, 3, 3, 3)
first = x[:, 0:1, ...]
second = x[:, 1:2, ...]
third = x[:, 2:3, ...]

kernel = nn.Conv2d(1, 1, 3)
conv = nn.Conv2d(3, 3, 3, groups=3)
conv.weight.data = kernel.weight.data.repeat(3, 1, 1, 1)
conv.bias.data = kernel.bias.data.repeat(3)

>>> conv(x)
tensor([[[[-1.0085]],
         [[-1.0068]],
         [[-1.0451]]]], grad_fn=<MkldnnConvolutionBackward>)

>>> kernel(first), kernel(second), kernel(third)
(tensor([[[[-1.0085]]]], grad_fn=<ThnnConv2DBackward>),
 tensor([[[[-1.0068]]]], grad_fn=<ThnnConv2DBackward>),
 tensor([[[[-1.0451]]]], grad_fn=<ThnnConv2DBackward>))
As you can see, this works perfectly.
Now coming to my question: I need to do backprop through this (the kernel object). During backprop, each weight of conv gets its own update, but conv is really just kernel repeated 3 times, and at the end I only need a single updated kernel. How do I do this?
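To make the issue concrete, here is a minimal check (continuing from the snippet above, so it assumes x, conv, and kernel as defined there): after a backward pass, conv.weight.grad holds three separate gradients, one per repeated copy, while the original kernel receives none.

# Minimal sketch of the problem, using the variables defined above.
out = conv(x)
out.sum().backward()

print(conv.weight.grad.shape)  # torch.Size([3, 1, 3, 3]): one gradient per repeated copy
print(kernel.weight.grad)      # None: kernel.weight is not part of conv's graph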
PS: I need to optimize for speed