Let's say I give two input pictures of size 234x234. Then I have an input tensor [2,3,234,234]. How do I create a 3D kernel which only takes the 3 channels as input and produces 1 output feature map from it?
You could create an nn.Conv2d layer accepting inputs with 3 channels and creating a single output activation map, as seen here:
import torch
import torch.nn as nn

x = torch.randn(2, 3, 234, 234)  # [batch, channels, height, width]
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, padding=1)
out = conv(x)
print(out.shape)
# torch.Size([2, 1, 234, 234])
Note that the filter will be a 4D filter defined as [out_channels, in_channels, height, width].
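To make the filter layout concrete, here is a minimal sketch that inspects the weight shape of that layer and checks that its single output map is the sum of the per-channel 2D convolutions plus the bias. The names `per_channel` and `manual` are only illustrative, and the tolerance passed to `torch.allclose` is an assumption to absorb floating-point summation differences.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 3, 234, 234)
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, padding=1)

# Weight layout: [out_channels, in_channels, height, width]
print(conv.weight.shape)  # torch.Size([1, 3, 3, 3])
print(conv.bias.shape)    # torch.Size([1])

# Convolve each input channel with its own 3x3 slice of the filter
per_channel = [
    F.conv2d(x[:, c:c + 1], conv.weight[:, c:c + 1], padding=1)
    for c in range(3)
]

# Sum over the channel contributions and add the bias once
manual = torch.stack(per_channel).sum(dim=0) + conv.bias.view(1, -1, 1, 1)

out = conv(x)
print(torch.allclose(out, manual, atol=1e-5))
```

So the single filter spans all 3 input channels at once, which is why one output feature map is produced per image.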