F.conv2d functionality


The following code is from the PyTorch master documentation, but I can't understand it. For an input of shape [batch_size, 1, 3, 3], I would expect a filter tensor of shape [1, 1, 2, 2], and I think F.conv2d would then produce a tensor of shape [batch_size, 1, 1, 1]. Could you explain what the following code does, and how I could use it to see the actual convolution I described?

import torch
import torch.nn.functional as F

filters = torch.randn(8,2,2,2)
inputs = torch.randn(1,2,3,3)
a = F.conv2d(inputs, filters)



With padding=0 and stride=1, the only way to get an output of shape [batch, 1, 1, 1] is to use a filter with the same spatial size as the input.
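As a quick check of that claim, here is a minimal sketch. The batch size of 4 and the variable names are arbitrary choices for illustration, not something from the documentation example.

```python
import torch
import torch.nn.functional as F

batch_size = 4  # arbitrary value, chosen only for this illustration

inputs = torch.randn(batch_size, 1, 3, 3)   # [batch, in_channels, H, W]
full_filter = torch.randn(1, 1, 3, 3)       # same 3x3 spatial size as the input

# F.conv2d defaults to stride=1 and padding=0
out_full = F.conv2d(inputs, full_filter)
print(out_full.shape)  # torch.Size([4, 1, 1, 1])
```

A 3x3 filter fits into a 3x3 input at exactly one position, so each sample yields a single number.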

For the input and filter you describe, the output has to be [batch, 1, 2, 2]. The reason is that a 2x2 window fits into a 3x3 matrix at four different positions.

step one:
[[x, x, 0],
 [x, x, 0],
 [0, 0, 0]]

step two:
[[0, y, y],
 [0, y, y],
 [0, 0, 0]]

step three:
[[0, 0, 0],
 [z, z, 0],
 [z, z, 0]]

step four:
[[0, 0, 0],
 [0, s, s],
 [0, s, s]]

output:
[[x, y],
 [z, s]]

So, as you can see, each window position yields one number, and the four numbers form a 2x2 matrix.
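The four steps above can be reproduced by hand and compared with F.conv2d. This is only a sketch: the input values 0..8 and the all-ones kernel are assumptions picked so the sums are easy to verify. Note that F.conv2d computes a cross-correlation, meaning the kernel is not flipped.

```python
import torch
import torch.nn.functional as F

# 3x3 input with values 0..8, shaped [batch=1, channels=1, H=3, W=3]
inputs = torch.arange(9, dtype=torch.float32).reshape(1, 1, 3, 3)
# One 2x2 filter of ones, shaped [out_channels=1, in_channels=1, kH=2, kW=2]
kernel = torch.ones(1, 1, 2, 2)

out = F.conv2d(inputs, kernel)  # shape [1, 1, 2, 2]

# Slide the 2x2 window over the four positions (steps one to four)
manual = torch.zeros(2, 2)
for i in range(2):
    for j in range(2):
        window = inputs[0, 0, i:i + 2, j:j + 2]
        manual[i, j] = (window * kernel[0, 0]).sum()

print(out[0, 0])  # tensor([[ 8., 12.], [20., 24.]])
print(manual)     # same values: x=8, y=12, z=20, s=24
```

Here x = 0+1+3+4, y = 1+2+4+5, z = 3+4+6+7 and s = 4+5+7+8, matching the positions in the diagrams.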

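PS: the number of filters determines the number of channels in the output; each filter produces one channel. That is why the documentation code, with 8 filters, returns a tensor of shape [1, 8, 2, 2].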
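To see the filter-to-channel relationship on the documentation example itself, something like the following should work. The index k = 3 is an arbitrary pick for illustration.

```python
import torch
import torch.nn.functional as F

inputs = torch.randn(1, 2, 3, 3)    # [batch, in_channels, H, W]
filters = torch.randn(8, 2, 2, 2)   # [out_channels, in_channels, kH, kW]

out = F.conv2d(inputs, filters)
print(out.shape)  # torch.Size([1, 8, 2, 2])

# Output channel k depends only on filter k (arbitrary choice of k)
k = 3
single = F.conv2d(inputs, filters[k:k + 1])   # keep the leading dim: [1, 2, 2, 2]
same = torch.allclose(out[:, k:k + 1], single, atol=1e-5)
print(same)
```

Each filter spans both input channels (its second dimension is 2), sums over them, and contributes one 2x2 map to the output.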

Good luck

Thanks, that helped me.