Question about usage of torch.nn.functional.conv2d

I have two image tensors, each containing n images, with shapes [n, 3, Width, Height] and [n, 3, Width/2, Height/2].
I am trying to get a convolution output of shape [n, 3, Width, Height] using torch.nn.functional.conv2d.
What options and padding should I use to achieve this?

In the first case, your convolution should keep the spatial shape of your input.
To achieve this, set the padding to (kernel_size - 1) // 2 for odd-sized kernels (with the default stride and dilation).
This code gives you the same shape:

import torch
import torch.nn as nn

n = 5
c, h, w = 3, 12, 12
x1 = torch.randn(n, c, h, w)

conv1 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, padding=1)
output1 = conv1(x1)
print(output1.shape)
> torch.Size([5, 3, 12, 12])
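
The padding rule for odd-sized kernels follows from the output-size formula given in the nn.Conv2d documentation. As a rough sketch, the helper below (conv2d_out_size is a hypothetical name, not part of PyTorch) evaluates that formula for one spatial dimension and also shows why an even-sized kernel cannot keep the size with symmetric padding:

```python
def conv2d_out_size(size, kernel_size, padding=0, stride=1, dilation=1):
    """Output length along one spatial dimension of a 2D convolution.

    Hypothetical helper mirroring the formula in the nn.Conv2d docs.
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# With stride 1 and an odd kernel, padding = (kernel_size - 1) // 2 keeps the size.
for k in (1, 3, 5, 7):
    assert conv2d_out_size(12, k, padding=(k - 1) // 2) == 12

# An even kernel with the same padding on both sides always misses by one.
even_sizes = [conv2d_out_size(12, 6, padding=p) for p in (2, 3)]
print(even_sizes)  # [11, 13]
```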

In the second case, your input images are smaller, so you could use nn.ConvTranspose2d to increase the spatial size:

x2 = torch.randn(n, c, h//2, w//2)

conv2 = nn.ConvTranspose2d(in_channels=3, out_channels=3, kernel_size=2, stride=2)
output2 = conv2(x2)
print(output2.shape)
> torch.Size([5, 3, 12, 12])
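
To see why kernel_size=2 with stride=2 doubles the spatial size, it may help to evaluate the output-size formula from the nn.ConvTranspose2d documentation directly. This is only a sketch; conv_transpose2d_out_size is a hypothetical helper name, not a PyTorch function:

```python
def conv_transpose2d_out_size(size, kernel_size, stride=1, padding=0,
                              output_padding=0, dilation=1):
    """Output length along one spatial dimension of a transposed convolution.

    Hypothetical helper mirroring the formula in the nn.ConvTranspose2d docs.
    """
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# The setting used above: a 6-pixel side becomes 12.
print(conv_transpose2d_out_size(6, kernel_size=2, stride=2))  # 12

# Another setting that also doubles the size.
print(conv_transpose2d_out_size(6, kernel_size=4, stride=2, padding=1))  # 12
```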

Thanks for the reply.
However, what I really want is a convolution between the two images using torch.nn.functional.conv2d.
In other words, I am trying to use the smaller image as the convolution kernel.
Having a really hard time here…

Ah OK, I’ve misunderstood your question.
If you are using odd-sized kernels, the padding is straightforward, since we pad the same amount on both sides of the input. For even-sized kernels, the two sides need different amounts of padding, so we can use the functional pad method (F.pad) to add the appropriate padding to the input before the convolution.
Here is a small example. If you don’t need to backpropagate through the kernel (smaller image), just set requires_grad=False:

import torch
import torch.nn.functional as F

batch_size = 1
c, h, w = 3, 12, 12
num_kernels = 1
kernel_size = (h//2, w//2)

x = torch.randn(batch_size, c, h, w)
kernel = torch.randn(num_kernels, c, *kernel_size, requires_grad=True)

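# F.pad expects the pads for the last dimension first: (left, right, top, bottom),
# so iterate over the kernel size in reverse (width, then height).
padding = []
for k in reversed(kernel_size):
    if k % 2 == 0:
        # even kernel: one extra element of padding on the trailing side
        pad = [(k - 1) // 2, (k - 1) // 2 + 1]
    else:
        pad = [(k - 1) // 2, (k - 1) // 2]
    padding.extend(pad)

x = F.pad(x, pad=padding)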
output = F.conv2d(x, kernel)
print(output.shape)
> torch.Size([1, 1, 12, 12])
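
If the goal is the [n, 3, Width, Height] output from the original question, one possible approach is a grouped convolution, where each channel of each image is filtered only by the matching channel of its own smaller image. The sketch below assumes that pairing is what is wanted; the names images, kernels and out are illustrative. Note that F.conv2d computes a cross-correlation, so the kernels would have to be flipped with torch.flip(kernels, dims=(-2, -1)) first to get a convolution in the strict mathematical sense.

```python
import torch
import torch.nn.functional as F

n, c, h, w = 5, 3, 12, 12
images = torch.randn(n, c, h, w)
kernels = torch.randn(n, c, h // 2, w // 2)  # smaller images used as kernels
kh, kw = kernels.shape[-2:]

# "Same" padding in F.pad order: (left, right, top, bottom).
pad = [(kw - 1) // 2, kw // 2, (kh - 1) // 2, kh // 2]

# Fold the batch into the channel dimension so that every (image, channel)
# plane becomes its own group with its own single-channel kernel.
x = F.pad(images.reshape(1, n * c, h, w), pad)
weight = kernels.reshape(n * c, 1, kh, kw)
out = F.conv2d(x, weight, groups=n * c).reshape(n, c, h, w)
print(out.shape)  # torch.Size([5, 3, 12, 12])
```

Here out[i, j] depends only on images[i, j] and kernels[i, j]; summing over channels or mixing images would need a different weight layout.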

Thank you very much.
It helped me a lot.