Question about 2D transpose convolution

Hi - I was experimenting with the ConvTranspose2d operation. I took a [2 x 2] random tensor and applied a transposed convolution to it with and without padding; both the kernel size and the stride are set to 2. When I checked the output tensor after the operation, I found that its size without padding is bigger than with padding.

Code snippet with padding:
import torch
import torch.nn as nn

d = torch.randn(1, 1, 2, 2)  # input of shape [N, C, H, W] = [1, 1, 2, 2]
deconv2 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, padding=1)
d1 = deconv2(d)
d1.shape  # returns torch.Size([1, 1, 2, 2])

Code snippet without padding:
deconv2 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, padding=0)
d1 = deconv2(d)
d1.shape  # returns torch.Size([1, 1, 4, 4])

I thought that the output tensor with padding would be bigger than without padding, but that's not the case. Can anyone please explain this?

Regards,
Tomojit

This should be expected.
From the docs:

The padding argument effectively adds dilation * (kernel_size - 1) - padding amount of zero padding to both sides of the input. This is set so that when a Conv2d and a ConvTranspose2d are initialized with the same parameters, they are inverses of each other with regard to the input and output shapes. However, when stride > 1, Conv2d maps multiple input shapes to the same output shape. output_padding is provided to resolve this ambiguity by effectively increasing the calculated output shape on one side. Note that output_padding is only used to find the output shape, but does not actually add zero-padding to the output.
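
For the shapes in your snippets, the formula from the ConvTranspose2d docs, H_out = (H_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1, already shows what happens: in the transposed op, a larger padding value effectively crops the output rather than enlarging it. A quick check with your values (just a sketch, using the default dilation=1 and output_padding=0):

import torch
import torch.nn as nn

x = torch.randn(1, 1, 2, 2)
for p in (0, 1):
    deconv = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, padding=p)
    # padding=0: (2 - 1) * 2 - 2*0 + 1*(2 - 1) + 0 + 1 = 4
    # padding=1: (2 - 1) * 2 - 2*1 + 1*(2 - 1) + 0 + 1 = 2
    print(p, deconv(x).shape)  # -> [1, 1, 4, 4] and [1, 1, 2, 2]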

Also, have a look at the conv arithmetic tutorial for more information.

Hi @ptrblck - Thanks for sharing the "conv arithmetic" tutorial. I had checked it before as well, but I am still confused about padding. Consider the first case, where I used padding=1. Note that I didn't use dilation, so according to the PyTorch docs ("dilation * (kernel_size - 1) - padding"), an amount of (0*(2-1)-1) = -1 of zero padding will be added. This does not make any sense to me. Am I making a mistake somewhere?

@ptrblck - I also observe that when the stride is > 1 (say 2), the transposed conv can't reconstruct the original image size, but with unit stride it reconstructs the exact image size. See below:
Code snippet for perfect reconstruction:
In [1]: import torch
In [2]: D=torch.randn(1,1,28,28)
In [3]: import torch.nn as nn
In [4]: s,k,p=1,5,0
In [5]: conv1 = nn.Conv2d(1,1,kernel_size=k,stride=s,padding=p)
In [6]: deconv1 = nn.ConvTranspose2d(1,1,kernel_size=k,stride=s,padding=p)
In [7]: D1=conv1(D)
In [8]: D1_t = deconv1(D1)
In [10]: D1_t.shape
Out[10]: torch.Size([1, 1, 28, 28]) ==> size of the reconstructed image
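
(With stride=1 the arithmetic inverts exactly: conv1 gives 28 - 5 + 1 = 24, and deconv1 gives (24 - 1) * 1 + 5 = 28, so the original spatial size comes back.)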

Code snippet where reconstruction is not perfect, using stride=2:
In [11]: s,k,p=2,5,0
In [12]: conv1 = nn.Conv2d(1,1,kernel_size=k,stride=s,padding=p)
In [13]: deconv1 = nn.ConvTranspose2d(1,1,kernel_size=k,stride=s,padding=p)
In [14]: D1 = conv1(D)
In [15]: D1_t = deconv1(D1)
In [16]: D.shape
Out[16]: torch.Size([1, 1, 28, 28])
In [17]: D1_t.shape
Out[17]: torch.Size([1, 1, 27, 27]) ==> size of the reconstructed image
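
Following up on the output_padding note from the docs quoted above (just my own check, not necessarily the recommended fix): with k=5, s=2, p=0 the Conv2d shape formula has a floor in it, floor((28 - 5) / 2) + 1 = 12, so both a 27 x 27 and a 28 x 28 input land on 12 x 12, and the transposed conv then gives back the smaller one, (12 - 1) * 2 + 5 = 27. Passing output_padding=1 restores the original size (continuing the session above):

deconv1 = nn.ConvTranspose2d(1, 1, kernel_size=k, stride=s, padding=p, output_padding=1)
deconv1(D1).shape  # -> torch.Size([1, 1, 28, 28])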