# Question of 2D transpose Convolution

Hi - I was experimenting with ConvTranspose2d operation. I took a [2 x 2] random tensor and applied transpose conv on it with and without padding. Both the kernel size and stride are set to 2. When I checked the size of the tensor after the operation I found that the size of the output tensor without padding is bigger than with padding.

Code snippet with padding (the `deconv2` definition was not shown in my original post; it is inferred here from the thread: kernel size 2, stride 2, padding 1):

```python
import torch
import torch.nn as nn

d = torch.randn(1, 1, 2, 2)
deconv2 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, padding=1)
d1 = deconv2(d)
d1.shape  # ==>> returns torch.Size([1, 1, 2, 2])
```

Code snippet without padding:

```python
deconv2 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, padding=0)
d1 = deconv2(d)
d1.shape  # ==>> returns torch.Size([1, 1, 4, 4])
```

I thought that the output tensor size with padding would be bigger than without padding, but that's not the case. Can anyone please explain this?

Regards,
Tomojit

This is expected.
From the docs:

The `padding` argument effectively adds `dilation * (kernel_size - 1) - padding` amount of zero padding to both sides of the input. This is set so that when a `Conv2d` and a `ConvTranspose2d` are initialized with the same parameters, they are inverses of each other in regard to the input and output shapes. However, when `stride > 1`, `Conv2d` maps multiple input shapes to the same output shape. `output_padding` is provided to resolve this ambiguity by effectively increasing the calculated output shape on one side. Note that `output_padding` is only used to find the output shape, but does not actually add zero-padding to the output.

Also, have a look at the conv arithmetic tutorial for more information.
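To make the docs quote concrete, here is a minimal sketch reproducing the shapes from the original question. The output size of a transposed convolution (with default `dilation=1` and `output_padding=0`) is `(in - 1) * stride - 2 * padding + kernel_size`, so a larger `padding` *shrinks* the output:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 2, 2)

# Same kernel_size and stride, only padding differs.
deconv_p0 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, padding=0)
deconv_p1 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, padding=1)

# (2 - 1) * 2 - 2*0 + 2 = 4
print(deconv_p0(x).shape)  # torch.Size([1, 1, 4, 4])
# (2 - 1) * 2 - 2*1 + 2 = 2
print(deconv_p1(x).shape)  # torch.Size([1, 1, 2, 2])
```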


Hi @ptrblck - Thanks for sharing the "conv arithmetic" tutorial. I had checked this tutorial before as well, but I am still confused about padding. Consider the first case, when I used padding=1. Note that I didn't use dilation. So, according to the PyTorch documentation ("dilation * (kernel_size - 1) - padding"), an amount of padding equal to (0*(2-1)-1) = -1 would be added. This does not make any sense to me. Am I making a mistake?

@ptrblck - I also observe that when the stride is > 1 (say 2), the transposed conv can't reconstruct the original image size. But if I use unit stride, then the transposed conv reconstructs the exact image size. See below:
Code snippet for perfect reconstruction:

```python
import torch
import torch.nn as nn

D = torch.randn(1, 1, 28, 28)
s, k, p = 1, 5, 0
conv1 = nn.Conv2d(1, 1, kernel_size=k, stride=s, padding=p)
deconv1 = nn.ConvTranspose2d(1, 1, kernel_size=k, stride=s, padding=p)
D1 = conv1(D)
D1_t = deconv1(D1)
D1_t.shape  # torch.Size([1, 1, 28, 28]) ==> size of the reconstructed image
```

Code snippet where reconstruction is not perfect, using stride=2:

```python
s, k, p = 2, 5, 0
conv1 = nn.Conv2d(1, 1, kernel_size=k, stride=s, padding=p)
deconv1 = nn.ConvTranspose2d(1, 1, kernel_size=k, stride=s, padding=p)
D1 = conv1(D)
D1_t = deconv1(D1)
D.shape     # torch.Size([1, 1, 28, 28])
D1_t.shape  # torch.Size([1, 1, 27, 27]) ==> size of the reconstructed image
```
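For reference, this 27-vs-28 gap is exactly the stride ambiguity the docs quote describes: a stride-2, kernel-5 `Conv2d` maps both a 27x27 and a 28x28 input to the same 12x12 output. A hedged sketch (the `output_padding=1` choice below is my addition, not from the original thread) showing how `output_padding` selects the larger shape:

```python
import torch
import torch.nn as nn

D = torch.randn(1, 1, 28, 28)
conv = nn.Conv2d(1, 1, kernel_size=5, stride=2, padding=0)

# Both 27x27 and 28x28 inputs give a 12x12 conv output; output_padding=1
# adds one row/column to the computed transposed-conv output shape
# (27 -> 28) so the round trip recovers the original size.
deconv = nn.ConvTranspose2d(1, 1, kernel_size=5, stride=2, padding=0,
                            output_padding=1)

print(deconv(conv(D)).shape)  # torch.Size([1, 1, 28, 28])
```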

Hi
I want to understand how the transposed convolution is implemented in general for Generative Adversarial Networks using the PyTorch framework, for example in the DCGAN Tutorial — PyTorch Tutorials 1.11.0+cu102 documentation, where the code is taken from. Is the transposed convolution implemented as a combination of an upsampling layer and a convolution layer, or is some other approach used? I really appreciate your help.

Thanks,
Vijay

The `Generator` uses `nn.ConvTranspose2d` layers, so transposed convolutions directly (you can think of them as "reversed" conv layers, i.e. the forward pass of a transposed conv equals the backward pass of a vanilla conv layer and vice versa). For more information about these layers, check this repo.
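The "forward of a transposed conv equals the backward of a vanilla conv" relation can be checked numerically. This is a sketch with arbitrary shapes of my choosing: the input gradient of a strided `conv2d` matches `conv_transpose2d` applied to the output gradient with the same weight:

```python
import torch
import torch.nn.functional as F

# A stride-2 conv downsamples 8x8 -> 3x3.
x = torch.randn(1, 1, 8, 8, requires_grad=True)
w = torch.randn(1, 1, 4, 4)

y = F.conv2d(x, w, stride=2)
g = torch.randn_like(y)   # an arbitrary upstream gradient
y.backward(g)

# The forward pass of conv_transpose2d on g (same weight, same stride)
# reproduces the gradient that backprop computed for x.
z = F.conv_transpose2d(g, w, stride=2)
print(z.shape)                     # torch.Size([1, 1, 8, 8])
print(torch.allclose(z, x.grad))   # True
```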


Hi,

Thank you very much for your quick response. So there won't be any upsampling layer before applying the convolution in this implementation? I thought transposed convolution = upsampling layer + convolution layer, based on fig. 4 of the paper https://arxiv.org/pdf/1806.01107.pdf.

I’m not familiar with this paper and don’t know if the authors call an upsampling + conv block a “transposed convolution”. You can see in the model architecture of the tutorial that `nn.ConvTranspose2d` is used directly; please check the posted link to see how this layer works and increases the spatial size of the input activation.
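To illustrate the distinction, here is a sketch comparing the two approaches. The hyperparameters (kernel 4, stride 2, padding 1 for the transposed conv; nearest-neighbor upsampling plus a same-padding conv for the block) are my own illustrative choices, not taken from the tutorial or the paper. Both double the spatial size, but they are different operations with different parameterizations:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 14, 14)

# Single learned layer that upsamples: 14x14 -> 28x28.
deconv = nn.ConvTranspose2d(1, 1, kernel_size=4, stride=2, padding=1)

# The alternative block some papers describe: fixed (non-learned)
# upsampling followed by an ordinary convolution.
up_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1),
)

print(deconv(x).shape)   # torch.Size([1, 1, 28, 28])
print(up_conv(x).shape)  # torch.Size([1, 1, 28, 28])
```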


Hi,
Thanks. I'm working on optimizing the transposed convolution layer by avoiding the increase in input spatial size before applying the convolution, while producing the same output. May I please know how the backend code works?