I have been working on an autoencoder and I have a problem with the MaxPooling outputs: if the input size is odd (say 20x11), the output will be 10x5. So when I want to recover the original size by performing the deconvolution, I get 20x10, but I need 20x11.
My idea is to put a Linear layer at the output of the decoder to adjust the size, but I don't know whether it could negatively affect the model's learning.
My question is: is there any way to achieve the desired output size without adding a Linear layer? For example, by setting some parameter of the Conv2d, MaxPool2d, or ConvTranspose2d layers?
Thank you very much!
I believe there is a way to give the desired output size to the ConvTranspose2d module. However, you need to play with the kernel and stride parameters to make sure your input size can be shaped to the output size you want. You may also want to look into how this affects your downstream task.
import torch
import torch.nn as nn

input = torch.randn(1, 16, 11, 11)
upsample = nn.ConvTranspose2d(16, 16, kernel_size=4, stride=2, padding=1)
mPool = nn.MaxPool2d((2, 2), stride=(2, 2))
print('Input Shape: ', input.shape)             # torch.Size([1, 16, 11, 11])
input1 = mPool(input)
print('Shape after Max Pool: ', input1.shape)   # torch.Size([1, 16, 5, 5])
output = upsample(input1)
print('Shape after upsample: ', output.size())  # torch.Size([1, 16, 10, 10])
# Passing output_size lets ConvTranspose2d choose the output padding
# needed to recover the original odd dimension.
output = upsample(input1, output_size=input.size())
print('Shape after upsample, giving input size: ', output.size())  # torch.Size([1, 16, 11, 11])
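The same trick applies directly to the 20x11 case from the question. Here is a minimal sketch (the channel count of 16 is an assumption; any value works), where the odd width dimension floors to 5 under pooling and is recovered by passing output_size to the transposed convolution:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 20, 11)  # assumed batch/channel sizes for illustration
pool = nn.MaxPool2d(2, stride=2)
deconv = nn.ConvTranspose2d(16, 16, kernel_size=4, stride=2, padding=1)

pooled = pool(x)                                  # 20x11 -> 10x5 (odd dim floors)
restored = deconv(pooled, output_size=x.size())   # 10x5 -> 20x11
print(pooled.shape)    # torch.Size([1, 16, 10, 5])
print(restored.shape)  # torch.Size([1, 16, 20, 11])
```

Note that output_size only works when the requested size is reachable, i.e. within stride-1 of the default output size per dimension; otherwise ConvTranspose2d raises an error.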
Thank you very much, this is very useful!