Using Linear Layers to adjust output size

dllacer · October 7, 2020, 3:27pm

Hi everyone,

I have been working on an Autoencoder and I have a problem with the MaxPooling outputs: if one of the input dimensions is odd (say 20x11), the pooled output will be 10x5. So, when I try to recover the original size with the deconvolution I get 20x10, but I need 20x11.
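To make the mismatch concrete, here is a minimal sketch. The layer settings (2x2 pooling, a single channel, a 2x2 transposed convolution with stride 2) are assumptions for illustration, not the actual model:

```python
import torch
import torch.nn as nn

# Hypothetical layer settings; the real model is not shown in this post.
pool = nn.MaxPool2d(kernel_size=2, stride=2)
deconv = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2)

x = torch.randn(1, 1, 20, 11)  # (batch, channels, height, width)
pooled = pool(x)               # width: floor(11 / 2) = 5, the odd column is dropped
restored = deconv(pooled)      # width: (5 - 1) * 2 + 2 = 10

print(pooled.shape)    # torch.Size([1, 1, 10, 5])
print(restored.shape)  # torch.Size([1, 1, 20, 10])
```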

My idea is to put a Linear layer at the output of the Decoder to adjust the size, but I don’t know whether it could negatively affect the learning of the model.

My question is: is there any way to achieve the desired output size without adding a Linear layer, for example by setting some parameter of the Conv2d, MaxPool2d or ConvTranspose2d layers?

Thank you very much!

KarthikR · October 8, 2020, 7:24am

I believe there is a way to give the desired output size to ConvTranspose2d: its forward call accepts an output_size argument. However, you need to choose the kernel size and stride so that the requested size is actually reachable from the input size. Also, you may want to look into how this helps your downstream task.

import torch
import torch.nn as nn

# Odd spatial size, so pooling drops one row and one column
input = torch.randn(1, 16, 11, 11)
upsample = nn.ConvTranspose2d(16, 16, kernel_size=4, stride=2, padding=1)

mPool = nn.MaxPool2d((2, 2), stride=(2, 2))
print('Input Shape: ', input.shape)
input1 = mPool(input)  # 11x11 -> 5x5
print('Shape after Max Pool: ', input1.shape)
output = upsample(input1)  # 5x5 -> 10x10 by default
print('Shape after upsample: ', output.size())
# Passing output_size resolves the ambiguity and gives 11x11
output = upsample(input1, output_size=input.size())
print('Shape after upsample, giving input size: ', output.size())
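
The same idea applied to the 20x11 case from the question might look like the sketch below. The channel count and the kernel/stride values are assumptions, not taken from the original model. As far as I understand, output_size works by choosing the output padding for that call, so setting output_padding=(0, 1) on the layer itself should give the same shape when the input size is fixed:

```python
import torch
import torch.nn as nn

# Assumed settings for illustration: 8 channels, 2x2 pooling, 2x2 deconvolution.
pool = nn.MaxPool2d(kernel_size=2, stride=2)
deconv = nn.ConvTranspose2d(8, 8, kernel_size=2, stride=2)

x = torch.randn(1, 8, 20, 11)
pooled = pool(x)                                     # 20x11 -> 10x5

default_out = deconv(pooled)                         # 10x5 -> 20x10
matched_out = deconv(pooled, output_size=x.size())   # 10x5 -> 20x11

# Alternative: fix the extra column in the layer definition.
# output_padding must be smaller than the stride (or the dilation).
deconv_fixed = nn.ConvTranspose2d(8, 8, kernel_size=2, stride=2,
                                  output_padding=(0, 1))
fixed_out = deconv_fixed(pooled)                     # 10x5 -> 20x11

print(default_out.shape)  # torch.Size([1, 8, 20, 10])
print(matched_out.shape)  # torch.Size([1, 8, 20, 11])
print(fixed_out.shape)    # torch.Size([1, 8, 20, 11])
```

Passing output_size at call time is the more flexible option if the input size can vary between batches, since a fixed output_padding would add the extra column even when the input width is even.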

dllacer · October 8, 2020, 7:32am

Thank you very much, this is very useful!