I’ve been using the torch.nn.Upsample module to scale images up to different sizes, as follows:
import numpy as np
import torch

a = np.random.uniform(0, 1, (10, 10))
a = torch.tensor(a)
a = a.unsqueeze(0).unsqueeze(0)  # add batch and channel dims: (1, 1, 10, 10)
upsampler = torch.nn.Upsample(size=20, mode='bilinear')
a_sized_up = upsampler(a)
print(a_sized_up.shape)
Out: torch.Size([1, 1, 20, 20])
But my question is: are there any reasons why one cannot use the same method to downscale images if they are bigger than the required input size to some model?
For example one could use:
upsampler_down = torch.nn.Upsample(size=5, mode='bilinear')
a_sized_down = upsampler_down(a)
print(a_sized_down.shape)
Out: torch.Size([1, 1, 5, 5])
This seems to work fine and the result looks sensible, but does it make sense from a mathematical standpoint, or will this method introduce problems in the background?
Many thanks in advance for any advice on this.
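I believe they renamed the functional version, torch.nn.functional.upsample, to torch.nn.functional.interpolate so as not to confuse people; nn.Upsample still exists as a module that calls it. It feels weird that "upsample" can create smaller images. Check out the interpolate docs.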
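For anyone who wants to check this, here is a minimal sketch comparing the module and the functional form. It assumes a reasonably recent PyTorch version; the tensor is random and just mirrors the shape used in the question.

```python
import torch
import torch.nn.functional as F

a = torch.rand(1, 1, 10, 10)  # (N, C, H, W), same layout as in the question

# Module form, as used in the question
up_module = torch.nn.Upsample(size=20, mode="bilinear")(a)

# Functional form; the same call handles both larger and smaller targets
up_functional = F.interpolate(a, size=20, mode="bilinear")
down_functional = F.interpolate(a, size=5, mode="bilinear")

print(up_functional.shape)    # torch.Size([1, 1, 20, 20])
print(down_functional.shape)  # torch.Size([1, 1, 5, 5])
print(torch.allclose(up_module, up_functional))
```

The target size is the only thing that decides whether the result is larger or smaller than the input, which is why the name interpolate fits better.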
Ahhh okay so same function under the hood but a different name to avoid confusion. Thanks for pointing me in the right direction!
So presumably the interpolation function still uses transposed convolution to upscale images and simply performs convolution to downsample?
You’re welcome. I don’t think the upscale ever did transposed convolution. I’m guessing it just interpolated to upscale, but I could be wrong. There is something called ConvTranspose2d for that now, though.
Yeah, it certainly only does interpolation, without any learning. You can choose the interpolation mode. If you need learnable interpolation, a (transposed) convolution or grid_sample (maybe along with the affine_grid function) could be a way to go.
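To make the "no learning" point concrete, here is a small sketch that counts parameters. The channel count, kernel size and stride are illustrative choices, not anything taken from this thread.

```python
import torch

x = torch.rand(1, 3, 10, 10)  # illustrative 3-channel input

# Fixed interpolation: no parameters, so nothing to train
fixed_up = torch.nn.Upsample(scale_factor=2, mode="bilinear")
n_fixed = sum(p.numel() for p in fixed_up.parameters())

# Learnable upsampling: kernel_size=2 with stride=2 doubles H and W
learned_up = torch.nn.ConvTranspose2d(
    in_channels=3, out_channels=3, kernel_size=2, stride=2
)
n_learned = sum(p.numel() for p in learned_up.parameters())

print(fixed_up(x).shape, n_fixed)      # torch.Size([1, 3, 20, 20]) 0
print(learned_up(x).shape, n_learned)  # torch.Size([1, 3, 20, 20]) 39
```

Both layers produce the same output shape here, but only the transposed convolution has weights (3*3*2*2 kernel values plus 3 biases) that an optimizer can update.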
That’s okay then. I’ve been using ConvTranspose2d (as mentioned by @Oli) for upsampling in a learnable fashion in networks, but for simple rescaling of images before feeding them to a model, interpolation seems to be fine.
Interpolating for downscaling is still a little strange in my mind… Would this be a simple convolution/max pooling style operation?
Not like maxpooling, but probably (depending on your interpolation type) like average pooling.
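Here is a rough sketch of how far the average-pooling analogy goes. It assumes the default align_corners=False and antialias=False behaviour of torch.nn.functional.interpolate, and uses a random 16x16 input purely for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
a = torch.rand(1, 1, 16, 16)

# Factor 2: each output pixel lands exactly between a 2x2 block of inputs,
# so bilinear interpolation reduces to 2x2 average pooling.
bilinear_2x = F.interpolate(a, size=8, mode="bilinear", align_corners=False)
avgpool_2x = F.avg_pool2d(a, kernel_size=2)
same_at_2x = torch.allclose(bilinear_2x, avgpool_2x, atol=1e-6)

# Factor 4: bilinear still reads only the 4 nearest inputs per output pixel,
# so most of each 4x4 block is ignored and the result differs from pooling.
bilinear_4x = F.interpolate(a, size=4, mode="bilinear", align_corners=False)
avgpool_4x = F.avg_pool2d(a, kernel_size=4)
same_at_4x = torch.allclose(bilinear_4x, avgpool_4x, atol=1e-6)

# mode="area" averages over the whole block instead
area_4x = F.interpolate(a, size=4, mode="area")
area_matches_pool = torch.allclose(area_4x, avgpool_4x, atol=1e-6)

print(same_at_2x, same_at_4x, area_matches_pool)
```

So for a factor of 2 the bilinear result is the same as average pooling, but for larger factors bilinear skips pixels and can alias. In that case mode="area", or antialias=True in recent PyTorch versions (for bilinear and bicubic), is probably the safer choice for shrinking images.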