Upsampling for scaling images both up and down

spacemeerkat · March 28, 2019, 9:38am

I’ve been using the torch.nn.Upsample module to scale images up to different sizes, as follows:

import torch
import numpy as np

a = np.random.uniform(0,1,(10,10))
a = torch.tensor(a)
a = a.unsqueeze(0).unsqueeze(0)

upsampler = torch.nn.Upsample(size=20, mode='bilinear')

a_sized_up = upsampler(a)

print(a_sized_up.shape)

Out[23]: torch.Size([1, 1, 20, 20])

But my question is: are there any reasons why one cannot use the same method to downscale images if they are bigger than the required input size to some model?

For example one could use:

upsampler_down = torch.nn.Upsample(size=5, mode='bilinear')
a_sized_down = upsampler_down(a)
print(a_sized_down.shape)

Out[33]: torch.Size([1, 1, 5, 5])

This seems to work fine and looks sensible, but does it make sense from a mathematical standpoint, or will this method introduce problems somehow in the background?

Many thanks in advance for advice on this

Oli · March 28, 2019, 9:43am

I believe they changed the name of Upsample to interpolate (torch.nn.functional.interpolate) so as not to confuse people. It feels weird that upsample can create smaller images. Check out the interpolate docs.
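For reference, here is a minimal sketch of the functional form next to the module form. It assumes a PyTorch version that ships torch.nn.functional.interpolate, and it spells out align_corners=False explicitly (an assumption on my part, not something from the original post) so the comparison does not depend on default behaviour.

```python
import torch
import torch.nn.functional as F

# (N, C, H, W) tensor, same layout as the example in the question
a = torch.rand(1, 1, 10, 10)

# Module form and functional form of the same bilinear resize
up_module = torch.nn.Upsample(size=20, mode='bilinear', align_corners=False)(a)
up_func = F.interpolate(a, size=20, mode='bilinear', align_corners=False)

# The same call also accepts a target size smaller than the input
down_func = F.interpolate(a, size=5, mode='bilinear', align_corners=False)

print(up_func.shape)    # torch.Size([1, 1, 20, 20])
print(down_func.shape)  # torch.Size([1, 1, 5, 5])
print(torch.allclose(up_module, up_func))
```

Both forms take either size or scale_factor, so nothing in the call itself distinguishes "up" from "down".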

spacemeerkat · March 28, 2019, 9:47am

Ahhh okay so same function under the hood but a different name to avoid confusion. Thanks for pointing me in the right direction!

So presumably the interpolation function still uses transposed convolution to upscale images and simply performs convolution to downsample?

Oli · March 28, 2019, 9:54am

Welcome! I don’t think the upscale ever did transposed convolution. I’m guessing it just interpolated to upscale, but I could be wrong. There is something called ConvTranspose2d for that now, though.

justusschock · March 28, 2019, 9:56am

Yeah, it certainly only does interpolation without any learning. You can choose the interpolation mode. If you need learnable interpolation, conv(transpose) or the grid_sample (maybe along with the affine_grid fn) could be a way to go.
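One way to check the "no learning" point is to count trainable parameters. The sketch below compares torch.nn.Upsample with torch.nn.ConvTranspose2d; the layer sizes (one channel, 2x2 kernel, stride 2) are arbitrary choices for illustration only.

```python
import torch

# Fixed interpolation: nothing to train
upsample = torch.nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)

# Learnable upsampling: a 2x2 transposed-convolution kernel with stride 2
deconv = torch.nn.ConvTranspose2d(in_channels=1, out_channels=1,
                                  kernel_size=2, stride=2)

n_upsample = sum(p.numel() for p in upsample.parameters())
n_deconv = sum(p.numel() for p in deconv.parameters())
print(n_upsample)  # 0
print(n_deconv)    # 5 (four kernel weights plus one bias)

# Both double the spatial size of a 10x10 input
x = torch.rand(1, 1, 10, 10)
print(upsample(x).shape)  # torch.Size([1, 1, 20, 20])
print(deconv(x).shape)    # torch.Size([1, 1, 20, 20])
```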

spacemeerkat · March 28, 2019, 9:59am

That’s okay then. I’ve been using ConvTranspose2d (as mentioned by @Oli) for upsampling in a learnable fashion in networks, but for simple rescaling of images before feeding them to a model, interpolation seems to be fine.

spacemeerkat · March 28, 2019, 10:00am

Interpolating for downscaling is still a little strange in my mind… Would this be a simple convolution/max pooling style operation?

justusschock · March 28, 2019, 10:02am

Not like max pooling, but probably (depending on your interpolation mode) like average pooling.
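To make the average-pooling comparison concrete, here is a small sketch using torch.nn.functional. The 20x20 input and the target sizes are my own choices so that the scale factors come out as exact integers, and align_corners=False is assumed throughout. It checks three things: mode='area' matches adaptive average pooling, bilinear at a factor of 2 matches 2x2 average pooling, and bilinear at a factor of 4 only averages the inner 2x2 pixels of each 4x4 block.

```python
import torch
import torch.nn.functional as F

x = torch.rand(1, 1, 20, 20, dtype=torch.float64)

# mode='area' behaves like adaptive average pooling to the target size
area = F.interpolate(x, size=5, mode='area')
area_is_adaptive_avg = torch.allclose(area, F.adaptive_avg_pool2d(x, 5))

# Bilinear, factor 2: each output sample sits at the centre of a 2x2 block,
# so all four neighbours get weight 1/4
half = F.interpolate(x, size=10, mode='bilinear', align_corners=False)
half_is_avg_pool = torch.allclose(half, F.avg_pool2d(x, 2))

# Bilinear, factor 4: the sample sits at the centre of a 4x4 block, but only
# the four nearest pixels (the inner 2x2) contribute; the outer ring is ignored
quarter = F.interpolate(x, size=5, mode='bilinear', align_corners=False)
inner = F.avg_pool2d(x[:, :, 1:-1, 1:-1], kernel_size=2, stride=4)
quarter_is_inner_avg = torch.allclose(quarter, inner)
quarter_is_full_avg = torch.allclose(quarter, F.avg_pool2d(x, 4))

print(area_is_adaptive_avg, half_is_avg_pool)
print(quarter_is_inner_avg, quarter_is_full_avg)
```

So for larger downscaling factors, plain bilinear interpolation skips pixels and can alias, whereas mode='area' averages over every input pixel. Newer PyTorch releases also have an antialias argument on interpolate for the bilinear and bicubic modes, which is worth checking in the docs for your version.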