Autogradable image resize

Within TensorFlow, we can use tf.image.resize_images(img, img_h, img_w) to convert a feature map to another size.
How can we do the same thing in PyTorch?


You can look at the Upsampling* modules (e.g. nn.UpsamplingNearest2d, nn.UpsamplingBilinear2d) to resize up.

To resize down, you can use AvgPool2d.
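A minimal sketch of both directions (integer factors only; module names as in the PyTorch versions discussed in this thread):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # (N, C, H, W) feature map

# resize up by an integer factor
up = nn.UpsamplingNearest2d(scale_factor=2)
print(up(x).shape)    # torch.Size([1, 3, 64, 64])

# resize down by an integer factor
down = nn.AvgPool2d(kernel_size=2)
print(down(x).shape)  # torch.Size([1, 3, 16, 16])
```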

It works for me.
Thank you very much.

Is there a way for fractional resize, e.g., 128x128 to 96x96?

Well, 128x128 -> 96x96 can be done by nn.UpsamplingNearest2d(scale_factor=3) followed by nn.AvgPool2d(4).
I was hoping for a more direct way (with interpolation).
Thank you 🙂

You can try nn.FractionalMaxPool2d.
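For example, nn.FractionalMaxPool2d takes either an explicit output_size or a fractional output_ratio, so it can hit non-integer scale factors directly:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 128, 128)
# fractional max pooling straight to the target size (128 -> 96)
pool = nn.FractionalMaxPool2d(kernel_size=2, output_size=(96, 96))
print(pool(x).shape)  # torch.Size([1, 3, 96, 96])
```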

Has this been implemented yet for float scale factors? Greatly appreciated!

Jwin, I had the same problem and I solved using a sampling grid:

import torch
import torch.nn.functional as F

def downsampling(x, size=None, scale_factor=None, mode='nearest'):
    # x: 4D tensor (N, C, H, W)
    # derive size if the user specified scale_factor instead
    if size is None:
        size = (int(scale_factor * x.size(2)), int(scale_factor * x.size(3)))
    # create normalized coordinates in [-1, 1]
    h = torch.arange(0, size[0], dtype=torch.float32) / (size[0] - 1) * 2 - 1
    w = torch.arange(0, size[1], dtype=torch.float32) / (size[1] - 1) * 2 - 1
    # build the sampling grid: (H_out, W_out, 2), last dim is (x, y)
    grid = torch.zeros(size[0], size[1], 2)
    grid[:, :, 0] = w.unsqueeze(0).repeat(size[0], 1)
    grid[:, :, 1] = h.unsqueeze(0).repeat(size[1], 1).transpose(0, 1)
    # expand to match the batch size
    grid = grid.unsqueeze(0).repeat(x.size(0), 1, 1, 1)
    if x.is_cuda:
        grid = grid.cuda()
    # sample the input at the grid locations
    return F.grid_sample(x, grid, mode=mode)

You can use either nearest or bilinear interpolation, depending on your needs. In my case I was downsizing a semantic layout, so I couldn't use any type of pooling (non-integer values weren't allowed, and an intermediate value between two class labels didn't make sense).
I didn't try this for upsampling. Note that it only works with 4D tensors as input, so if you need to use it with a single image, you have to .unsqueeze(0) it first.
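For reference, here is a self-contained sketch of the same grid-sampling idea for a 128x128 -> 96x96 nearest-neighbour resize (torch.meshgrid with indexing='ij' needs PyTorch >= 1.10; align_corners=True matches the (size - 1) normalization used in the helper above):

```python
import torch
import torch.nn.functional as F

x = torch.arange(5 * 128 * 128, dtype=torch.float32).reshape(1, 5, 128, 128)

# normalized target coordinates in [-1, 1]
h = torch.linspace(-1, 1, 96)
w = torch.linspace(-1, 1, 96)
hh, ww = torch.meshgrid(h, w, indexing='ij')
grid = torch.stack((ww, hh), dim=-1).unsqueeze(0)  # (1, 96, 96, 2), last dim is (x, y)

y = F.grid_sample(x, grid, mode='nearest', align_corners=True)
print(y.shape)  # torch.Size([1, 5, 96, 96])
```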


My $0.02 about tf.image.resize_images and the state of related functionality in PyTorch.

Does anyone have a solution for autogradable general nearest neighbours upsampling?


The implementation of nn.Upsample in version 0.4.x and later has a size kwarg, which lets you specify an arbitrary height and width for the output tensor. That handles upsampling, but the API is, so far, not as general as tf.image.resize_images, which can shrink tensors too.

Unfortunately the size cannot be arbitrary: the output dimensions must be integer multiples of the input dimensions, otherwise you get an error message.

That only happens if you use the default mode='nearest'.
Change it to mode='bilinear' and it should run:

x = torch.randn(1, 3, 24, 24)
up = nn.Upsample(size=47, mode='bilinear')
y = up(x)
print(y.shape)
> torch.Size([1, 3, 47, 47])

Is there any plan to support mode='nearest'? In many dense estimation tasks an edge-preserving property is important, and bilinear downsampling hurts that property…
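For what it's worth, in more recent PyTorch releases the functional form F.interpolate does accept arbitrary target sizes with mode='nearest', for both shrinking and growing; a quick sketch:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 128, 128)
# fractional downsample with nearest-neighbour interpolation
y = F.interpolate(x, size=(96, 96), mode='nearest')
print(y.shape)  # torch.Size([1, 3, 96, 96])
```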


I want to resize a tensor with shape [n] to [n, m]. What is the best solution?

You cannot resize or view this tensor using these shapes, as the second one would have more elements.
If you would like to repeat each element of the first tensor m times, you could use tensor.unsqueeze(1).repeat(1, m) or tensor.unsqueeze(1).expand(-1, m).
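For example, with n = 3 and m = 4 (repeat copies the data, expand returns a view):

```python
import torch

t = torch.tensor([1., 2., 3.])     # shape [3]
a = t.unsqueeze(1).repeat(1, 4)    # shape [3, 4], materializes the copies
b = t.unsqueeze(1).expand(-1, 4)   # shape [3, 4], no copy (broadcasted view)
print(a.shape, b.shape)
print(torch.equal(a, b))  # True
```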

Got it. Thanks a lot