Autogradable image resize

Within TensorFlow, we can use tf.image.resize_images(img, img_h, img_w) to convert a feature map to another size.
How can we do the same thing in PyTorch?


You can look at the Upsampling* modules (e.g. nn.UpsamplingNearest2d, nn.UpsamplingBilinear2d) to resize up.

To resize down, you can use AvgPool2d.
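A minimal sketch of both directions (integer factors only; module names as in the PyTorch versions discussed in this thread):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # (N, C, H, W) feature map

# resize up by an integer factor
up = nn.UpsamplingNearest2d(scale_factor=2)
print(up(x).shape)    # torch.Size([1, 3, 64, 64])

# resize down by an integer factor
down = nn.AvgPool2d(kernel_size=2)
print(down(x).shape)  # torch.Size([1, 3, 16, 16])
```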

It works for me.
Thank you very much.

Is there a way for fractional resize, e.g., 128x128 to 96x96?

Well, 128x128 -> 96x96 can be done by nn.UpsamplingNearest2d(scale_factor=3) followed by nn.AvgPool2d(4).
I was hoping for a more direct way (with interpolation).
Thank you 🙂

You can try nn.FractionalMaxPool2d.
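For example, nn.FractionalMaxPool2d takes either an explicit output_size or a fractional output_ratio, so it can hit non-integer scale factors directly:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 128, 128)
# fractional max pooling straight to the target size (128 -> 96)
pool = nn.FractionalMaxPool2d(kernel_size=2, output_size=(96, 96))
print(pool(x).shape)  # torch.Size([1, 3, 96, 96])
```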

Has this been implemented yet for float scale factors? Greatly appreciated!

Jwin, I had the same problem and I solved using a sampling grid:

import torch
import torch.nn.functional as F

def downsampling(x, size=None, scale_factor=None, mode='nearest'):
    # x: 4D tensor (N, C, H, W)
    # derive size if the user specified scale_factor instead
    if size is None:
        size = (int(scale_factor * x.size(2)), int(scale_factor * x.size(3)))
    # create normalized coordinates in [-1, 1]
    h = torch.arange(0, size[0], dtype=torch.float32) / (size[0] - 1) * 2 - 1
    w = torch.arange(0, size[1], dtype=torch.float32) / (size[1] - 1) * 2 - 1
    # build the sampling grid: (H_out, W_out, 2), last dim is (x, y)
    grid = torch.zeros(size[0], size[1], 2)
    grid[:, :, 0] = w.unsqueeze(0).repeat(size[0], 1)
    grid[:, :, 1] = h.unsqueeze(0).repeat(size[1], 1).transpose(0, 1)
    # expand to match the batch size
    grid = grid.unsqueeze(0).repeat(x.size(0), 1, 1, 1)
    if x.is_cuda:
        grid = grid.cuda()
    # sample the input at the grid locations
    return F.grid_sample(x, grid, mode=mode)

You can use either nearest or bilinear interpolation, depending on your needs. In my case I was downsizing a semantic layout, so I couldn't use any type of pooling (non-integer values weren't allowed, and an intermediate value between two class labels didn't make sense).
I didn't try this for upsampling. Note that it only works with 4D tensors as input, so if you need to use it with a single image, you have to .unsqueeze(0) it first.
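For reference, here is a self-contained sketch of the same grid-sampling idea for a 128x128 -> 96x96 nearest-neighbour resize (torch.meshgrid with indexing='ij' needs PyTorch >= 1.10; align_corners=True matches the (size - 1) normalization used in the helper above):

```python
import torch
import torch.nn.functional as F

x = torch.arange(5 * 128 * 128, dtype=torch.float32).reshape(1, 5, 128, 128)

# normalized target coordinates in [-1, 1]
h = torch.linspace(-1, 1, 96)
w = torch.linspace(-1, 1, 96)
hh, ww = torch.meshgrid(h, w, indexing='ij')
grid = torch.stack((ww, hh), dim=-1).unsqueeze(0)  # (1, 96, 96, 2), last dim is (x, y)

y = F.grid_sample(x, grid, mode='nearest', align_corners=True)
print(y.shape)  # torch.Size([1, 5, 96, 96])
```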


My $0.02 about tf.image.resize_images and the state of related functionality in PyTorch.

Does anyone have a solution for autogradable general nearest neighbours upsampling?


The implementation of nn.Upsample in version 0.4.x and later has a size kwarg, which lets you specify an arbitrary height and width for the output tensor. That handles upsampling, but the API is, so far, not as general as tf.image.resize_images, which can shrink tensors too.

Unfortunately the size cannot be arbitrary: the output dimensions must be integer multiples of the input dimensions, otherwise you get an error message.

That only happens if you use the default mode='nearest'.
Change it to mode='bilinear' and it should run:

x = torch.randn(1, 3, 24, 24)
up = nn.Upsample(size=47, mode='bilinear')
y = up(x)
print(y.shape)
> torch.Size([1, 3, 47, 47])

Is there any plan to support mode='nearest'? In many dense estimation tasks an edge-preserving property is important, and bilinear downsampling hurts that property…
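For what it's worth, in more recent PyTorch releases the functional form F.interpolate does accept arbitrary target sizes with mode='nearest', for both shrinking and growing; a quick sketch:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 128, 128)
# fractional downsample with nearest-neighbour interpolation
y = F.interpolate(x, size=(96, 96), mode='nearest')
print(y.shape)  # torch.Size([1, 3, 96, 96])
```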


I want to resize a tensor with shape [n] to [n, m]. What is the best solution?

You cannot resize or view this tensor using these shapes, as the second one would have more elements.
If you would like to repeat each element of the first tensor m times, you could use tensor.unsqueeze(1).repeat(1, m) or tensor.unsqueeze(1).expand(-1, m).
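For example, with n = 3 and m = 4 (repeat copies the data, expand returns a view):

```python
import torch

t = torch.tensor([1., 2., 3.])     # shape [3]
a = t.unsqueeze(1).repeat(1, 4)    # shape [3, 4], materializes the copies
b = t.unsqueeze(1).expand(-1, 4)   # shape [3, 4], no copy (broadcasted view)
print(a.shape, b.shape)
print(torch.equal(a, b))  # True
```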

Got it. Thanks a lot