Cropping a minibatch of images, each image a bit differently

I have a tensor named input with dimensions 64x21x21. It is a minibatch of 64 images, each 21x21 pixels. I’d like to crop each image down to 11x11 pixels. So the output tensor I want would have dimensions 64x11x11.

I’d like to crop each image around a different “center pixel.” The center pixels are given by a 2-dimensional long tensor named center with dimensions 64x2. For image i, center[i][0] gives the row index and center[i][1] gives the column index for the pixel that should be at the center in the output. We can assume that the center pixel is always at least 5 pixels away from the border.

Is there an efficient way to do this in pytorch (on the gpu)?

[NB: I posted this on stackexchange earlier, but I suspect I’m more likely to get an answer here.]


it’s a bit more advanced usage, but you can do this efficiently on the GPU using the grid_sample method. It implements warping, given a 2D flow-field.
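For the original problem, here is a minimal sketch of that approach: build one normalized sampling grid per image from the `center` tensor and pass it to `grid_sample`. CPU tensors are used for clarity (call `.cuda()` on them in practice), and `align_corners=True` is assumed so integer pixel coordinates map exactly.

```python
import torch
import torch.nn.functional as F

# Sketch: crop each 21x21 image to 11x11 around its own centre pixel,
# by building one normalized sampling grid per image for grid_sample.
N, H, W, OUT = 64, 21, 21, 11
input = torch.randn(N, H, W)
center = torch.randint(5, 16, (N, 2))          # (row, col), >= 5 px from borders

offsets = torch.arange(OUT) - OUT // 2         # -5 .. 5 around the centre
rows = center[:, 0].view(N, 1) + offsets       # (N, 11) pixel row indices
cols = center[:, 1].view(N, 1) + offsets       # (N, 11) pixel column indices

# Map pixel indices to grid_sample's normalized [-1, 1] coordinates
# (align_corners=True convention: pixel p -> 2*p/(size-1) - 1)
y = rows.float() * 2 / (H - 1) - 1
x = cols.float() * 2 / (W - 1) - 1

# grid[n, i, j] = (x, y) coordinate to sample for output pixel (i, j)
grid = torch.stack([x.view(N, 1, OUT).expand(N, OUT, OUT),
                    y.view(N, OUT, 1).expand(N, OUT, OUT)], dim=3)

# grid_sample needs 4D input, so add and remove a channel dimension
output = F.grid_sample(input.unsqueeze(1), grid, align_corners=True).squeeze(1)
print(output.shape)  # torch.Size([64, 11, 11])
```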


Fantastic! I’d almost given up on finding an efficient way to do it. Thanks a lot!

What do you mean by flow-field?

>>> output = F.grid_sample(input, grid)

According to the docs:

input (Variable) – input batch of images (N x C x IH x IW)
grid (Variable) – flow-field of size (N x OH x OW x 2)

Why does flow-field have size (N x OH x OW x 2)?
Does the output of grid_sample have same size as input?
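A minimal sketch of the shapes involved (assuming a recent PyTorch where `align_corners` is an explicit argument): each grid entry is an (x, y) coordinate in [-1, 1], one per output pixel, so the output's spatial size is OH x OW while N and C come from the input.

```python
import torch
import torch.nn.functional as F

# The grid has one (x, y) pair per OUTPUT pixel, each in [-1, 1],
# telling grid_sample where in the input to read from. So the output
# spatial size follows the grid, not the input.
inp = torch.randn(2, 3, 21, 21)          # N x C x IH x IW
grid = torch.zeros(2, 11, 11, 2)         # N x OH x OW x 2; (0, 0) = image centre
out = F.grid_sample(inp, grid, align_corners=True)
print(out.shape)  # torch.Size([2, 3, 11, 11]) -- every pixel samples the centre
```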

Hope this helps:

import torch
import torch.nn.functional as F

def build_grid(source_size, target_size):
    k = float(target_size) / float(source_size)
    # Normalized coordinates for one axis, repeated into a square grid
    direct = torch.linspace(0, k, target_size).unsqueeze(0).repeat(target_size, 1).unsqueeze(-1)
    full = torch.cat([direct, direct.transpose(1, 0)], dim=2).unsqueeze(0)
    return full.cuda()

def random_crop_grid(x, grid):
    delta = x.size(2) - grid.size(1)
    grid = grid.repeat(x.size(0), 1, 1, 1).cuda()
    # Add a random shift per image along x
    grid[:, :, :, 0] = grid[:, :, :, 0] + torch.FloatTensor(x.size(0)).cuda().random_(0, delta).unsqueeze(-1).unsqueeze(-1).expand(-1, grid.size(1), grid.size(2)) / x.size(2)
    # Add a random shift per image along y
    grid[:, :, :, 1] = grid[:, :, :, 1] + torch.FloatTensor(x.size(0)).cuda().random_(0, delta).unsqueeze(-1).unsqueeze(-1).expand(-1, grid.size(1), grid.size(2)) / x.size(2)
    return grid
# We want to randomly crop each image in our batch to 80x80
# Build the base grid for an 80-pixel crop
grid_source = build_grid(batch.size(2), 80)
# Apply a random shift for each image in the batch
grid_shifted = random_crop_grid(batch, grid_source)
# Sample using grid_sample
sampled_batch = F.grid_sample(batch, grid_shifted)

direct = torch.linspace(0,k,target_size).unsqueeze(0).repeat(target_size,1).unsqueeze(-1)

is wrong; I got a zoomed-in image. grid_sample expects normalized coordinates in the range [-1, 1], so the correct way is:

direct = torch.linspace(-k,k,target_size).unsqueeze(0).repeat(target_size,1).unsqueeze(-1)
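As a sanity check of the corrected version (a minimal sketch; CPU tensors and an explicit `align_corners=True`, matching the older default): when target_size == source_size we get k = 1, the grid spans exactly [-1, 1], and grid_sample reproduces the input.

```python
import torch
import torch.nn.functional as F

def build_grid(source_size, target_size):
    k = float(target_size) / float(source_size)
    # Corrected: coordinates span [-k, k] inside grid_sample's [-1, 1] range
    direct = torch.linspace(-k, k, target_size).unsqueeze(0).repeat(target_size, 1).unsqueeze(-1)
    full = torch.cat([direct, direct.transpose(1, 0)], dim=2).unsqueeze(0)
    return full

batch = torch.randn(4, 3, 16, 16)
grid = build_grid(16, 16).repeat(4, 1, 1, 1)    # k = 1 -> identity grid
out = F.grid_sample(batch, grid, align_corners=True)
assert torch.allclose(out, batch, atol=1e-4)    # full-size "crop" returns the input
```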