How can I crop away a tensor's constant value padding (padding height and width are the same) with an unknown padding value and size?

ProGamerGov · December 20, 2020, 4:04pm

How can I crop away a tensor’s constant value padding (padding height and width are the same) with an unknown value and size?

Because the padding surrounding my tensor has a constant value and the same height and width on every side, I would think it should be possible to work out where to crop the tensor to remove the padding.

import torch

# Test tensor with NCHW dimensions
a = torch.randn(1,4,5,5) # Can have any H & W size

# Create padding for testing
b = torch.nn.functional.pad(a, (2,2,2,2), 'constant', value=1.2) # Value can be any number

c = ...  # Should equal a, computed without using the variables a or b (or their argument values)

ptrblck · January 2, 2021, 4:40am

It depends a bit on the padding itself. E.g. if you know that the padding will be applied to all spatial sides with a constant value, you could just grab the very first value, and remove it via indexing. However, since you cannot access a and b, I assume you know the padding size?
If so, you could index a using this size, but again you are apparently not allowed to use a.
Could you clarify this restriction a bit, since you won’t be able to create c without using a or b at all?
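
A minimal sketch of the "grab the very first value and remove it via indexing" idea, assuming the padding is a single constant applied on every spatial side and that no border row or column of the real content equals that constant everywhere. The helper name crop_constant_padding is hypothetical, not part of PyTorch. It reads the pad value from the top-left corner and keeps the bounding box of everything that differs from it.

```python
import torch
import torch.nn.functional as F


def crop_constant_padding(t):
    """Hypothetical helper: strip a constant-valued border from an NCHW tensor."""
    # Assumption: the top-left corner element belongs to the padding.
    pad_value = t[0, 0, 0, 0]

    # A spatial position counts as content if any batch item or channel
    # differs from the pad value there. Result has shape (H, W).
    content = (t != pad_value).flatten(0, 1).any(dim=0)
    if not content.any():
        # Everything equals the pad value, so nothing is left to keep.
        return t[..., :0, :0]

    rows = content.any(dim=1).nonzero().flatten()  # row indices holding content
    cols = content.any(dim=0).nonzero().flatten()  # column indices holding content
    top, bottom = rows[0].item(), rows[-1].item() + 1
    left, right = cols[0].item(), cols[-1].item() + 1
    return t[..., top:bottom, left:right]


torch.manual_seed(0)
a = torch.randn(1, 4, 5, 5)
b = F.pad(a, (2, 2, 2, 2), 'constant', value=1.2)
c = crop_constant_padding(b)  # only b is passed in; a is kept for comparison
```

Here c is recovered from the padded tensor alone. If the tensor has no padding at all, the corner is real content and the sketch still returns the full tensor, as long as the rest of the first row and column differ from that corner value.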

John_Grabner · October 2, 2021, 4:32pm

I have a similar, maybe the same problem.

I have an image, image_hw, with values between 0.0 and 1.0, where the non-zero pixels are somewhere in the center (i.e. an unknown amount of padding on the left, right, top, and bottom, possibly none).

An example image:
[image: not_trimmed_mask_1_0_11]

I need bounding box coordinates so that I can crop the rectangular center region containing the non-zero pixels: image_hw[top:bottom, left:right]

I can do this with 4 for loops, similar to:

# Find the first row that contains a non-zero pixel
for top_pad in range(image_hw.shape[0]):
    if image_hw[top_pad, :].sum() > 0:
        break

I suspect PyTorch has a function that could accomplish this more concisely.

ptrblck · October 2, 2021, 11:44pm

I think you could use nonzero() on the image and grab the min. and max. values, as seen here:

x = torch.zeros(24, 24)
x[3:7, 5:9] = 1.

idx = x.nonzero()
x_min = idx[:, 0].min()
x_max = idx[:, 0].max()
y_min = idx[:, 1].min()
y_max = idx[:, 1].max()
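
To turn these indices into the crop image_hw[top:bottom, left:right], note that column 0 of the nonzero() result holds row indices and column 1 holds column indices, and that the maxima are inclusive, so the slice end needs a +1. A short sketch, assuming the image contains at least one non-zero pixel (the names top, bottom, left, right and cropped are illustrative):

```python
import torch

x = torch.zeros(24, 24)
x[3:7, 5:9] = 1.

idx = x.nonzero()  # shape (num_nonzero, 2); each row is a (row, col) index pair

# Reduce over all non-zero positions at once: smallest and largest row/col.
top, left = idx.min(dim=0).values.tolist()
bottom, right = (idx.max(dim=0).values + 1).tolist()  # +1 because slice ends are exclusive

cropped = x[top:bottom, left:right]
print(cropped.shape)  # torch.Size([4, 4])
```

If the image could be all zeros, idx is empty and the min/max reductions raise an error, so that case would need a separate check.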