Problems using torch.nn.functional.pad(...)

Hey,

So I was wondering about the padding function of nn.functional:

I am working with 3D images which need padding outside of the net for some processing.
pad() works fine when leaving the mode at its default, but trying to use mode='reflect' or mode='replicate' doesn't seem to work for 3D images.

I can't access my code right now, but I think I tried to program it manually like this, with a, b, c being the number of rows of padding added on each dimension:


import torch.nn.functional as F

shape = img.shape
img_padded = F.pad(img, (0, a, 0, b, 0, c))

# 1) replication on the first dimension seems to work fine like this
img_padded[shape[0]:, :shape[1], :shape[2]] = img_padded[shape[0]-1, :shape[1], :shape[2]]

# 2) this doesn't work anymore though
img_padded[:, shape[1]:, :shape[2]] = img_padded[:, shape[1]-1, :shape[2]]
img_padded[:, :, shape[2]:] = img_padded[:, :, shape[2]-1]


What I really don't get is why it works for 1) but not for 2); I could understand it not working like this at all, though.
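To make it concrete, with made-up sizes the shapes involved look roughly like this (just a sketch to illustrate, not my actual code):

import torch
import torch.nn.functional as F

# made-up sizes just to illustrate the shapes
img = torch.randn(4, 5, 6)   # (depth, height, width)
a, b, c = 2, 3, 1

shape = img.shape
img_padded = F.pad(img, (0, a, 0, b, 0, c))   # -> torch.Size([5, 8, 8])

# case 1): left-hand and right-hand shapes of the assignment
print(img_padded[shape[0]:, :shape[1], :shape[2]].shape)   # torch.Size([1, 5, 6])
print(img_padded[shape[0]-1, :shape[1], :shape[2]].shape)  # torch.Size([5, 6])

# case 2): left-hand and right-hand shapes of the first failing assignment
print(img_padded[:, shape[1]:, :shape[2]].shape)   # torch.Size([5, 3, 6])
print(img_padded[:, shape[1]-1, :shape[2]].shape)  # torch.Size([5, 6])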
Anyway, is there any nice solution to do replication or reflection padding in 3 dimensional images?

I’m not completely sure about the shapes you are using, but “3d images” should have 5 dimensions as [batch_size, channels, depth, height, width].
Your current padding call would pad the depth, height, and width dimensions, not the batch and channel dimensions. In the code snippet comparing the outputs, however, you seem to be indexing the batch, channel, and depth dimensions, which looks wrong.

Could you check the shape before and after padding and rerun the comparison?
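For a quick check, a minimal sketch like this (with arbitrary sizes) shows what I would expect for a 5D input:

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 4, 5, 6)   # (batch_size, channels, depth, height, width)
# the pad tuple is ordered (W_left, W_right, H_left, H_right, D_left, D_right)
y = F.pad(x, (0, 2, 0, 3, 0, 1), mode='replicate')
print(y.shape)                   # torch.Size([1, 1, 5, 8, 8])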

Hm ok… I actually used only 3 dimensions. Since this happens outside the neural net and it is a single-channel image anyway, batch and channel don't play a role here. I really just want to change the shape of the image itself, which works great so far, but only with the default padding mode.

So for example if I want multiples of 128:

img.shape:
(279, 510, 510)

padded_img.shape:
(384, 512, 512)

So I am indeed indexing depth, height, and width, and there are no batch or channel dimensions.
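Roughly, the padding amounts are computed like this (sketched from memory, the actual code may differ a bit):

import torch
import torch.nn.functional as F

img = torch.randn(279, 510, 510)   # (depth, height, width), single channel, no batch

multiple = 128
# padding needed on each dimension to reach the next multiple of 128
c, b, a = (-s % multiple for s in img.shape)   # 105, 2, 2

padded_img = F.pad(img, (0, a, 0, b, 0, c))    # default mode='constant'
print(padded_img.shape)                        # torch.Size([384, 512, 512])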

From the docs:

Constant padding is implemented for arbitrary dimensions. Replicate padding is implemented for padding the last 3 dimensions of 5D input tensor, or the last 2 dimensions of 4D input tensor, or the last dimension of 3D input tensor. Reflect padding is only implemented for padding the last 2 dimensions of 4D input tensor, or the last dimension of 3D input tensor.

Your batch and channel dimensions should be 1 in this case.
Try to add them via:

input = input.unsqueeze(0).unsqueeze(1)   # (D, H, W) -> (1, 1, D, H, W)

and rerun your code.
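Something along these lines should then work with mode='replicate' (a minimal sketch, using your example shapes and assuming the padding amounts a, b, c are already computed):

import torch
import torch.nn.functional as F

img = torch.randn(279, 510, 510)       # (depth, height, width)
a, b, c = 2, 2, 105                    # padding on width, height, depth

x = img.unsqueeze(0).unsqueeze(1)      # -> (1, 1, 279, 510, 510)
x = F.pad(x, (0, a, 0, b, 0, c), mode='replicate')
padded_img = x.squeeze(1).squeeze(0)   # back to (384, 512, 512)
print(padded_img.shape)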

Thank you! I will try right away :slight_smile: