How to pad a set of tensors to a specific height

peony · January 28, 2023, 3:33am

I am doing action recognition with mediapipe keypoints. These are the shapes of some of my tensors:

torch.Size([3, 3, 75]) torch.Size([3, 6, 75]) torch.Size([3, 10, 75]) torch.Size([3, 11, 75]) torch.Size([3, 9, 75]) torch.Size([3, 4, 75]) torch.Size([3, 21, 75])

The height of each tensor varies because it corresponds to the number of frames in each sample.

I have decided to use 8 frames for each sample. I understand that I have to pad the shorter tensors and truncate the ones with more than 8 frames, but somehow padding alone seems to work. I would like to understand why my code works.

if height < 8:
    source_pad = F.pad(tensor1, pad=(0, 0, 0, 8 - height))
else:
    source_pad = F.pad(tensor1, pad=(0, 0, 0, 8 - height))

ptrblck · January 28, 2023, 5:30am

Negative padding seems to slice the tensor as seen here:

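import torch
import torch.nn.functional as F

# Fewer than 8 frames: the positive pad value appends frames
x = torch.randn(3, 3, 75)
height = x.size(1)
source_pad = F.pad(x, pad=(0, 0, 0, 8 - height))
print(source_pad.shape)
# torch.Size([3, 8, 75])

# More than 8 frames: the negative pad value removes frames from the end
x = torch.randn(3, 9, 75)
height = x.size(1)
source_pad = F.pad(x, pad=(0, 0, 0, 8 - height))
print(source_pad.shape)
# torch.Size([3, 8, 75])
# Compare against an explicit slice of the first 8 frames
print((x[:, :8, :] - source_pad).abs().sum())
# tensor(0.)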
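
To make the argument order concrete, here is a minimal sketch of how the `pad` tuple is read, assuming the default constant mode with a fill value of 0. The tensor of ones and the variable names are only illustrative:

```python
import torch
import torch.nn.functional as F

# Illustrative input: 3 channels, 3 frames, 75 keypoint values
x = torch.ones(3, 3, 75)

# The pad tuple is read starting from the last dimension:
# (last_dim_left, last_dim_right, second_last_left, second_last_right)
# So (0, 0, 0, 5) leaves the 75-wide dimension alone and appends
# 5 entries at the end of the frame dimension.
padded = F.pad(x, pad=(0, 0, 0, 5))
print(padded.shape)

# The original frames are unchanged, the appended frames are zeros
print(torch.equal(padded[:, :3, :], x))
print(padded[:, 3:, :].abs().sum())
```

With a negative value in the last position, the same call removes entries from the end of the frame dimension instead of appending them.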

peony · January 28, 2023, 5:57am

Does slicing have the same effect as truncating? Thank you for explaining.

ptrblck · January 28, 2023, 7:49am

I’m unsure if our definitions of “truncating” a tensor are the same, but “slicing” in this case means indexing the dimension from 0 to 7 (inclusive), as seen in my comparison between the original tensor and the output:

print((x[:, :8, :] - source_pad).abs().sum())
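
If relying on negative padding feels too implicit, the same result can be written with an explicit slice. This is only a minimal sketch: `fix_num_frames` and `target` are hypothetical names, the random tensors stand in for real keypoint data, and zero padding at the end is assumed to be acceptable for the model:

```python
import torch
import torch.nn.functional as F

def fix_num_frames(t, target=8):
    """Return t with exactly `target` frames along dim 1 (hypothetical helper)."""
    frames = t.size(1)
    if frames >= target:
        # Keep the first `target` frames (indices 0 to target - 1)
        return t[:, :target, :]
    # Append zero-filled frames at the end of the frame dimension
    return F.pad(t, pad=(0, 0, 0, target - frames))

# Frame counts taken from the shapes listed in the question
heights = [3, 6, 10, 11, 9, 4, 21]
samples = [torch.randn(3, h, 75) for h in heights]

# Once every sample has 8 frames, they can be stacked into one batch
batch = torch.stack([fix_num_frames(s) for s in samples])
print(batch.shape)
```

Both branches give the same output as the single `F.pad` call with `8 - height`, but the truncation is visible in the code instead of being a side effect of a negative pad value.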