I have a convolution transpose that’s defined like this:
```python
nn.ConvTranspose2d(1, 1, [1, 30], stride=[1, 15], padding=[1, 8], output_padding=[0, 1])
```
This describes the inverse operation for a convolution that is padded by 8 pixels on the left and 7 pixels on the right, with a stride of 15 pixels.
Is there a way for me to specify that I want this transposed convolution to be the inverse for a convolution that’s padded by 7 pixels on the left and 8 pixels on the right instead? In other words, I want `output_padding` to remove 1 pixel of padding from the left side instead of the right side.
If I’m not mistaken, `output_padding` just adds the padding after the transposed convolution is applied. If that’s the case, you could remove the output padding and instead slice and pad the output manually.
I don’t think that’s what `output_padding` does. It prevents some number of elements from being treated as padding (and thus removed from the output).
Consider this transpose operation:
```python
torch.nn.Conv1d(1, 1, kernel_size=30, stride=15, padding=8, output_padding=1)
```
The equivalent of the above operation, without specifying `padding` or `output_padding` in the layer, would be:

```python
padding = 8
output_padding = 1
y = torch.nn.Conv1d(1, 1, kernel_size=30, stride=15)(x)
y = y[:, :, padding:-(padding - output_padding)]
```
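The equivalence claimed above can be checked directly. This is a sketch (the layer type is corrected to `ConvTranspose1d`, and the input length of 100 is an arbitrary choice for illustration): the padded layer should match the unpadded layer followed by manual slicing.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1, 1, 100)

# Transposed conv with built-in padding and output_padding.
deconv = torch.nn.ConvTranspose1d(1, 1, kernel_size=30, stride=15,
                                  padding=8, output_padding=1)

# Same weights and bias, but no padding arguments.
plain = torch.nn.ConvTranspose1d(1, 1, kernel_size=30, stride=15)
plain.load_state_dict(deconv.state_dict())

padding, output_padding = 8, 1
y_padded = deconv(x)
# Crop `padding` elements on the left and `padding - output_padding`
# on the right of the unpadded output.
y_manual = plain(x)[:, :, padding:-(padding - output_padding)]

print(torch.allclose(y_padded, y_manual, atol=1e-5))  # expect True
```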
The issue is that `output_padding` is applied to the right side, and I can’t find a way to apply it on the left instead. A natural solution would be to accept a tuple for `padding`, just like `np.pad` or any of the other PyTorch padding ops that take separate left and right padding values. That approach would be more consistent, more flexible, and would eliminate the need for `output_padding` entirely.
Admittedly, I could just manually extract the parts of the tensor I need and ignore `output_padding` entirely, but I’m hoping that I just missed something and that there’s a better way to get the result I’m looking for.
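For the record, the manual-extraction workaround would look something like this sketch (assuming the slicing equivalence above holds): skip the layer’s `padding`/`output_padding` arguments and crop the output yourself, keeping the extra element on whichever side you want.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1, 1, 100)  # arbitrary input length for illustration

# Transposed conv with no built-in padding; we crop manually instead.
deconv = torch.nn.ConvTranspose1d(1, 1, kernel_size=30, stride=15)
y = deconv(x)

padding, output_padding = 8, 1
# PyTorch behavior: the output_padding element survives on the right.
y_right = y[:, :, padding:-(padding - output_padding)]
# Desired (TF-style) behavior: keep the extra element on the left.
y_left = y[:, :, (padding - output_padding):-padding]

print(y_right.shape, y_left.shape)  # both torch.Size([1, 1, 1500])
```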
(BTW, this post was motivated by the fact that TensorFlow applies the equivalent of `output_padding` on the left, whereas PyTorch applies it on the right. This difference prevents direct model conversion between TF and PyTorch.)
I see, thanks for the code snippet (I assume the conv layer should be `nn.ConvTranspose1d`?).
I don’t think you can easily change it; you might instead have to flip the inputs and kernel if you compare the TF and PyTorch implementations.
I’m not sure if TF flips the kernel anyway, but I think Theano flipped it internally to use a convolution instead of a cross-correlation.
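As an aside, the flip relationship is easy to check: PyTorch’s `conv1d` computes a cross-correlation, and flipping the kernel turns it into a true convolution, which can be compared against `np.convolve` (the tensor sizes here are arbitrary).

```python
import numpy as np
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 20)
w = torch.randn(1, 1, 5)

# F.conv1d is a cross-correlation; flipping the kernel along its last
# dimension yields a true convolution, matching np.convolve in 'valid' mode.
xcorr_flipped = F.conv1d(x, w.flip(-1)).squeeze().numpy()
true_conv = np.convolve(x.squeeze().numpy(), w.squeeze().numpy(), mode="valid")

print(np.allclose(xcorr_flipped, true_conv, atol=1e-5))  # expect True
```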
Thanks for the response. Yes, you’re right – those should’ve been `nn.ConvTranspose1d`. TF does cross-correlation, same as PyTorch. I’ve confirmed by writing some framework-independent C++ code that the only delta between the two is their preference for which side to remove more elements from.