I have a convolution transpose that’s defined like this:
```python
nn.ConvTranspose2d(1, 1, [1, 30], stride=[1, 15], padding=[1, 8], output_padding=[0, 1])
```
This describes the inverse operation for a convolution that is padded by 8 pixels on the left and 7 pixels on the right, with a stride of 15 pixels.
Is there a way for me to specify that I want this transposed convolution to be the inverse for a convolution that’s padded by 7 pixels on the left and 8 pixels on the right instead? In other words, I want `output_padding` to remove 1 pixel of padding from the left side instead of the right side.
If I’m not mistaken, `output_padding` just adds the padding after the transposed convolution is applied. If that’s the case, you could remove the output padding and instead slice and pad the output manually.
I don’t think that’s what `output_padding` does. It prevents some number of elements from being treated as padding (and thus removed from the output).
Consider this transpose operation:
```python
torch.nn.Conv1d(1, 1, kernel_size=30, stride=15, padding=8, output_padding=1)
```
The equivalent of the above operation, without specifying `padding` or `output_padding` in the layer, would be:

```python
padding = 8
output_padding = 1
y = torch.nn.Conv1d(1, 1, kernel_size=30, stride=15)(x)
y = y[:, :, padding:-(padding - output_padding)]
```
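The equivalence claimed above can be checked directly. This is a sketch (the layer type is corrected to `ConvTranspose1d`, and the input length of 100 is an arbitrary choice for illustration): the padded layer should match the unpadded layer followed by manual slicing.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1, 1, 100)

# Transposed conv with built-in padding and output_padding.
deconv = torch.nn.ConvTranspose1d(1, 1, kernel_size=30, stride=15,
                                  padding=8, output_padding=1)

# Same weights and bias, but no padding arguments.
plain = torch.nn.ConvTranspose1d(1, 1, kernel_size=30, stride=15)
plain.load_state_dict(deconv.state_dict())

padding, output_padding = 8, 1
y_padded = deconv(x)
# Crop `padding` elements on the left and `padding - output_padding`
# on the right of the unpadded output.
y_manual = plain(x)[:, :, padding:-(padding - output_padding)]

print(torch.allclose(y_padded, y_manual, atol=1e-5))  # expect True
```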
The issue is that `output_padding` is applied to the right side, and I can’t find a way to apply it on the left instead. A natural solution would be to accept a tuple for `padding`, just like `np.pad` or any of the other PyTorch padding ops that take separate left and right padding values. That approach would be more consistent, more flexible, and would eliminate the need for `output_padding` entirely.
Admittedly, I could just manually extract the parts of the tensor I need and ignore `output_padding` entirely, but I’m hoping that I just missed something and that there’s a better way to get the result I’m looking for.
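For the record, the manual-extraction workaround would look something like this sketch (assuming the slicing equivalence above holds): skip the layer’s `padding`/`output_padding` arguments and crop the output yourself, keeping the extra element on whichever side you want.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1, 1, 100)  # arbitrary input length for illustration

# Transposed conv with no built-in padding; we crop manually instead.
deconv = torch.nn.ConvTranspose1d(1, 1, kernel_size=30, stride=15)
y = deconv(x)

padding, output_padding = 8, 1
# PyTorch behavior: the output_padding element survives on the right.
y_right = y[:, :, padding:-(padding - output_padding)]
# Desired (TF-style) behavior: keep the extra element on the left.
y_left = y[:, :, (padding - output_padding):-padding]

print(y_right.shape, y_left.shape)  # both torch.Size([1, 1, 1500])
```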
(BTW, this post was motivated by the fact that TensorFlow applies the equivalent of `output_padding` on the left, whereas PyTorch applies it on the right. This difference prevents direct model conversion between TF and PyTorch.)
I see, thanks for the code snippet (I assume the conv layer should be `nn.ConvTranspose1d`?).
I don’t think you can easily change it; you might instead have to flip the inputs and kernel if you compare the TF and PyTorch implementations.
I’m not sure if TF flips the kernel anyway, but I think Theano flipped it internally to use a convolution instead of a cross-correlation.
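As an aside, the flip relationship is easy to check: PyTorch’s `conv1d` computes a cross-correlation, and flipping the kernel turns it into a true convolution, which can be compared against `np.convolve` (the tensor sizes here are arbitrary).

```python
import numpy as np
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 20)
w = torch.randn(1, 1, 5)

# F.conv1d is a cross-correlation; flipping the kernel along its last
# dimension yields a true convolution, matching np.convolve in 'valid' mode.
xcorr_flipped = F.conv1d(x, w.flip(-1)).squeeze().numpy()
true_conv = np.convolve(x.squeeze().numpy(), w.squeeze().numpy(), mode="valid")

print(np.allclose(xcorr_flipped, true_conv, atol=1e-5))  # expect True
```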
Thanks for the response. Yes, you’re right – those should’ve been `nn.ConvTranspose1d`. TF does cross-correlation, same as PyTorch. I’ve confirmed by writing some framework-independent C++ code that the only delta between the two is their preference for which side to remove more elements from.