Output padding must be smaller than either stride or dilation

I’m trying to deconvolve a 3D image. I want the z-axis to remain the same while increasing the x and y dimensions by a factor of 2, so effectively just doubling the image size.

When I convolve and compress an 8x96x96 volume (a batch of 64 images) like so:

nn.Conv3d(nc, 32, 3, (1,2,2), 1)  # (in_channels, out_channels, kernel_size, stride, padding)
nn.Conv3d(32, 32, 3, (1,2,2), 1)

I get tensors of the following sizes (respectively):

torch.Size([64, 32, 8, 48, 48])
torch.Size([64, 32, 8, 24, 24])
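
For reference, these shapes can be reproduced with a minimal snippet (nc = 1 is an assumption; the number of input channels isn’t stated above):

import torch
import torch.nn as nn

nc = 1  # assumed input channel count
x = torch.randn(64, nc, 8, 96, 96)
down1 = nn.Conv3d(nc, 32, 3, (1, 2, 2), 1)
down2 = nn.Conv3d(32, 32, 3, (1, 2, 2), 1)
h = down1(x)
print(h.shape)          # torch.Size([64, 32, 8, 48, 48])
print(down2(h).shape)   # torch.Size([64, 32, 8, 24, 24])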

When I deconvolve the compressed version of this, which has a spatial size of 1x3x3, like so:

nn.ConvTranspose3d(256, 64, 3, 2, padding=1, output_padding=1)   
nn.ConvTranspose3d(64, 64, 3, 2, padding=1, output_padding=1)

I get tensors of sizes (respectively):

torch.Size([64, 64, 2, 6, 6])
torch.Size([64, 64, 4, 12, 12])
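
These follow from the transposed-convolution output formula out = (in − 1) × stride − 2 × padding + kernel_size + output_padding: for the first layer, x and y give (3 − 1) × 2 − 2 + 3 + 1 = 6, while z gives (1 − 1) × 2 − 2 + 3 + 1 = 2.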

However, and this is where I’m stuck: I want to keep the z-axis constant while still upsampling x and y by a factor of 2 in my deconvolution. I’m trying this like so:

nn.ConvTranspose3d(32, 32, 3, (1,2,2), padding=1, output_padding=1)

When I try this, I get the error in the title. Is there a way I can decompress my x and y dimensions while keeping the z dimension constant (preferably using stride and not a pooling layer)?
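
For reference, this minimal snippet reproduces the error (the input shape is the encoder output from above):

x = torch.randn(64, 32, 8, 24, 24)
conv = nn.ConvTranspose3d(32, 32, 3, (1, 2, 2), padding=1, output_padding=1)
output = conv(x)  # raises: output padding must be smaller than either stride or dilation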

The error is raised because output_padding must be smaller than the stride (or dilation) along each dimension, and with a stride of 1 in the z dimension, output_padding=1 violates that constraint. This code might help:

x = torch.randn(1, 256, 1, 3, 3)
conv = nn.ConvTranspose3d(in_channels=256,
                          out_channels=64,
                          kernel_size=(1, 4, 4),
                          stride=(1, 2, 2),
                          padding=(0, 1, 1),
                          output_padding=0)
output = conv(x)
print(output.shape)
> torch.Size([1, 64, 1, 6, 6])
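
Alternatively, if you’d like to keep kernel_size=3, a per-dimension output_padding should also work, since output_padding only needs to be smaller than the stride along each axis:

x = torch.randn(64, 32, 8, 24, 24)
conv = nn.ConvTranspose3d(in_channels=32,
                          out_channels=32,
                          kernel_size=3,
                          stride=(1, 2, 2),
                          padding=1,
                          output_padding=(0, 1, 1))
print(conv(x).shape)
> torch.Size([64, 32, 8, 48, 48])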

Thanks a lot for your help @ptrblck. I just posted a question in the main posting channel about efficiently loading multiple .npz files into a DataLoader. Maybe you have some more intuition on that?