Output_size when using nn.Sequential

Hi! In my network, I am trying to upsample from an encoded latent space. So far I have been using nearest-neighbor upsampling, but I want to switch to transposed convolutions, as this may help my training.

The way I wish to define my upsampling operation, to keep it consistent with the rest of my code, is as follows:

from collections import OrderedDict

import torch.nn as nn

# padding, normalization and nonlinearity are defined elsewhere in my code
a = nn.Sequential(
    OrderedDict(
        [
            (
                'test' + 'conv1',
                nn.ConvTranspose3d(
                    in_channels=1,
                    out_channels=1,
                    kernel_size=3,
                    stride=2,
                    padding=padding,
                    bias=False,
                ),
            ),
            ('test' + 'normalization1', normalization),
            ('test' + 'nonlinearity1', nonlinearity),
        ]
    )
)

A typical input to the upsampling operation has shape (1, 1, 5, 6, 5), and the desired output shape is (1, 1, 10, 12, 10). However, because several input sizes map to the same output size under a strided convolution, the transposed convolution is ambiguous, and the output I actually get has shape (1, 1, 9, 11, 9).
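For reference, the mismatch follows from PyTorch's documented output-size formula for transposed convolutions, out = (in - 1) * stride - 2 * padding + kernel_size + output_padding. A quick check, assuming padding=1 (which reproduces the shapes above):

```python
import torch
import torch.nn as nn

# With in=5, stride=2, padding=1, kernel_size=3 and output_padding=0:
# (5 - 1) * 2 - 2 * 1 + 3 + 0 = 9, not the desired 10.
conv = nn.ConvTranspose3d(in_channels=1, out_channels=1, kernel_size=3,
                          stride=2, padding=1, bias=False)
x = torch.randn(1, 1, 5, 6, 5)
out = conv(x)
print(out.shape)  # torch.Size([1, 1, 9, 11, 9])
```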

Now comes my question: I read that I can pass an output_size=desired_shape argument to the forward call of the ConvTranspose to get the right dimensions. This works when I call it on the ConvTranspose3d module alone, but it raises a TypeError: forward() got an unexpected keyword argument 'output_size' when I try using it on the Sequential block defined above. So, does anybody know if and how I could fix this issue, while still retaining my Sequential block approach? Cheers!

Does anybody have any ideas regarding this? Sorry, but it has been two weeks since my original post and I am still struggling with this. Cheers!

If the output_size argument is fixed for each layer, you could create a custom module and set this argument during its initialization:

import torch
import torch.nn as nn

class MyConvTranspose3d(nn.Module):
    def __init__(self, conv, output_size=None):
        super().__init__()
        self.conv = conv
        self.output_size = output_size

    def forward(self, x):
        # Forward the fixed output_size to the wrapped transposed convolution
        x = self.conv(x, output_size=self.output_size)
        return x


model = nn.Sequential(
    MyConvTranspose3d(
        nn.ConvTranspose3d(
            in_channels=1,
            out_channels=1,
            kernel_size=3,
            stride=2,
            padding=1,
            bias=False,
        ),
        output_size=(10, 12, 10)
    )
)

x = torch.randn(1, 1, 5, 6, 5)
out = model(x)
print(out.shape)
> torch.Size([1, 1, 10, 12, 10])

Otherwise, if this argument depends on the input shape, I would recommend writing a custom module rather than using an nn.Sequential container, for the needed flexibility.
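A minimal sketch of that idea, assuming the goal is always to double each spatial dimension (the module name and the doubling rule are assumptions on my part):

```python
import torch
import torch.nn as nn

class Upsample3d(nn.Module):
    """Hypothetical module that doubles the spatial dimensions of its input
    by computing a per-call output_size for ConvTranspose3d."""
    def __init__(self):
        super().__init__()
        self.conv = nn.ConvTranspose3d(in_channels=1, out_channels=1,
                                       kernel_size=3, stride=2,
                                       padding=1, bias=False)

    def forward(self, x):
        # Derive the target shape from the input: double each spatial dim.
        target = [2 * s for s in x.shape[2:]]
        return self.conv(x, output_size=target)

out = Upsample3d()(torch.randn(1, 1, 5, 6, 5))
print(out.shape)  # torch.Size([1, 1, 10, 12, 10])
```

This way the same module handles any input size, instead of hard-coding (10, 12, 10) at initialization.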