Autoencoder - help with nn.Sequential

Morning all,

I am hoping that someone will be able to help me as I am at the end of my tether…

I am trying to write the autoencoder for my network. Having hit a problem, I started experimenting with a known working decoder (shown below):
class Decoder(nn.Module):

    def __init__(self, z_dim):
        super(Decoder, self).__init__()
        self.fc1 = nn.Linear(z_dim, 672)
        self.conv1 = nn.ConvTranspose1d(32, 32, 8, 2, padding=3)
        self.conv2 = nn.ConvTranspose1d(32, 32, 8, 2, padding=3)
        self.conv3 = nn.ConvTranspose1d(32, 16, 8, 2, padding=3)
        self.conv4 = nn.ConvTranspose1d(16, 16, 8, 2, padding=3)
        self.conv5 = nn.ConvTranspose1d(16, 1, 7, 1, padding=3)
        self.bn1 = nn.BatchNorm1d(32)
        self.bn2 = nn.BatchNorm1d(32)
        self.bn3 = nn.BatchNorm1d(16)
        self.bn4 = nn.BatchNorm1d(16)
        self.relu = nn.ReLU()

    def forward(self, z):
        z = self.relu(self.fc1(z))
        print(z)
        print(z.shape)
        #z = F.dropout(z, 0.3)
        z = z.view(-1, 32, 21)
        z = self.relu(self.conv1(z))
        z = self.bn1(z)
        #z = F.dropout(z, 0.3)
        z = self.relu(self.conv2(z))
        z = self.bn2(z)
        #z = F.dropout(z, 0.3)
        z = self.relu(self.conv3(z))
        z = self.bn3(z)
        #z = F.dropout(z, 0.3)
        z = self.relu(self.conv4(z))
        z = self.bn4(z)
        #z = F.dropout(z, 0.3)
        z = self.conv5(z)
        recon = torch.sigmoid(z)
        return recon

summary(vae.decoder, (1, 2))

This gives the following results:

tensor([[[0.0000, 0.3415, 0.6484,  ..., 0.2840, 0.0000, 0.3873]],

        [[0.0000, 0.4126, 0.6034,  ..., 0.2224, 0.0000, 0.3813]]],
       device='cuda:0', grad_fn=<ReluBackward0>)
torch.Size([2, 1, 672])
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1               [-1, 1, 672]           2,016
              ReLU-2               [-1, 1, 672]               0
   ConvTranspose1d-3               [-1, 32, 42]           8,224
              ReLU-4               [-1, 32, 42]               0
       BatchNorm1d-5               [-1, 32, 42]              64
   ConvTranspose1d-6               [-1, 32, 84]           8,224
              ReLU-7               [-1, 32, 84]               0
       BatchNorm1d-8               [-1, 32, 84]              64
   ConvTranspose1d-9              [-1, 16, 168]           4,112
             ReLU-10              [-1, 16, 168]               0
      BatchNorm1d-11              [-1, 16, 168]              32
  ConvTranspose1d-12              [-1, 16, 336]           2,064
             ReLU-13              [-1, 16, 336]               0
      BatchNorm1d-14              [-1, 16, 336]              32
  ConvTranspose1d-15               [-1, 1, 336]             113
================================================================
Total params: 24,945
Trainable params: 24,945
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.29
Params size (MB): 0.10
Estimated Total Size (MB): 0.38
----------------------------------------------------------------
This is as expected.
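
For reference, the lengths in the summary (21 → 42 → 84 → 168 → 336 → 336) are consistent with the output-length formula given in the PyTorch docs for ConvTranspose1d, taking dilation 1 and no output_padding: L_out = (L_in - 1) * stride - 2 * padding + kernel_size. A minimal sketch in plain Python; the helper name is made up for illustration:

```python
def conv_transpose1d_out_len(l_in, kernel_size, stride, padding):
    """Output length of nn.ConvTranspose1d, assuming dilation=1 and output_padding=0."""
    return (l_in - 1) * stride - 2 * padding + kernel_size

# (kernel_size, stride, padding) for conv1..conv5 in the decoder above
layers = [(8, 2, 3), (8, 2, 3), (8, 2, 3), (8, 2, 3), (7, 1, 3)]

length = 672 // 32  # fc1 output viewed as 32 channels -> 21 samples per channel
lengths = [length]
for kernel_size, stride, padding in layers:
    length = conv_transpose1d_out_len(length, kernel_size, stride, padding)
    lengths.append(length)

print(lengths)  # [21, 42, 84, 168, 336, 336]
```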

However, after changing the decoder so that the layers are wrapped in an nn.Sequential module, the summary no longer works (see below).

class Decoder(nn.Module):
    def __init__(self, z_dim):
        super(Decoder, self).__init__()
        self.decode = nn.Sequential(OrderedDict([
            ('fc1',nn.Linear(z_dim, 672)),
            ('conv1',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('conv2',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('conv3',nn.ConvTranspose1d(32, 16, 8, 2, padding=3)),
            ('conv4',nn.ConvTranspose1d(16, 16, 8, 2, padding=3)),
            ('conv5',nn.ConvTranspose1d(16, 1, 7, 1, padding=3)),
            ('bn1',nn.BatchNorm1d(32)),
            ('bn2',nn.BatchNorm1d(32)),
            ('bn3',nn.BatchNorm1d(16)),
            ('bn4',nn.BatchNorm1d(16)),
            ('relu',nn.ReLU())
        ]))
        
        
    def forward(self, z):
        z = self.decode.relu(self.decode.fc1(z))
        print(z)
        print(z.shape)
        z = z.view(-1, 32, 31)
        z = self.decode(z)
        #z = self.relu(self.fc1(z))
        #z = F.dropout(z, 0.3)
        recon = torch.sigmoid(z)
        return recon
summary(vae.decoder, (1, 2))

This gives the following error:

tensor([[[0.6222, 0.0000, 0.0000,  ..., 0.0180, 0.3997, 0.3615]],

        [[0.7606, 0.0000, 0.0000,  ..., 0.0000, 0.2021, 0.2654]]],
       device='cuda:0', grad_fn=<ReluBackward0>)
torch.Size([2, 1, 672])
Traceback (most recent call last):

  File "G:/decoder-test.py", line 30, in forward
    z = z.view(-1, 32, 31)

RuntimeError: shape '[-1, 32, 31]' is invalid for input of size 1344

I know I have done something silly, but I cannot for the life of me see what (removing the z = self.decode.relu(self.decode.fc1(z)) line just changes the size from 1344 to 4)…

I’m hoping that someone can point me in the correct direction…

Chaslie

Did you mean to do z = z.view(-1, 32, 21)?
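
A quick way to check this: view can only reshape when the total number of elements is preserved. The printed tensor has shape [2, 1, 672], so 1344 elements, and fc1 produces 672 = 32 × 21 values per sample. A small sketch of the arithmetic in plain Python:

```python
batch, features = 2, 672          # shape printed before the view: [2, 1, 672]
numel = batch * features          # 1344, the size reported in the error

# view(-1, 32, 31) needs numel to be a multiple of 32 * 31 = 992
fits_31 = numel % (32 * 31) == 0  # False, hence the RuntimeError
# view(-1, 32, 21) needs a multiple of 32 * 21 = 672
fits_21 = numel % (32 * 21) == 0  # True, and -1 resolves to 1344 // 672 = 2

print(numel, fits_31, fits_21)    # 1344 False True
```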

But I’m not sure the code will work on the next line, where you do z = self.decode(z), anyway. One step at a time :wink:

Hi Olaf,

Thanks for spotting that. Yes, I did mean z.view(-1, 32, 21), but now I have the following error:

output = input.matmul(weight.t())

RuntimeError: size mismatch, m1: [64 x 21], m2: [2 x 672] at C:/w/1/s/tmp_conda_3.7_044431/conda/conda-bld/pytorch_1556686009173/work/aten/src\THC/generic/THCTensorMathBlas.cu:268

Commenting out “z = self.decode.relu(self.decode.fc1(z))” (as I am double counting)

gives this error:

  File "G:/decoder-test.py", line 30, in forward
    z = z.view(-1, 32, 21)

RuntimeError: shape '[-1, 32, 21]' is invalid for input of size 4

I think you should look further into how nn.Sequential works. From my understanding, it is used when you want a series of layers to always execute one after another, and the layers have to be added to the Sequential in that order. Honestly, I think your first code was fine. Why did you want to put it into a Sequential in the first place?

Example of a Sequential, but without the dropout (which you should probably not use if you use batchnorm):

self.decode = nn.Sequential(OrderedDict([
            ('conv1',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('bn1',nn.BatchNorm1d(32)),
            ('conv2',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('bn2',nn.BatchNorm1d(32)),
            ...
        ]))
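
To illustrate the point about ordering: nn.Sequential calls its modules one after another in the order they were added, feeding each output into the next. A minimal sketch (input size chosen only for illustration) showing that it matches calling the same layers by hand:

```python
from collections import OrderedDict

import torch
import torch.nn as nn

torch.manual_seed(0)

block = nn.Sequential(OrderedDict([
    ('conv1', nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
    ('relu1', nn.ReLU()),
    ('bn1', nn.BatchNorm1d(32)),
]))
block.eval()  # BatchNorm uses its running stats, so both calls below are deterministic

x = torch.randn(2, 32, 21)

with torch.no_grad():
    out_seq = block(x)
    # Same modules, called manually in insertion order
    out_manual = block.bn1(block.relu1(block.conv1(x)))

print(out_seq.shape)                        # torch.Size([2, 32, 42])
print(torch.allclose(out_seq, out_manual))  # True
```

As far as I know, Sequential is a convenience container for grouping layers; it does not by itself make the forward pass faster.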

Hi Olaf,

Sorry for the delay. The reason I chose to use Sequential is that I understood it ran quicker and was more efficient…

I’m just trying your suggestion now; I will let you know if it solves the problem.

chaslie

Hi Olaf,

Thanks for the suggestion, but that didn’t get rid of the error.

        self.decode = nn.Sequential(OrderedDict([
            ('fc1',nn.Linear(z_dim, 672)),
            ('relu1',nn.ReLU()),
            ('conv1',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('bn1',nn.BatchNorm1d(32)),
            ('conv2',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('bn2',nn.BatchNorm1d(32)),
            ('conv3',nn.ConvTranspose1d(32, 16, 8, 2, padding=3)),
            ('bn3',nn.BatchNorm1d(32)),
            ('conv4',nn.ConvTranspose1d(16, 16, 8, 2, padding=3)),
            ('bn4',nn.BatchNorm1d(32)),
            ('conv5',nn.ConvTranspose1d(16, 1, 7, 1, padding=3)),
        ]))
        
        
    def forward(self, z):
        z = self.decode.relu1(self.decode.fc1(z))
        print(z)
        print(z.shape)
        z = z.view(-1, 32, 21)
        print(z)
        print(z.shape)
        z = self.decode(z)
        print(z)
        print(z.shape)
        #z = self.relu(self.fc1(z))
        #z = F.dropout(z, 0.3)
        recon = torch.sigmoid(z)
        return recon

gives the following error:

tensor([[[0.3054, 0.0000, 0.0000,  ..., 0.0000, 0.4342, 0.0000]],

        [[0.4361, 0.0000, 0.0000,  ..., 0.0206, 0.5194, 0.0000]]],
       device='cuda:0', grad_fn=<ReluBackward0>)
torch.Size([2, 1, 672])
tensor([[[0.3054, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
         [0.4938, 0.0000, 0.0000,  ..., 0.0000, 0.2481, 0.9124],
         [0.5702, 0.0000, 0.0000,  ..., 0.4131, 0.1147, 0.0842],
         ...,
         [0.0000, 0.1210, 0.5313,  ..., 0.6218, 0.3895, 0.0000],
         [0.0000, 0.0000, 0.0000,  ..., 0.0850, 0.5332, 0.1528],
         [0.2276, 0.0021, 0.0000,  ..., 0.0000, 0.4342, 0.0000]],

        [[0.4361, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
         [0.3129, 0.0000, 0.0000,  ..., 0.0000, 0.3701, 1.3570],
         [0.8684, 0.0000, 0.0000,  ..., 0.2555, 0.4145, 0.0000],
         ...,
         [0.0000, 0.4524, 0.8189,  ..., 0.8206, 0.2301, 0.0000],
         [0.0000, 0.0000, 0.0000,  ..., 0.4690, 0.3480, 0.0000],
         [0.0000, 0.0000, 0.0000,  ..., 0.0206, 0.5194, 0.0000]]],
       device='cuda:0', grad_fn=<ViewBackward>)
torch.Size([2, 32, 21])
Traceback (most recent call last):

RuntimeError: size mismatch, m1: [64 x 21], m2: [2 x 672] at C:/w/1/s/tmp_conda_3.7_044431/conda/conda-bld/pytorch_1556686009173/work/aten/src\THC/generic/THCTensorMathBlas.cu:268
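
The mismatch appears to come from fc1 being part of self.decode. forward applies fc1 and relu1 by hand, reshapes to [2, 32, 21], and then self.decode(z) starts again from fc1, which expects a last dimension of z_dim (2 here, judging by the [2 x 672] in the message) but receives 21. A minimal sketch reproducing this on CPU; the exact error wording differs between PyTorch versions:

```python
import torch
import torch.nn as nn

z_dim = 2  # assumed from the summary call above, which uses input size (1, 2)
fc1 = nn.Linear(z_dim, 672)

z = torch.randn(2, 1, z_dim)
z = torch.relu(fc1(z))       # shape [2, 1, 672]
z = z.view(-1, 32, 21)       # shape [2, 32, 21]

try:
    fc1(z)                   # what self.decode(z) does first: Linear on a last dim of 21
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)
```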

I don’t think you can include a view in your Sequential. I saw a post where they tried to make it into a layer, but I wouldn’t bother. You’re not going to see much of a speedup anyway. I suggest that you do something like this:

import torch
import torch.nn as nn
import torch.nn.functional as F
from collections import OrderedDict

class MyModel(nn.Module):
  def __init__(self, zdim):
    super().__init__()
    self.fc1 = nn.Linear(zdim, 672)
    self.decode = nn.Sequential(OrderedDict([
            ('conv1',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('relu1',nn.ReLU()),
            ('bn1',nn.BatchNorm1d(32)),
            ('conv2',nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('relu2',nn.ReLU()),
            ('bn2',nn.BatchNorm1d(32)),
            ('conv3',nn.ConvTranspose1d(32, 16, 8, 2, padding=3)),
            ('relu3',nn.ReLU()),
            ('bn3',nn.BatchNorm1d(16)),
        ]))
        
        
  def forward(self, z):
    z = F.relu(self.fc1(z))
    print(z)
    print(z.shape)
    z = z.view(-1, 32, 21)
    z = self.decode(z)
    print(z.shape)
    return z  # without this, model(inputs) would return None


zdim = 10
model = MyModel(zdim)
inputs = torch.randn(2, zdim)
outputs = model(inputs)
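
As an aside, if you do want the reshape inside the Sequential, newer PyTorch releases (1.7 and later, as far as I know) ship nn.Unflatten, which can stand in for the view. This is not what was used in this thread, just a sketch assuming such a version is installed; the class name is made up:

```python
from collections import OrderedDict

import torch
import torch.nn as nn


class SequentialDecoder(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, zdim):
        super().__init__()
        self.decode = nn.Sequential(OrderedDict([
            ('fc1', nn.Linear(zdim, 672)),
            ('relu0', nn.ReLU()),
            # Replaces z.view(-1, 32, 21): splits dim 1 (672) into (32, 21)
            ('unflatten', nn.Unflatten(1, (32, 21))),
            ('conv1', nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('relu1', nn.ReLU()),
            ('bn1', nn.BatchNorm1d(32)),
            ('conv2', nn.ConvTranspose1d(32, 32, 8, 2, padding=3)),
            ('relu2', nn.ReLU()),
            ('bn2', nn.BatchNorm1d(32)),
            ('conv3', nn.ConvTranspose1d(32, 16, 8, 2, padding=3)),
            ('relu3', nn.ReLU()),
            ('bn3', nn.BatchNorm1d(16)),
        ]))

    def forward(self, z):
        return self.decode(z)


zdim = 10
model = SequentialDecoder(zdim)
outputs = model(torch.randn(2, zdim))
print(outputs.shape)  # torch.Size([2, 16, 168])
```

Note that this version expects a 2-D input of shape (batch, zdim), as in the example above, because Unflatten splits dimension 1.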

Oli,

Thanks for this, you are a lifesaver. I tried putting the fc1 layer into another Sequential slot, but that didn’t work.

Your solution is neat and works…

Chaslie
