Error in Autoencoder

import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid()
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

The error occurs at loss_train = criterion(output_train, y_train.long()):

RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1

Since you are passing the targets as LongTensors, I assume you are using nn.CrossEntropyLoss.
Also based on the posted architecture it seems you are working on a multi-class segmentation use case.
If that’s the case, the model output is expected to contain logits (so remove the sigmoid) in the shape [batch_size, nb_classes, height, width], while the target should be a LongTensor in the shape [batch_size, height, width] containing the class indices in the range [0, nb_classes-1].
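For example, here is a minimal shape sketch for nn.CrossEntropyLoss in a segmentation setting (the batch size, nb_classes=3, and 24x24 spatial size are illustrative assumptions):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 3, 24, 24)         # [batch_size, nb_classes, height, width], raw logits
target = torch.randint(0, 3, (4, 24, 24))  # [batch_size, height, width], class indices in [0, nb_classes-1]
loss = criterion(logits, target)           # works; note no sigmoid on the logits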

Hi @ptrblck, it’s still the same error with this code:

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

    x_train, y_train = Variable(train_x), Variable(train_y)
    x_val, y_val = Variable(val_x), Variable(val_y)

    # move tensors to the GPU if one is available
    if torch.cuda.is_available():
        x_train = x_train.cuda()
        y_train = y_train.cuda()
        x_val = x_val.cuda()
        y_val = y_val.cuda()

    # clearing the gradients of the model parameters
    optimizer.zero_grad()

    # predictions for the training and validation sets
    output_train = model(x_train.float())
    output_val = model(x_val.float())

    # computing the training loss
    loss_train = criterion(output_train, y_train.long())

help me plzz :pensive:

Are x_train, y_train, x_val, y_val batched inputs or only a single sample? That may explain the shape/batch error.
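If they are single samples, adding a batch dimension first may already fix it (a one-line sketch, assuming inputs shaped [channels, height, width]):

x_train = x_train.unsqueeze(0)  # [C, H, W] -> [1, C, H, W]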

Also, is this an image autoencoder? If it is, then:

  1. Normalize your inputs.
  2. There is no need to pass the targets as long.
  3. Use MSELoss() as the criterion (see the sketch below).
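A minimal sketch of points 2 and 3, assuming x_train and y_train are image tensors already scaled to [0, 1]:

import torch.nn as nn

criterion = nn.MSELoss()
output_train = model(x_train.float())
loss_train = criterion(output_train, y_train.float())  # float targets, no .long()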

x_train is the input, and y_train is the output.

How do I normalize my input?

Here is a link to a beginner-friendly Image Autoencoder template that I have written. The same concept can be applied to any image feature-extracting autoencoder. Please feel free to PM me if you have any doubts about what the code does; I’ll be happy to help.
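As a quick sketch, two common normalization options (the mean/std values below are the usual MNIST statistics and are an assumption here):

from torchvision import transforms

# Option 1: simple min-max scaling of raw pixel values to [0, 1]
x_train = x_train.float() / 255.0

# Option 2: torchvision transforms with per-channel mean/std
transform = transforms.Compose([
    transforms.ToTensor(),                       # scales pixels to [0, 1]
    transforms.Normalize((0.1307,), (0.3081,)),  # MNIST mean and std
])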

Regards,

RuntimeError: Calculated padded input size per channel: (1 x 1). Kernel size: (3 x 3). Kernel size can’t be greater than actual input size

That would be because your inputs are too small. Since that notebook is based on MNIST, the minimum input size needs to be 28x28.
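You can verify this by passing a dummy input through the encoder of the Autoencoder posted above (a sketch; model is an instance of that class):

import torch

model = Autoencoder()
x = torch.randn(1, 1, 28, 28)    # minimum MNIST-sized input
print(model.encoder(x).shape)    # torch.Size([1, 64, 5, 5])
# Much smaller inputs shrink below the 3x3 kernel after the two
# stride-2 convolutions, which raises the error above.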

Hi @pchandrasekaran, when I tested this GAN code,

class DiscriminatorNet(torch.nn.Module):
    """
    A three hidden-layer discriminative neural network
    """
    def __init__(self):
        super(DiscriminatorNet, self).__init__()
        n_features = 40
        n_out = 2
        
        self.hidden0 = nn.Sequential( 
            nn.Linear(n_features, 1024),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3)
        )
        self.hidden1 = nn.Sequential(
            nn.Linear(1024, 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3)
        )
        self.hidden2 = nn.Sequential(
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3)
        )
        self.out = nn.Sequential(
            torch.nn.Linear(256, n_out),
            torch.nn.Sigmoid()
        )

    def forward(self, x):
        x = self.hidden0(x)
        x = self.hidden1(x)
        x = self.hidden2(x)
        x = self.out(x)
        return x

help me plzz

Now, I’m making an assumption here, as I have limited information: you want a single output in the range (0, 1) and are tackling a binary classification problem.

  1. Change n_out to 1 and use nn.BCELoss(). [If n_out is 2, use nn.NLLLoss(); some extra changes are needed, so leave that for now.]
  2. If n_out=1, you’ll need to binarize the output from the network in order to use it with sklearn’s accuracy_score, since a sigmoided output is going to be a float in the interval (0, 1). You can do that by:

     threshold = 0.5
     network_output[network_output > threshold] = 1
     network_output[network_output <= threshold] = 0
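Putting both points together, a minimal sketch (the dummy batch and label tensors below are illustrative, and it assumes n_out has been changed to 1 as in point 1):

import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score

discriminator = DiscriminatorNet()       # with n_out = 1
criterion = nn.BCELoss()

x = torch.randn(8, 40)                   # dummy batch: 8 samples, 40 features
targets = torch.randint(0, 2, (8,))      # dummy binary labels

output = discriminator(x)                # shape [8, 1], values in (0, 1)
loss = criterion(output, targets.float().view(-1, 1))

preds = (output > 0.5).float().view(-1)  # binarize and flatten, as above
acc = accuracy_score(targets.numpy(), preds.detach().numpy())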


@pchandrasekaran @ptrblck
RuntimeError: The size of tensor a (16) must match the size of tensor b (5177) at non-singleton dimension 3

with this code

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

# clearing the gradients of the model parameters
optimizer.zero_grad()

# predictions for the training and validation sets
output_train = model(x_train.float())
output_val = model(x_val.float())

# computing the training and validation loss
loss_train = criterion(output_train, y_train.long())
loss_val = criterion(output_val, y_val.long())

Your model works correctly with random inputs, so I guess the shape mismatch is raised in the loss calculation. In that case you would have to check the shapes of the model output and the target tensor and make sure they have the expected shapes.
I don’t know which criterion you are using, but the docs explain the expected shapes for each criterion.
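A quick way to do that is to print both shapes right before the loss call:

output_train = model(x_train.float())
print(output_train.shape, y_train.shape)  # both must match the criterion's expected shapes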