Error in Autoencoder

import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid()
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

The error occurs at loss_train = criterion(output_train, y_train.long()):

RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1

Since you are passing the targets as LongTensors, I assume you are using nn.CrossEntropyLoss.
Also based on the posted architecture it seems you are working on a multi-class segmentation use case.
If that’s the case, the model output is expected to contain logits (so remove the sigmoid) in the shape [batch_size, nb_classes, height, width], while the target should be a LongTensor in the shape [batch_size, height, width] containing the class indices in the range [0, nb_classes-1].
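For example, here is a minimal shape sketch for nn.CrossEntropyLoss in a segmentation setting (the batch size, nb_classes=3, and 24x24 spatial size are illustrative assumptions):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 3, 24, 24)         # [batch_size, nb_classes, height, width], raw logits
target = torch.randint(0, 3, (4, 24, 24))  # [batch_size, height, width], class indices in [0, nb_classes-1]
loss = criterion(logits, target)           # works; note no sigmoid on the logits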

Hi @ptrblck, it’s still the same error with this code:

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

    x_train, y_train = Variable(train_x), Variable(train_y)
    x_val, y_val = Variable(val_x), Variable(val_y)

    # move tensors to the GPU if one is available
    if torch.cuda.is_available():
        x_train = x_train.cuda()
        y_train = y_train.cuda()
        x_val = x_val.cuda()
        y_val = y_val.cuda()

    # clearing the gradients of the model parameters
    optimizer.zero_grad()

    # predictions for the training and validation sets
    output_train = model(x_train.float())
    output_val = model(x_val.float())

    # computing the training loss
    loss_train = criterion(output_train, y_train.long())

help me plzz :pensive:

Are x_train, y_train, x_val, y_val batched inputs or only a single sample? That may explain the shape/batch error.
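If they are single samples, adding a batch dimension first may already fix it (a one-line sketch, assuming inputs shaped [channels, height, width]):

x_train = x_train.unsqueeze(0)  # [C, H, W] -> [1, C, H, W]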

Also, is this an image autoencoder? If it is, then:

  1. Normalize your inputs.
  2. There is no need to pass the targets as long.
  3. Use MSELoss() as the criterion (see the sketch below).
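A minimal sketch of points 2 and 3, assuming x_train and y_train are image tensors already scaled to [0, 1]:

import torch.nn as nn

criterion = nn.MSELoss()
output_train = model(x_train.float())
loss_train = criterion(output_train, y_train.float())  # float targets, no .long()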

x_train is the input, and y_train is the output.

How do I normalize my input?

Here is a link to a beginner-friendly Image Autoencoder template that I have written. The same concept can be applied to any image feature-extracting autoencoder. Please feel free to PM me if you have any doubts about what the code does; I’ll be happy to help.
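As a quick sketch, two common normalization options (the mean/std values below are the usual MNIST statistics and are an assumption here):

from torchvision import transforms

# Option 1: simple min-max scaling of raw pixel values to [0, 1]
x_train = x_train.float() / 255.0

# Option 2: torchvision transforms with per-channel mean/std
transform = transforms.Compose([
    transforms.ToTensor(),                       # scales pixels to [0, 1]
    transforms.Normalize((0.1307,), (0.3081,)),  # MNIST mean and std
])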

Regards,

RuntimeError: Calculated padded input size per channel: (1 x 1). Kernel size: (3 x 3). Kernel size can’t be greater than actual input size

That would be because your inputs are too small. Since that notebook is based on MNIST, the minimum input size needs to be 28x28.
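You can verify this by passing a dummy input through the encoder of the Autoencoder posted above (a sketch; model is an instance of that class):

import torch

model = Autoencoder()
x = torch.randn(1, 1, 28, 28)    # minimum MNIST-sized input
print(model.encoder(x).shape)    # torch.Size([1, 64, 5, 5])
# Much smaller inputs shrink below the 3x3 kernel after the two
# stride-2 convolutions, which raises the error above.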

Hi @pchandrasekaran, when I tested this GAN code,

class DiscriminatorNet(torch.nn.Module):
    """
    A three hidden-layer discriminative neural network
    """
    def __init__(self):
        super(DiscriminatorNet, self).__init__()
        n_features = 40
        n_out = 2
        
        self.hidden0 = nn.Sequential( 
            nn.Linear(n_features, 1024),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3)
        )
        self.hidden1 = nn.Sequential(
            nn.Linear(1024, 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3)
        )
        self.hidden2 = nn.Sequential(
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3)
        )
        self.out = nn.Sequential(
            torch.nn.Linear(256, n_out),
            torch.nn.Sigmoid()
        )

    def forward(self, x):
        x = self.hidden0(x)
        x = self.hidden1(x)
        x = self.hidden2(x)
        x = self.out(x)
        return x

help me plzz

Now, I’m making an assumption here, as I have limited information: you want a single output in the range (0, 1) and are tackling a binary classification problem.

  1. Change n_out to 1 and use nn.BCELoss(). [If n_out is 2, use nn.NLLLoss(); some extra changes are needed, so leave that for now.]
  2. If n_out=1, you’ll need to binarize the output from the network in order to use it with sklearn’s accuracy_score, since a sigmoided output is going to be a float in the interval (0, 1). You can do that by:

     threshold = 0.5
     network_output[network_output > threshold] = 1
     network_output[network_output <= threshold] = 0
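Putting both points together, a minimal sketch (the dummy batch and label tensors below are illustrative, and it assumes n_out has been changed to 1 as in point 1):

import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score

discriminator = DiscriminatorNet()       # with n_out = 1
criterion = nn.BCELoss()

x = torch.randn(8, 40)                   # dummy batch: 8 samples, 40 features
targets = torch.randint(0, 2, (8,))      # dummy binary labels

output = discriminator(x)                # shape [8, 1], values in (0, 1)
loss = criterion(output, targets.float().view(-1, 1))

preds = (output > 0.5).float().view(-1)  # binarize and flatten, as above
acc = accuracy_score(targets.numpy(), preds.detach().numpy())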


@pchandrasekaran @ptrblck
RuntimeError: The size of tensor a (16) must match the size of tensor b (5177) at non-singleton dimension 3

with this code

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

# clearing the gradients of the model parameters
optimizer.zero_grad()

# predictions for the training and validation sets
output_train = model(x_train.float())
output_val = model(x_val.float())

# computing the training and validation loss
loss_train = criterion(output_train, y_train.long())
loss_val = criterion(output_val, y_val.long())

Your model works correctly with random inputs, so I guess the shape mismatch is raised in the loss calculation. In that case you would have to check the shapes of the model output and the target tensor and make sure they have the expected shapes.
I don’t know which criterion you are using, but the docs explain the expected shapes for each criterion.
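A quick way to do that is to print both shapes right before the loss call:

output_train = model(x_train.float())
print(output_train.shape, y_train.shape)  # both must match the criterion's expected shapes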