Autoencoder for classification

Hi. I'm implementing an autoencoder and an FC (fully connected) network in parallel to make predictions on a dataset.
This is the training code I'm using:

```python
for epoch in range(epochs):
    loss = 0
    if __name__ == '__main__':
        for (batch_features, _) in train_loader:
            # reshape mini-batch data to a [N, 2105] matrix
            # and load it onto the active device
            batch_features = batch_features.view(-1, 2105).to(device)
            #batch_features = torch.tensor(batch_features)

            # reset the gradients back to zero;
            # PyTorch accumulates gradients on subsequent backward passes
            optimizer.zero_grad()

            # compute reconstructions
            outputs = model(batch_features)

            # compute training reconstruction loss
            train_loss = criterion(outputs, batch_features)

            # compute accumulated gradients
            train_loss.backward()

            # perform parameter update based on current gradients
            optimizer.step()

            # add the mini-batch training loss to the epoch loss
            loss += train_loss.item()

    # compute the average epoch training loss
    loss = loss / len(train_loader)

    # display the epoch training loss
    print("epoch : {}/{}, loss = {:.6f}".format(epoch + 1, epochs, loss))
```

I get the following error on the line `train_loss = criterion(outputs, batch_features)`: `'tuple' object has no attribute 'size'`.
Could someone help me fix it?

Can you print this out right before the problematic line:

```python
print(type(outputs))
```

If that's a tuple, you need to make it into a Tensor before passing it to the criterion.

Thank you for your answer, Andrei. Yes, it seems to be a tuple, so I tried `outputs = torch.Tensor(outputs)`,
but I get the error `only one element tensors can be converted to Python scalars`. Is there a more appropriate way to transform it into a Tensor?

Sounds like outputs is a tuple of Tensors. I can’t give a generic correct answer without knowing what exactly your model outputs, since that’s a function of your model design.

Can you share the forward function of your model so we can see what the output consists of?

Mechanically, something like this might work, but we should first be sure that we understand what `outputs` really is before hammering it in:

```python
outputs = torch.stack(outputs)
```
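
For reference, a quick illustration of what `torch.stack` does with a tuple of tensors; note that it only works when every tensor in the tuple has the same shape (the sizes below are made up for illustration):

```python
import torch

# hypothetical tuple of two tensors with identical shapes
outputs = (torch.randn(4, 2105), torch.randn(4, 2105))

stacked = torch.stack(outputs)
print(stacked.shape)  # torch.Size([2, 4, 2105])
```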

The forward function is:

```python
def forward(self, input):
    activation = self.encoder_hidden_layer(input)
    activation = torch.relu(activation)
    code = self.encoder_output_layer(activation)
    code = torch.relu(code)
    x = self.decoder_hidden_layer(code)
    x = torch.relu(x)
    x = self.decoder_output_layer(x)
    x = torch.relu(x)
    y = F.relu(self.fc1(code))
    y = F.relu(self.fc2(y))
    y = torch.sigmoid(self.fc3(y))
    return x, y
```

`x` represents the reconstructed input data, while `y` should be the predicted output (computed in parallel with `x`).
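
For reference, a minimal sketch of how a module with these two heads might be defined; the class name and the hidden/code/output sizes are hypothetical placeholders, only the layer names match the forward above:

```python
import torch                      # used by torch.relu in the forward above
import torch.nn as nn
import torch.nn.functional as F   # used by F.relu in the forward above

class AEClassifier(nn.Module):    # hypothetical class name
    def __init__(self, input_dim=2105, hidden_dim=512, code_dim=128, n_classes=6):
        super().__init__()
        # encoder
        self.encoder_hidden_layer = nn.Linear(input_dim, hidden_dim)
        self.encoder_output_layer = nn.Linear(hidden_dim, code_dim)
        # decoder: reconstructs the input from the code
        self.decoder_hidden_layer = nn.Linear(code_dim, hidden_dim)
        self.decoder_output_layer = nn.Linear(hidden_dim, input_dim)
        # FC classification head on top of the code
        self.fc1 = nn.Linear(code_dim, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, n_classes)

    # forward(self, input) is the method shown above, returning (x, y)
```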

Ok, so in that case I think you want to take only the first element for the loss function, which is the `x` inside your forward function, since that's what should be the reconstructed input data:

```python
train_loss = criterion(outputs[0], batch_features)
```
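
Equivalently, you could unpack the two outputs where you call the model, which makes the intent a bit clearer (a sketch using the same variable names as your training loop):

```python
# the model returns (reconstruction, prediction)
reconstruction, prediction = model(batch_features)

# the reconstruction loss is computed against the input itself
train_loss = criterion(reconstruction, batch_features)
```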

By the way, you can put three backticks around your code to format it in a more readable way.

Now the code works, thank you very much!

Hello, now I want to make predictions (y) and use a softmax function (instead of the sigmoid) as the last layer, like this:

```python
y = F.softmax(self.fc3(y), dim=1)
```

It returns 6 probability values. How can I also take the second element (y) into account in the loss function and combine it with the first one? I tried:

```python
train_loss1 = criterion(outputs[0], batch_features)
train_loss2 = criterion(outputs[1], labels)
train_loss = train_loss1 + train_loss2
```

But the problem is that labels has dimension 961 while outputs[1] only has 6. How can I fix that?

Hi, it should return a tensor shaped like n_batch x n_labels, where the values for each sample sum to 1. labels should be shaped like n_batch. If you're using something like CrossEntropyLoss, that should work. However, please note that CrossEntropyLoss does the softmax for you; all you need to return is the raw values, without the softmax:

```python
y = self.fc3(y)
```

Also, you may want to use different loss functions (criteria) for the reconstruction loss and the classification loss.
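
For example, here is a minimal sketch of how the two losses could be combined, assuming your train_loader also yields the labels and that model, optimizer, and device are defined as in your training loop (the 0.5 weight is just an arbitrary example):

```python
import torch.nn as nn

reconstruction_criterion = nn.MSELoss()            # for the reconstructed input
classification_criterion = nn.CrossEntropyLoss()   # applies log-softmax internally

for (batch_features, labels) in train_loader:
    batch_features = batch_features.view(-1, 2105).to(device)
    labels = labels.to(device)                     # shape [N], class indices 0..5

    optimizer.zero_grad()
    reconstruction, logits = model(batch_features) # logits = raw fc3 output, no softmax

    loss_recon = reconstruction_criterion(reconstruction, batch_features)
    loss_class = classification_criterion(logits, labels)

    # weighted sum of the two objectives; the weight is a tunable hyperparameter
    train_loss = loss_recon + 0.5 * loss_class
    train_loss.backward()
    optimizer.step()
```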

Yes, sounds good, I'll try it. Thanks a lot!