Activation function in output layer of autoencoders

Aritra · April 27, 2020, 4:36am

Do we need to use an activation function on the final decoding layer of an autoencoder?

acobobby · April 27, 2020, 8:22am

It depends on the loss function you are using. If the loss takes logits as input, it most likely applies the appropriate nonlinearity internally, so you can use a plain linear layer as your decoder output.
If you use a custom loss, you may have to apply an activation function yourself.
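
As a rough illustration of a loss that takes logits, here is a minimal sketch assuming PyTorch and targets in [0, 1]. The tensors are made-up stand-ins for the decoder output, not taken from any real model. It compares `nn.BCEWithLogitsLoss` on raw outputs with `nn.BCELoss` on sigmoid outputs:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical raw decoder outputs (logits) and targets in [0, 1].
logits = torch.randn(4, 8)
targets = torch.rand(4, 8)

# Loss that takes logits: the sigmoid is applied inside the loss,
# so the decoder can end with a plain linear layer.
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Loss that takes probabilities: the sigmoid has to be applied explicitly.
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

# The two values should agree up to floating-point error.
print(loss_from_logits.item(), loss_from_probs.item())
```

If the two printed values match, the activation is already covered by the loss and adding a sigmoid to the model as well would apply it twice.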

Aritra · April 27, 2020, 9:57am

What if I’m using an autoencoder with MSE loss?
Should I use linear activation for the final decoding layer?

acobobby · April 27, 2020, 10:21am

Since you are using MSE, you are probably dealing with a regression task (predicting real-valued output features). In that case, a plain linear layer is enough as the output layer.
The model will learn to produce values according to the targets you provide to it.
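
For the MSE case, something like the following sketch could work, assuming PyTorch. The `AutoEncoder` class, the layer sizes, and the random toy data are all hypothetical choices made for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class AutoEncoder(nn.Module):
    """Toy autoencoder; sizes are arbitrary illustration values."""

    def __init__(self, n_features=20, n_hidden=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, n_hidden), nn.ReLU())
        # No activation after the last layer: outputs are unbounded real values.
        self.decoder = nn.Linear(n_hidden, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))


model = AutoEncoder()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Toy real-valued data, not restricted to any fixed range.
x = torch.randn(64, 20) * 3.0

initial_loss = criterion(model(x), x).item()
for _ in range(200):
    optimizer.zero_grad()
    loss = criterion(model(x), x)  # the input is also the target
    loss.backward()
    optimizer.step()
final_loss = criterion(model(x), x).item()

print(initial_loss, final_loss)
```

The decoder ends in `nn.Linear` with nothing after it, so the reconstruction can take any real value, which matches targets that are not confined to a fixed range.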

Just to be clear: if you were dealing with a classification task, you would in principle apply a softmax activation to map your outputs to probabilities and then pick the most probable class as the prediction. But since softmax is already applied inside CrossEntropyLoss (used for classification tasks), you would use only a linear layer in that case as well.
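
A small check of the classification case, again a sketch assuming PyTorch, with made-up logits and labels. It compares `nn.CrossEntropyLoss` on raw outputs with an explicit log-softmax followed by the negative log-likelihood loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical raw outputs of a final linear layer: 5 samples, 3 classes.
logits = torch.randn(5, 3)
labels = torch.tensor([0, 2, 1, 1, 0])

# CrossEntropyLoss expects raw logits and applies log-softmax internally.
loss_ce = nn.CrossEntropyLoss()(logits, labels)

# Same computation written out: log-softmax, then negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Softmax preserves the ordering of the scores, so the predicted class
# can be read directly from the logits.
predicted = logits.argmax(dim=1)
predicted_from_probs = F.softmax(logits, dim=1).argmax(dim=1)

print(loss_ce.item(), loss_manual.item())
```

An explicit softmax is then only needed at inference time if you want the actual probabilities rather than just the predicted class.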

Aritra · April 27, 2020, 10:50am

Thanks for explaining