Activation function in output layer of autoencoders

Do we need to use an activation function on the final decoding layer of an autoencoder?

It depends on the loss function you are using. If the loss takes logits as input, it most likely applies the appropriate nonlinearity internally, so a plain linear layer is enough as your decoder output.
If you use a custom loss, you may have to apply an activation function yourself.
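
To make the "loss takes logits" case concrete, here is a minimal sketch in PyTorch. It assumes inputs scaled to [0, 1] and uses `nn.BCEWithLogitsLoss` as one example of a loss that expects logits; the layer sizes and the names `decoder_out`, `code`, and `target` are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical final decoding layer: purely linear, no activation.
decoder_out = nn.Linear(8, 16)

code = torch.randn(4, 8)     # stand-in for latent codes from an encoder
target = torch.rand(4, 16)   # stand-in for inputs scaled to [0, 1]

logits = decoder_out(code)

# Loss that takes logits: the sigmoid is applied inside the loss.
loss_from_logits = nn.BCEWithLogitsLoss()(logits, target)

# Loss that takes probabilities: the sigmoid has to be applied explicitly.
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), target)

print(loss_from_logits.item(), loss_from_probs.item())
```

The two values should agree up to floating-point error, which is why the linear output is enough when the loss handles the nonlinearity itself.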

What if I'm using an autoencoder with MSE loss?
Should I use linear activation for the final decoding layer?

Since you are using MSE, you are probably dealing with a regression task (predicting real-valued output features). In that case, you can use just a linear layer as the output layer.
The model will learn to produce values according to the targets you provide to it.
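
As a rough sketch of that setup, the following toy autoencoder ends in a plain `nn.Linear` layer and is trained with `nn.MSELoss`. The class name `AutoEncoder`, the layer sizes, the random data, and the optimizer settings are all assumptions chosen only to keep the example small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class AutoEncoder(nn.Module):
    def __init__(self, n_features=20, n_latent=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, n_latent), nn.ReLU())
        # Final decoding layer is linear: outputs are unbounded real values.
        self.decoder = nn.Linear(n_latent, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

x = torch.randn(64, 20)  # made-up real-valued data, not a real dataset

for _ in range(100):
    optimizer.zero_grad()
    reconstruction = model(x)            # same shape as the input
    loss = criterion(reconstruction, x)  # the target is the input itself
    loss.backward()
    optimizer.step()

print(loss.item())
```

If your inputs were instead normalized to a fixed range, a bounded activation on the last layer could also be reasonable; the linear layer is simply the default choice for unbounded targets.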

Just to be clear: if you were dealing with a classification task, in principle you would use a softmax activation to restrict the output to a probability distribution and then pick the most probable class as the prediction. But since softmax is already implemented inside CrossEntropyLoss (used for classification tasks), you would want to use only a linear layer in that case as well.
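
A quick way to check this yourself, sketched with random logits and made-up labels: `nn.CrossEntropyLoss` applied to raw logits matches a log-softmax followed by the negative log-likelihood loss, and the predicted class is the same whether you take the argmax of the logits or of the softmax probabilities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(5, 3)               # raw outputs of a linear layer
labels = torch.tensor([0, 2, 1, 1, 0])   # made-up class indices

# CrossEntropyLoss takes logits and applies log-softmax internally.
ce = nn.CrossEntropyLoss()(logits, labels)

# The same computation written out by hand.
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Softmax preserves ordering, so the predicted class does not change.
pred_from_logits = logits.argmax(dim=1)
pred_from_probs = F.softmax(logits, dim=1).argmax(dim=1)

print(ce.item(), manual.item())
```

So the softmax is only needed at inference time if you want actual probabilities; for training with CrossEntropyLoss the linear output is passed in directly.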

Thanks for explaining