Autoencoder: how to add sparsity

Hi everyone!

I am implementing a sparse autoencoder.

Briefly, an autoencoder is a feedforward NN formed by a series of layers of decreasing dimension (the encoder), followed by a series of layers of increasing dimension (the decoder). The loss is the MSE between the NN's input and the decoder's output, while the encoder's output is the encoded representation of the input.
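
Something along these lines (simplified PyTorch sketch; the layer sizes are just placeholders, not my actual ones):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        # encoder: layers of decreasing dimension down to the latent code
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim), nn.ReLU(),
        )
        # decoder: layers of increasing dimension back to the input size
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)       # encoded representation
        recon = self.decoder(code)   # reconstruction
        return recon, code

model = AutoEncoder()
x = torch.randn(16, 784)
recon, code = model(x)
recon_loss = nn.functional.mse_loss(recon, x)  # MSE between input and decoder output
```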

I want to limit the number of activated neurons at the encoder's output layer, in order to "simplify" the encoded data, but I am not sure that the steps I have followed are the right ones.
My idea was to count the number of times each neuron of the encoder's output layer is activated over the entire training dataset, compare these counts with the desired ones (a hyperparameter), and compute the sparsity loss from that comparison.
Then I fit the model batch by batch, epoch by epoch, each time adding the sparsity loss to the training loss before the backpropagation step.
The sparsity loss is updated after each epoch.
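
In (simplified) code, the sparsity computation looks roughly like this; the dimensions and the desired activation fraction are just placeholder values:

```python
import torch

latent_dim, num_samples = 32, 1000
desired_fraction = 0.1  # desired fraction of active neurons (hyperparameter)

# stand-in for the encoder outputs collected over the whole training set
codes = torch.relu(torch.randn(num_samples, latent_dim))

# count how often each latent neuron was active (> 0) over the dataset
activation_freq = (codes > 0).float().mean(dim=0)

# compare the measured frequencies with the desired one -> sparsity loss
sparsity_loss = (activation_freq - desired_fraction).abs().mean()

# during the next epoch, for every batch:
# loss = reconstruction_loss + sparsity_loss
```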

Following this procedure, I have not obtained very satisfactory results: it seems that the sparsity term has no effect on the training. So before I start looking for code bugs, I would like a confirmation (or a denial) that the high-level procedure I followed is correct.

Thank you!


How did you calculate the sparsity loss?
Based on your description it seems as if you’ve implemented a “counting” approach, which might not be differentiable.
To verify it, you could check if the sparsity loss has a valid .grad_fn.
If that’s not the case, your approach is currently just adding an offset to the loss, which won’t have an effect on the training.
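
E.g. a quick way to see the difference (the tensor here is just a stand-in for your encoder output):

```python
import torch

z = torch.randn(8, 16, requires_grad=True)  # stand-in for the encoder output

l1_loss = z.abs().mean()             # differentiable sparsity penalty
count_loss = (z > 0).float().mean()  # "counting" penalty

print(l1_loss.grad_fn)     # <MeanBackward0 object ...> -> gradients can flow
print(count_loss.grad_fn)  # None -> acts as a constant offset to the loss
```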

In case your sparsity loss doesn’t have a .grad_fn, you could maybe try to add an L1 penalty to the parameters of the last layer of your encoder, which might make them sparse.
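
A rough sketch of how that could look (the layer sizes and the penalty weight are placeholders):

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.randn(16, 784)
code = encoder(x)
recon = decoder(code)

mse_loss = nn.functional.mse_loss(recon, x)

# L1 penalty on the weights of the last encoder layer
l1_lambda = 1e-4  # penalty weight (hyperparameter)
l1_penalty = encoder[2].weight.abs().sum()

loss = mse_loss + l1_lambda * l1_penalty
loss.backward()
```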
