Sparsity constraint at the encoder output layer of an autoencoder always activates the same neuron

Hi everyone!

I am trying to implement an autoencoder with the following layout:

  • Input layer : 100 neurons
  • Hidden layer 1 : 40 neurons
  • Hidden layer 2 : 20 neurons
  • Hidden layer 3 == encoder output layer : 4 neurons
  • Hidden layer 4 : 20 neurons
  • Hidden layer 5 : 40 neurons
  • Output layer : 100 neurons

I use the MSE loss function, ReLU activation functions, and the Adam optimizer with weight decay.
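For context, the setup above can be sketched roughly as follows. This is a minimal sketch, not my exact code: the class name `SparseAutoencoder`, the learning rate, and the weight decay value are placeholders chosen for illustration, and applying ReLU after every layer (including the output layer) is an assumption taken from how `sparse_loss()` below walks through the layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseAutoencoder(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # 100 -> 40 -> 20 -> 4 (encoder output) -> 20 -> 40 -> 100
        sizes = [100, 40, 20, 4, 20, 40, 100]
        self.layers = nn.ModuleList(
            [nn.Linear(n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        )

    def forward(self, x):
        # ReLU after every layer (assumption, mirrors sparse_loss())
        for layer in self.layers:
            x = F.relu(layer(x))
        return x


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SparseAutoencoder().to(device)

# List of layers iterated over in sparse_loss(); index 2 is the 20 -> 4 layer
model_children = list(model.layers)

criterion = nn.MSELoss()
# lr and weight_decay are assumed values, not the ones from my experiments
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
```

With this layout, `model_children[2]` is the layer whose ReLU output has 4 values per sample, which is where the sparsity penalty is applied.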

I also want (actually, I need) to impose a sparsity constraint on hidden layer 3. That means that I would like to have just one active encoder output neuron for each input sample.
To do so, I have implemented an additive KL loss.
This is the code:

import torch
import torch.nn.functional as F

# kl_divergence() takes two input parameters, rho and rho_hat.
# rho is the sparsity parameter value (RHO) that we have initialized earlier.
# rho_hat is the output after going through the model layer-wise until hidden layer 3.
# The first calculation for rho_hat happens layer-wise in sparse_loss(). Then we pass the values as inputs to kl_divergence().
# `device` and `model_children` (the list of the model's layers) are defined earlier in the script.

# NOTE: the exact value used to keep rho_hat away from 0 and 1 is not shown in this post;
# 1e-6 is an assumed placeholder.
EPS = 1e-6

def kl_divergence(rho, rho_hat):
    # rho_hat has dimension (batch_size)x(4), 4 being the number of encoder output neurons

    # we mark each neuron as activated (1) or not (0) for every sample (remember: we have used ReLU)
    for i in range(rho_hat.shape[0]):
        for j in range(rho_hat.shape[1]):
            if rho_hat[i,j]>0:
                rho_hat[i,j]= 1
            else:
                rho_hat[i,j]= 0
    # compute the average activation frequency of each neuron.
    rho_hat = ((rho_hat.sum(dim=0))/rho_hat.shape[0])
    # Now rho_hat has 4 values, each of which is the average activation frequency of the corresponding neuron
    # we want to avoid the logarithm being infinity
    for i in range(len(rho_hat)): # len(rho_hat)==4
        if rho_hat[i]==0:
            rho_hat[i] = EPS
        elif rho_hat[i]==1:
            rho_hat[i] = 1 - EPS

    rho = torch.tensor([rho] * len(rho_hat)).to(device)
    # we return the KL divergence between rho and rho_hat.
    return torch.sum(rho * torch.log(rho/rho_hat) + (1 - rho) * torch.log((1 - rho)/(1 - rho_hat))).item()

# define the sparse loss function
def sparse_loss(rho, x, feat_name):
    values = x  # input data
    loss = 0
    for i in range(len(model_children)):
        values = model_children[i](values)
        values = F.relu(values) 
        if i==2:
            loss += kl_divergence(rho, values)
    return loss

So, for each batch, I count how many times neuron j is activated, compute the average over the batch, and then the corresponding loss. RHO is the desired activation frequency.
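Written out, this is what the code above is meant to compute. The symbols are my own notation and do not appear in the code: $B$ is the batch size, $a_{ij}$ is the ReLU output of encoder neuron $j$ for sample $i$, and $\rho$ is RHO.

```latex
\hat{\rho}_j = \frac{1}{B} \sum_{i=1}^{B} \mathbb{1}\left[ a_{ij} > 0 \right],
\qquad j = 1, \dots, 4

\mathcal{L}_{\text{sparse}} = \sum_{j=1}^{4}
\left[
  \rho \log \frac{\rho}{\hat{\rho}_j}
  + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_j}
\right]
```

The penalty is zero when every $\hat{\rho}_j$ equals $\rho$, that is, when each of the 4 neurons is active on a fraction $\rho$ of the samples in the batch.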
An alternative could be to compute the sparsity penalty over the entire dataset before the training phase starts, and then add it to the loss of each batch. I tried this as well, but the performance was, again, not satisfactory.
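The dataset-wide variant could look roughly like the sketch below. It is only an illustration under stated assumptions: `activation_frequency` and `kl_penalty` are hypothetical helper names, the random data and the freshly initialized encoder layers are stand-ins for my real dataset and model, and the clamp value `eps` is an assumed placeholder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


def activation_frequency(encoder_layers, loader):
    """Fraction of samples on which each encoder output neuron is > 0."""
    active = None
    total = 0
    with torch.no_grad():
        for (x,) in loader:
            for layer in encoder_layers:
                x = F.relu(layer(x))
            counts = (x > 0).float().sum(dim=0)  # per-neuron count in this batch
            active = counts if active is None else active + counts
            total += x.shape[0]
    return active / total


def kl_penalty(rho, rho_hat, eps=1e-6):
    """KL term between the target frequency rho and the observed rho_hat."""
    rho_hat = rho_hat.clamp(eps, 1 - eps)  # keep the logarithms finite
    rho = torch.full_like(rho_hat, rho)
    return torch.sum(rho * torch.log(rho / rho_hat)
                     + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat)))


# Stand-in data and encoder, only to make the sketch runnable
torch.manual_seed(0)
data = torch.randn(256, 100)
loader = DataLoader(TensorDataset(data), batch_size=32)
encoder_layers = [nn.Linear(100, 40), nn.Linear(40, 20), nn.Linear(20, 4)]

rho_hat = activation_frequency(encoder_layers, loader)
dataset_penalty = kl_penalty(0.25, rho_hat).item()  # a plain Python float
print(rho_hat, dataset_penalty)
```

Here `rho_hat` holds one activation frequency per encoder output neuron, so it also shows directly which neurons are never active.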

With this implementation and RHO=0.25, it turns out that the same neuron is always active, while the other 3 neurons out of 4 always output 0. This seems to me a strong limitation, obviously.
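One check that might help narrow this down is whether a penalty built this way is connected to the autograd graph at all. The sketch below uses a toy tensor rather than my actual model, and `w`, `act`, `hard` and `soft` are hypothetical names introduced only for this check. It compares a thresholded activation count with a plain mean activation.

```python
import torch

torch.manual_seed(0)
w = torch.randn(4, requires_grad=True)  # toy stand-in for trainable weights
act = torch.relu(w)                     # toy stand-in for the encoder output

# Thresholded count, as in kl_divergence(): the comparison yields a boolean
# tensor, which carries no gradient history.
hard = (act > 0).float().mean()

# Mean activation: stays connected to w through autograd.
soft = act.mean()

print(hard.requires_grad)
print(soft.requires_grad)
print(type(soft.item()))  # .item() returns a plain Python number
```

If the first flag is `False`, or the penalty is converted with `.item()` before being added to the batch loss, the sparsity term cannot contribute gradients to the weights, whatever value it takes.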

I would like to know if I am missing something, or if there is a logical bug in my code.

Edit: Maybe NN parameters other than the default ones could also help, but I am not sufficiently familiar with this topic. So any suggestion is welcome.