Has anyone implemented a sparse autoencoder with a KL divergence regularizer?


I am trying to implement a sparse autoencoder based on Andrew Ng's lecture notes, Sparse Autoencoder by Andrew Ng. Instead of getting mostly small hidden activations with a few larger ones, I am getting all small values. If anyone has done this before, perhaps you can point out some pitfalls I might be running into.

While on this topic, does anyone know how KL sparsity compares to using an L1 regularizer?
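For context, here is a minimal NumPy sketch of the KL sparsity penalty as I understand it from the notes (assuming a sigmoid hidden layer so activations lie in (0, 1); the function names and the clipping constant are my own):

```python
import numpy as np

def kl_sparsity_penalty(hidden_activations, rho=0.05):
    """KL-divergence sparsity penalty from the lecture notes.

    hidden_activations: (batch, n_hidden) sigmoid outputs in (0, 1).
    rho: target average activation (sparsity parameter).
    """
    # rho_hat_j: average activation of hidden unit j over the batch
    rho_hat = hidden_activations.mean(axis=0)
    # Clip to avoid log(0) when a unit saturates at 0 or 1
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)
    kl = (rho * np.log(rho / rho_hat)
          + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return kl.sum()

def kl_sparsity_grad(hidden_activations, rho=0.05):
    """Gradient of the penalty w.r.t. rho_hat, added to the
    hidden-layer delta during backpropagation."""
    rho_hat = np.clip(hidden_activations.mean(axis=0), 1e-8, 1 - 1e-8)
    return -rho / rho_hat + (1 - rho) / (1 - rho_hat)
```

The penalty is zero exactly when every unit's mean activation equals rho, and grows as units drift toward always-on or always-off; unlike an L1 penalty on the activations themselves, it constrains the *average* activation per unit rather than pushing every individual activation toward zero.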

If you have implemented a sparse autoencoder, could you share your code with us?