Unexpected sparse encodings

Hi! I trained an encoder-decoder with a simple CNN architecture, where the encoded / latent variable has shape 128x1x1. What I found surprising is that, after the model converged to a low loss and can produce mostly meaningful reconstructions, the latent variable is very sparse for any example I feed in (even ones it has seen during training): roughly 96 of the 128 latent values are zero.
I trained it using MSE loss with no regularization.
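For context, here is a simplified sketch of the kind of setup I mean, not my exact code (PyTorch assumed; the input size, layer sizes, and the ReLU on the bottleneck are illustrative), along with how I count the zeros:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Illustrative CNN encoder-decoder with a 128x1x1 bottleneck."""
    def __init__(self):
        super().__init__()
        # Encoder: downsample a 3x32x32 input to a 128x1x1 latent.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),    # 32x16x16
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),   # 64x8x8
            nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 128x4x4
            nn.ReLU(),
            nn.Conv2d(128, 128, 4),                      # 128x1x1
            nn.ReLU(),  # nonlinearity sits directly on the bottleneck
        )
        # Decoder: rough mirror of the encoder.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 128, 4),                      # 128x4x4
            nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 64x8x8
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # 32x16x16
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),    # 3x32x32
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# How I count (near-)zero entries in the latent for a single example:
model = ConvAutoencoder().eval()
x = torch.rand(1, 3, 32, 32)  # stand-in for one training example
with torch.no_grad():
    recon, z = model(x)
num_zeros = (z.abs() < 1e-6).sum().item()
print(f"{num_zeros} of {z.numel()} latent units are (near) zero")
```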
Why would this happen? Is there potentially a bug somewhere, or is this natural behavior?