Self-Normalizing Neural Networks (SNNs)

Hey guys,

I read the SNN paper and looked at the repo (it's written in TensorFlow), specifically at the Convolutional SNN notebook. The paper states that the SNN property was proven for feed-forward NNs (FNNs) using the Banach fixed-point theorem. After seeing the implementation in the repo I was hoping this would carry over to my CNNs. Sadly, there are too many open questions, and in my own trials it didn’t converge faster or more stably than training from scratch with other init schemes:

  1. To preserve the SNN property, do the biases of the incoming layers have to be equal to 0?
  2. When should SELUs be used? Only in the very low layers?
  3. Weight init looks like this:
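import math
import torch.nn as nn

def init_snn(m):
    if isinstance(m, nn.Conv2d):
        n = m.in_channels * m.kernel_size[0] * m.kernel_size[1]  # fan-in
        m.weight.data.normal_(0, math.sqrt(1. / n))  # zero mean, variance 1/n
        if hasattr(m.bias, 'data'):
            m.bias.data.zero_()  # assumption: zero-init the bias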
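For the dense (FNN) case that the paper's proof covers, an init with variance 1/n and zero bias can at least be sanity-checked numerically. Here's a rough sketch of how I'd do that. It's NumPy rather than PyTorch and uses dense layers, not convolutions, so it says nothing definitive about CNNs. The width, depth and batch size are arbitrary picks, and `selu` / `propagate` are just illustrative names, not anything from the repo.

```python
import numpy as np

# SELU constants as given in the SNN paper
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    # scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise;
    # the minimum() keeps expm1 from overflowing on large positive inputs
    return SCALE * np.where(x > 0, x, ALPHA * np.expm1(np.minimum(x, 0.0)))

def propagate(x, depth, rng):
    """Push x through `depth` dense SELU layers (weights ~ N(0, 1/n), no bias)
    and record the (mean, variance) of the activations after each layer."""
    stats = []
    for _ in range(depth):
        n = x.shape[1]  # fan-in of this layer
        w = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, n))
        x = selu(x @ w)
        stats.append((float(x.mean()), float(x.var())))
    return stats

rng = np.random.default_rng(0)
x0 = rng.normal(size=(512, 256))  # standardized inputs: mean 0, variance 1
stats = propagate(x0, depth=8, rng=rng)

for i, (m, v) in enumerate(stats, 1):
    print(f"layer {i}: mean={m:+.3f} var={v:.3f}")
```

If the self-normalizing property holds, the printed mean should stay close to 0 and the variance close to 1 at every depth. Swapping the dense layers for convolutions would be the obvious next experiment.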


If someone’s hot for experiments: try building whitening filters for decorrelation of variables. This could lead to better convergence.
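To make the whitening idea a bit more concrete, here's a minimal sketch of ZCA whitening on flat feature vectors. ZCA is just one possible whitening transform and my own choice for illustration. The function name `zca_whitening_matrix`, the `eps` value and the synthetic data are all made up, and turning this into actual conv filters (e.g. by whitening image patches) is left open.

```python
import numpy as np

def zca_whitening_matrix(x, eps=1e-5):
    """Return (mean, w) such that (x - mean) @ w has roughly identity covariance.

    x: array of shape (num_samples, num_features)
    eps: small constant that keeps tiny eigenvalues from blowing up
    """
    mean = x.mean(axis=0)
    xc = x - mean
    cov = xc.T @ xc / (x.shape[0] - 1)
    # cov is symmetric, so eigh gives real eigenvalues and orthonormal eigenvectors
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return mean, w

# Synthetic, strongly correlated features: independent noise plus a shared factor
rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 8)) + 2.0 * rng.normal(size=(2000, 1))

mean, w = zca_whitening_matrix(x)
x_white = (x - mean) @ w
cov_white = np.cov(x_white, rowvar=False)

print(np.round(np.corrcoef(x, rowvar=False), 2))  # before: large off-diagonal entries
print(np.round(cov_white, 2))                     # after: close to the identity
```

Whether decorrelated inputs actually help SELU networks converge is exactly the open experiment, so treat this as a starting point only.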