Gain value of kaiming_initialization

07hyx06 · September 8, 2019, 1:53pm

I know that the default initialization of layers is torch.nn.init.kaiming_uniform_(tensor, a=sqrt(5)),
where a is the gain value of the nonlinearity.

In my VGG, every nonlinearity is ReLU, so according to the Kaiming initialization paper I should set a=0. When I use that initialization, the loss goes to NaN.

But when I use the default initialization, my net trains successfully.

What’s the problem here? Why does PyTorch set the default gain to sqrt(5)?

ptrblck · September 9, 2019, 12:53pm

Have a look at this answer.
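
For reference, a minimal sketch of how a enters the computation. In torch.nn.init, a is documented as the negative slope of the leaky ReLU, and the gain is derived from it as sqrt(2 / (1 + a^2)); kaiming_uniform_ then samples from U(-bound, bound) with bound = gain * sqrt(3 / fan_in). The helper names and the fan_in value below are hypothetical and only mirror those documented formulas in plain Python; they are not PyTorch functions.

```python
import math


def leaky_relu_gain(a: float) -> float:
    # Documented gain for leaky_relu: sqrt(2 / (1 + negative_slope^2)).
    return math.sqrt(2.0 / (1.0 + a ** 2))


def kaiming_uniform_bound(fan_in: int, a: float) -> float:
    # Documented bound for kaiming_uniform_ in fan_in mode:
    # weights are drawn from U(-bound, bound), bound = gain * sqrt(3 / fan_in).
    return leaky_relu_gain(a) * math.sqrt(3.0 / fan_in)


# Hypothetical layer: a 3x3 convolution with 64 input channels.
fan_in = 3 * 3 * 64

default_bound = kaiming_uniform_bound(fan_in, math.sqrt(5))  # a = sqrt(5)
relu_bound = kaiming_uniform_bound(fan_in, 0.0)              # a = 0
ratio = relu_bound / default_bound

print(f"a=sqrt(5): bound = {default_bound:.5f}")
print(f"a=0:       bound = {relu_bound:.5f}")
print(f"ratio:     {ratio:.3f}")
```

With a=sqrt(5) the bound reduces to 1/sqrt(fan_in), while a=0 gives sqrt(6/fan_in), so the a=0 weights are drawn from a range about 2.45 times wider. Whether that alone explains the NaN loss in a particular VGG also depends on depth, learning rate, and whether normalization layers are used.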

07hyx06 · September 10, 2019, 11:49am

Thanks for solving my problem :laughing: