Default weight initialisation for Conv layers (including SELU)

Nikronic · June 28, 2020, 5:34am

Hi, Sorry for my late answer, I am struggling with final exams!

First, thank you for your deep explanation. Secondly, I do not have strong mathematical background, so I do not think I am eligble to validate this but sounds fine to me.

Another point I would like to mention is that PyTorch uses uniform for initializing weights in convs and linear layers so if gain in PReLU is identical to LeakyReLU, then to achieve the range of [-1/sqrt(fan_mode), 1/sqrt(fan_mode)] for uniform distribution, still we need to consider negative_slope=sqrt(5) where otherwise it will lead to a different scenario.

I think we need to discuss this as a feature request so the main developers can help us with that. So, I think it would be great idea to create an issue on github.
Here is another issue related to this idea that may help. Furthermore, this thread considered another perspective which I have no clue what they are talking about .

If you created the issue on github, could you please also tag me so I can keep track of things? my username is Nikronic

Thank you