I am reading the docs for initializing weights with torch.nn.init.kaiming_normal_ and I have trouble understanding the following: why is the default value for nonlinearity leaky_relu while the default value for a is 0.01?
Hi,
It might be a bit confusing, but the idea is that the default is indeed a regular relu; we also want users to be able to change a without having to change the nonlinearity flag. Hence the default there being leaky_relu.
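A small sketch of why this works, using torch.nn.init.calculate_gain to compare the gains (the tensor shape here is just an arbitrary example):

```python
import torch
import torch.nn as nn

# With the default nonlinearity='leaky_relu' and a=0, the gain
# sqrt(2 / (1 + a^2)) reduces to sqrt(2), i.e. the plain relu gain.
print(nn.init.calculate_gain('leaky_relu', 0))  # ~1.414 == sqrt(2)
print(nn.init.calculate_gain('relu'))           # ~1.414 == sqrt(2)

# So the default call behaves like relu initialization...
w = torch.empty(256, 128)
nn.init.kaiming_normal_(w)

# ...while a user can still pass a negative slope for an actual
# leaky relu without touching the nonlinearity flag:
nn.init.kaiming_normal_(w, a=0.01)
```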
@albanD Thanks for the reply! I had a typo in the OP. The default value for nonlinearity is leaky_relu, but the default value for a is 0.