Hi, I observed that the default CNN initialization has been changed.
In version 1.0 and above:
    def reset_parameters(self):
        n = self.in_channels
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in)
            init.uniform_(self.bias, -bound, bound)
However, in version 0.4:
    def reset_parameters(self):
        n = self.in_channels
        for k in self.kernel_size:
            n *= k
        stdv = 1. / math.sqrt(n)
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)
What is the motivation for these changes, and how should I understand
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
Why a=sqrt(5)?
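For what it's worth, here is a rough check of the numbers in plain Python (no torch needed). It is only a sketch: the helper names `old_bound` and `kaiming_uniform_bound` are made up for illustration, it assumes `groups=1` so that fan_in equals in_channels times the kernel size, and it assumes the documented kaiming_uniform_ formula for the default leaky_relu nonlinearity, i.e. gain = sqrt(2 / (1 + a^2)) and bound = gain * sqrt(3 / fan_in).

```python
import math

def old_bound(in_channels, kernel_size):
    # 0.4-style bound: 1 / sqrt(in_channels * prod(kernel_size))
    n = in_channels
    for k in kernel_size:
        n *= k
    return 1.0 / math.sqrt(n)

def kaiming_uniform_bound(fan_in, a):
    # Assumed formula (leaky_relu gain with negative slope a):
    #   gain  = sqrt(2 / (1 + a^2))
    #   bound = gain * sqrt(3 / fan_in)
    gain = math.sqrt(2.0 / (1.0 + a * a))
    return gain * math.sqrt(3.0 / fan_in)

# Example layer (arbitrary sizes, groups=1 assumed)
in_channels, kernel_size = 16, (3, 3)
fan_in = in_channels * kernel_size[0] * kernel_size[1]

old = old_bound(in_channels, kernel_size)
new = kaiming_uniform_bound(fan_in, math.sqrt(5))
print(old, new)
```

If I have the formula right, a=sqrt(5) makes the gain sqrt(1/3), so the weight bound reduces to 1/sqrt(fan_in) and the two versions appear to agree for this layer. That makes me wonder whether the change was meant to alter the initialization at all.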
Thanks for your time!