I wrote the following function for weight initialization:

```python
def _initialize_weights(self):
    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            print(m)
            n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
            m.weight.data.normal_(0, math.sqrt(2. / n))
            if m.bias is not None:
                m.bias.data.zero_()
        elif isinstance(m, nn.BatchNorm2d):
            print(m)
            m.weight.data.fill_(1)
            m.bias.data.zero_()
        elif isinstance(m, nn.BatchNorm1d):
            print(m)
            m.weight.data.fill_(1)
            m.bias.data.zero_()
```

Obviously, this function does not initialize `nn.Linear` layers. But when I call it in two different places, that is,

```python
...
self._initialize_weights()
self.fc = nn.Linear(in_feature, out_feature)
```

or

```python
...
self.fc = nn.Linear(in_feature, out_feature)
self._initialize_weights()
```

Intuitively, the two orderings should have the same effect; however, the results are different. Why? Please give some reasons if possible. Thank you!
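For reference, here is a minimal, self-contained sketch that reproduces the phenomenon under stated assumptions (the class name `Net` and the layer sizes are my own, not from the original model). It builds the same network twice from the same seed, once with each call order, and shows that the `fc` weights come out different — the two orderings consume the global random-number stream in a different order, even though `_initialize_weights` never touches `nn.Linear` directly:

```python
import math
import torch
import torch.nn as nn

class Net(nn.Module):
    # init_first=True mimics calling _initialize_weights() before creating self.fc;
    # init_first=False mimics calling it afterwards.
    def __init__(self, init_first):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)  # hypothetical sizes for illustration
        if init_first:
            self._initialize_weights()
            self.fc = nn.Linear(8, 2)
        else:
            self.fc = nn.Linear(8, 2)
            self._initialize_weights()

    def _initialize_weights(self):
        # Same Conv2d branch as in the question
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
                if m.bias is not None:
                    m.bias.data.zero_()

torch.manual_seed(0)
a = Net(init_first=True)
torch.manual_seed(0)
b = Net(init_first=False)

# The normal_() draws for the conv weights happen at different points in the
# RNG stream, so the Linear layer's own default initialization sees a
# different RNG state in each case and produces different weights.
print(torch.equal(a.fc.weight, b.fc.weight))
```

If this matches what you are seeing, the difference is not in which layers get initialized but in the state of the random generator when each layer draws its values.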