What is modifying my weight initialization?

I know that reset_parameters() in conv.py is responsible for the default weight initialization.

I changed the function to:

def reset_parameters(self):
    n = self.in_channels
    for k in self.kernel_size:
        n *= k
    stdv = 1. / math.sqrt(n)
    print('reset w, stdv=', stdv)
    self.weight.data.uniform_(-stdv, stdv)
    if self.bias is not None:
        print('reset b, stdv=', stdv)
        self.bias.data.uniform_(-stdv, stdv)
    print('w:', self.weight.data.norm(), 'b:', self.bias.data.norm())

After my model was created, I applied the following manual weights_init() to change the weight initialization of my conv layers:

import math

def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        print(m)
        print(m.weight.data.norm())
        print(m.bias.data.norm())

        # fan-in = in_channels * kernel_h * kernel_w, the same n as in reset_parameters()
        std_w = m.weight.size(1) * m.weight.size(2) * m.weight.size(3)
        std_b = m.weight.size(0)
        std_w = 1. / math.sqrt(std_w)
        # std_b = 1. / math.sqrt(std_b)
        std_b = std_w
        m.weight.data.uniform_(-std_w, std_w)
        m.bias.data.uniform_(-std_b, std_b)
        print(m.weight.data.norm())
        print(m.bias.data.norm())
        print('\n\n')
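
I apply it with Module.apply(), which calls the function on every submodule; a minimal sketch (MyNet is a placeholder for my actual model class):

model = MyNet()            # placeholder name for the real model
model.apply(weights_init)  # recursively runs weights_init on every submodule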

I think the two initializations should be equivalent, but reset_parameters() printed the following:

reset w, stdv= 0.19245008972987526
reset b, stdv= 0.19245008972987526
w: 4.651364750286455 b: 0.9658572124243059
reset w, stdv= 0.041666666666666664
reset b, stdv= 0.041666666666666664
w: 4.60668514079571 b: 0.19021859795685142
reset w, stdv= 0.041666666666666664
reset b, stdv= 0.041666666666666664
w: 6.529658534196003 b: 0.24801288097906313
reset w, stdv= 0.029462782549439483
reset b, stdv= 0.029462782549439483
w: 6.544403663970284 b: 0.20246035190569983
reset w, stdv= 0.029462782549439483
reset b, stdv= 0.029462782549439483
w: 9.237618805061214 b: 0.2699324704165474
reset w, stdv= 0.020833333333333332
reset b, stdv= 0.020833333333333332
w: 9.240560902888104 b: 0.18776950085546212
reset w, stdv= 0.020833333333333332
reset b, stdv= 0.020833333333333332
w: 9.23323252375467 b: 0.19598698034213305
reset w, stdv= 0.020833333333333332
reset b, stdv= 0.020833333333333332
w: 9.247914516750834 b: 0.19991090497324737
reset w, stdv= 0.020833333333333332
reset b, stdv= 0.020833333333333332
w: 13.062441360447233 b: 0.2709608856088436
reset w, stdv= 0.014731391274719742
reset b, stdv= 0.014731391274719742
w: 13.058955297303523 b: 0.19297756771652977
reset w, stdv= 0.014731391274719742
reset b, stdv= 0.014731391274719742
w: 13.064573213326009 b: 0.19342352625500445
reset w, stdv= 0.014731391274719742
reset b, stdv= 0.014731391274719742
w: 13.060771314609305 b: 0.1931597201764238
reset w, stdv= 0.014731391274719742
reset b, stdv= 0.014731391274719742
w: 13.068217941106957 b: 0.1944194648771781
reset w, stdv= 0.014731391274719742
reset b, stdv= 0.014731391274719742
w: 13.064472494773318 b: 0.1871614021517605
reset w, stdv= 0.014731391274719742
reset b, stdv= 0.014731391274719742
w: 13.065174600640301 b: 0.19473828458164352
reset w, stdv= 0.014731391274719742
reset b, stdv= 0.014731391274719742
w: 13.064107007871193 b: 0.18669129860732317

but during my weights_init(), it printed:

Conv2d (3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
2.5413928031921387
0.0
4.599651336669922
0.8222851753234863

Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
11.280376434326172
0.0
4.624712944030762
0.18499651551246643

Conv2d (64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
11.323299407958984
0.0
6.527068614959717
0.2626609206199646

Conv2d (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
16.010761260986328
0.0
6.516138553619385
0.18262024223804474

Conv2d (128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
16.00145149230957
0.0
9.22119426727295
0.2743944823741913
etc …

Obviously, the parameters were changed somewhere in between: for example, by the time weights_init() runs, every conv bias norm is already 0.
I don't know whether there is another place in the source code that modifies the conv weight initialization; an explanation would be much appreciated! My English is poor, I hope you can understand it :smile:
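
A sketch of how the change could be localized (MyNet is again a placeholder; this just prints the norms immediately after construction, before weights_init() is ever called):

import torch.nn as nn

model = MyNet()  # placeholder for the actual model class
# if the conv biases are already zero here, the model re-initializes
# its own layers inside __init__, after reset_parameters() has run
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        print(m, m.weight.data.norm(), m.bias.data.norm())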

I will be waiting for your reply online; this question has confused me for a few days. Thanks a lot!
Hoping for your insights. Any suggestion will be appreciated!

I know something now: the conv weights are re-initialized with kaiming_normal(mode="fan_out") and the conv biases are set to zero, but the linear layers keep the default weight initialization.
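
Many model definitions re-initialize their own layers from __init__ (torchvision's VGG does this in a _initialize_weights() method, for example). A minimal sketch of that pattern, matching what I observed (not the exact library code; kaiming_normal is the old spelling, renamed kaiming_normal_ in newer PyTorch):

import torch.nn as nn
import torch.nn.init as init

def _initialize_weights(self):
    # called at the end of the model's __init__, so it runs after every
    # layer's reset_parameters() and silently overrides the defaults
    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            init.kaiming_normal(m.weight.data, mode='fan_out')
            if m.bias is not None:
                m.bias.data.zero_()
        # nn.Linear modules are not touched here, so they keep the
        # default uniform initialization from reset_parameters()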

I verified this by hand, layer by layer, using the methods implemented in nn.init.py.
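
For example, one such check (model.features[0] is an assumed path to the first conv layer; kaiming_normal with mode='fan_out' gives std = sqrt(2 / fan_out)):

import math

w = model.features[0].weight.data            # assumed path to the first conv
fan_out = w.size(0) * w.size(2) * w.size(3)  # out_channels * kernel_h * kernel_w
print('empirical std:', w.std(), 'expected std:', math.sqrt(2.0 / fan_out))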