Default Weight Initialization vs Xavier Initialization

Hi, the question is very basic. PyTorch uses a default weight initialization method as discussed here, but it also provides a way to initialize weights using the Xavier equation. In many places 1, 2 the default method is also referred to as Xavier's. Can anyone explain where I am going wrong? Any help is much appreciated.

The post is from January 2018 and outdated by now.
You can find the current weight init here, which is init.kaiming_uniform_(self.weight, a=math.sqrt(5)) for the weight parameter.
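For intuition, here is a small pure-Python sketch of what that call computes, assuming PyTorch's documented formula for kaiming_uniform_ (leaky-ReLU gain sqrt(2 / (1 + a^2)) and bound sqrt(3) * gain / sqrt(fan_in)); the fan_in value of 100 is just an illustrative choice:

```python
import math

def kaiming_uniform_bound(fan_in, a=math.sqrt(5)):
    # kaiming_uniform_ samples from U(-bound, bound), where
    # gain = sqrt(2 / (1 + a^2))  (leaky_relu gain with negative slope a)
    # bound = sqrt(3) * gain / sqrt(fan_in)
    gain = math.sqrt(2.0 / (1.0 + a ** 2))
    return math.sqrt(3.0) * gain / math.sqrt(fan_in)

# With a = sqrt(5), the bound simplifies to 1 / sqrt(fan_in):
print(kaiming_uniform_bound(100))  # ≈ 0.1
print(1.0 / math.sqrt(100))        # 0.1
```

So the default Linear init is effectively U(-1/sqrt(fan_in), 1/sqrt(fan_in)).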


Great! Thanks :) One small clarification: is the method mentioned here actually Xavier initialization?

This doesn't seem to be a Xavier init, as only fan_in is used, while xavier_uniform_ uses a different scaling factor and the sum of the number of input and output features.
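The difference is easy to see numerically. Below is a hedged pure-Python comparison of the two uniform bounds, assuming PyTorch's documented formulas (xavier_uniform_ uses bound = gain * sqrt(6 / (fan_in + fan_out)), while the default kaiming_uniform_ with a=sqrt(5) reduces to 1 / sqrt(fan_in)); the fan sizes are arbitrary illustrative values:

```python
import math

def default_linear_bound(fan_in, a=math.sqrt(5)):
    # Default nn.Linear init: kaiming_uniform_ with a = sqrt(5),
    # which depends on fan_in only.
    gain = math.sqrt(2.0 / (1.0 + a ** 2))
    return math.sqrt(3.0) * gain / math.sqrt(fan_in)

def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    # xavier_uniform_ uses BOTH fan_in and fan_out:
    # bound = gain * sqrt(6 / (fan_in + fan_out))
    return gain * math.sqrt(6.0 / (fan_in + fan_out))

fan_in, fan_out = 100, 50
print(default_linear_bound(fan_in))           # ≈ 0.1
print(xavier_uniform_bound(fan_in, fan_out))  # ≈ 0.2
```

For this layer shape the two schemes give different bounds, so the default is not Xavier initialization, even though both draw from a uniform distribution.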


Thanks 🙂