Default Weight Initialization vs Xavier Initialization

Vigneswaran · July 16, 2019, 3:38am

Hi, this is a very basic question. PyTorch uses a default weight initialization method, as discussed here, but it also provides a way to initialize weights using the Xavier equation. In many places (1, 2) the default method is also referred to as Xavier’s. Can anyone explain where I am going wrong? Any help is much appreciated.

ptrblck · July 16, 2019, 10:01am

The post is from January 2018 and outdated by now.
You can find the current weight init here, which is init.kaiming_uniform_(self.weight, a=math.sqrt(5)) for the weights.
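
As a rough sketch of what that call works out to, assuming the formula given in the torch.nn.init documentation for kaiming_uniform_ (gain = sqrt(2 / (1 + a^2)) for the default leaky_relu nonlinearity, bound = gain * sqrt(3 / fan_in)): with a = math.sqrt(5) the bound simplifies to 1 / sqrt(fan_in). The helper name and the fan_in value below are made up for illustration, and the snippet only uses the standard library rather than PyTorch itself.

```python
import math

def kaiming_uniform_bound(fan_in, a=0.0):
    """Hypothetical helper: bound of U(-bound, bound) following the
    documented kaiming_uniform_ formula (leaky_relu gain, fan_in mode)."""
    gain = math.sqrt(2.0 / (1.0 + a ** 2))  # leaky_relu gain with negative slope a
    std = gain / math.sqrt(fan_in)          # target standard deviation
    return math.sqrt(3.0) * std             # uniform bound that gives this std

fan_in = 128  # illustrative value, e.g. in_features of a linear layer
bound = kaiming_uniform_bound(fan_in, a=math.sqrt(5))

# With a = sqrt(5): gain = sqrt(1/3), so bound = sqrt(1/3) * sqrt(3 / fan_in)
print(bound, 1.0 / math.sqrt(fan_in))  # the two values agree up to rounding
```

So the default weights are drawn uniformly from roughly (-1/sqrt(fan_in), 1/sqrt(fan_in)), and only fan_in enters the formula.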

Vigneswaran · July 17, 2019, 10:12am

Great! Thanks :) One small clarification: is the method mentioned here actually Xavier initialization?

ptrblck · July 17, 2019, 10:17am

This doesn’t seem to be a Xavier init, as only fan_in is used, while xavier_uniform_ uses a different scaling factor and the sum of the number of input and output features (fan_in + fan_out).
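
To make the difference concrete, here is a small sketch comparing the two uniform bounds. It assumes the documented xavier_uniform_ bound, gain * sqrt(6 / (fan_in + fan_out)), and the 1 / sqrt(fan_in) bound that kaiming_uniform_ with a=math.sqrt(5) reduces to. The function names and layer shapes are hypothetical and chosen only for illustration.

```python
import math

def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Hypothetical helper: documented xavier_uniform_ bound."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))

def default_uniform_bound(fan_in):
    """Hypothetical helper: bound implied by kaiming_uniform_ with a=sqrt(5)."""
    return 1.0 / math.sqrt(fan_in)

# Illustrative (fan_in, fan_out) pairs, not taken from any real model
for fan_in, fan_out in [(128, 64), (64, 128), (256, 256)]:
    xavier = xavier_uniform_bound(fan_in, fan_out)
    default = default_uniform_bound(fan_in)
    print(f"fan_in={fan_in} fan_out={fan_out} xavier={xavier:.4f} default={default:.4f}")
```

Swapping fan_in and fan_out leaves the Xavier bound unchanged but changes the default one, which is one way to see that the two schemes are not the same.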

Vigneswaran · July 17, 2019, 10:23am

Thanks