How does PyTorch initialise the weights and biases by default?

Jeet · September 19, 2022, 8:14pm

I would like to know which distribution PyTorch currently uses for the default initialisation of the weights and biases. For example, when I write something like:

    self.neural = nn.Sequential(nn.Linear(in_features, 4))

I need the mathematical formulation of the initialisation.

ptrblck · September 19, 2022, 9:49pm

The parameters are initialized in the reset_parameters method of the corresponding layer. For nn.Linear you can find the method here:

    def reset_parameters(self) -> None:
        # Setting a=sqrt(5) in kaiming_uniform is the same as initializing with
        # uniform(-1/sqrt(in_features), 1/sqrt(in_features)). For details, see
        # https://github.com/pytorch/pytorch/issues/57109
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in) if fan_in > 0 else 0
            init.uniform_(self.bias, -bound, bound)
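To spell out the math behind the code comment: kaiming_uniform_ samples from uniform(-bound, bound) with bound = gain * sqrt(3 / fan_in), and for the default leaky_relu nonlinearity gain = sqrt(2 / (1 + a^2)). With a = sqrt(5) the gain is sqrt(1/3), so the bound reduces to 1 / sqrt(fan_in), where fan_in equals in_features for nn.Linear. Both weight and bias are therefore drawn from U(-1/sqrt(in_features), 1/sqrt(in_features)). Here is a minimal sketch that checks this numerically; the layer sizes and the small float tolerance are arbitrary choices made for illustration and are not part of PyTorch:

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

# Arbitrary sizes, chosen only for this illustration.
in_features, out_features = 256, 128
layer = nn.Linear(in_features, out_features)

# Weight bound as computed by kaiming_uniform_ with a = sqrt(5):
# gain = sqrt(2 / (1 + a^2)), bound = gain * sqrt(3 / fan_in).
a = math.sqrt(5)
gain = math.sqrt(2.0 / (1 + a ** 2))
weight_bound = gain * math.sqrt(3.0 / in_features)

# Bias bound as computed in reset_parameters.
bias_bound = 1 / math.sqrt(in_features)

# Small tolerance (an assumption of this sketch) to absorb float32 rounding.
tol = 1e-6

print("weight bound:", weight_bound)
print("bias bound:  ", bias_bound)
print("weights within bound:", layer.weight.abs().max().item() <= weight_bound + tol)
print("biases within bound: ", layer.bias.abs().max().item() <= bias_bound + tol)
```

The two bounds should agree up to floating-point rounding, and all sampled parameters should lie inside them.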