Hi,
I was wondering what the default weight initialization is in libtorch for, say, a convolutional layer. Is it just random?
thanks,
Sushmita
The parameters are initialized via the reset_parameters() function, both in libtorch and in the Python frontend. You could check the function for the corresponding layer, e.g. reset_parameters() in conv.h, to see the actual initialization.
Thank you! I see, so it is kaiming_uniform. I looked at the source code in init.cpp, and it uses the calculate_kaiming_std function, which seems to be consistent with the PyTorch 2.3 documentation for torch.nn.init.
The libtorch documentation is quite bare-bones, so I assume the above is what libtorch actually does.
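To make the default concrete, here is a small sketch of the arithmetic behind kaiming_uniform, using only the standard library. It assumes the convolution layer calls kaiming_uniform with a = sqrt(5), which is what reset_parameters() in conv.h appears to do; the function names below are hypothetical helpers for illustration, not libtorch functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: fan_in of a 2D conv weight of shape
// [out_channels, in_channels / groups, kernel_h, kernel_w].
std::int64_t conv2d_fan_in(std::int64_t in_channels, std::int64_t groups,
                           std::int64_t kernel_h, std::int64_t kernel_w) {
    assert(groups > 0 && in_channels % groups == 0);
    return (in_channels / groups) * kernel_h * kernel_w;
}

// Gain for leaky_relu with negative slope a: sqrt(2 / (1 + a^2)).
double leaky_relu_gain(double a) {
    return std::sqrt(2.0 / (1.0 + a * a));
}

// Kaiming-uniform samples from U(-bound, bound), where
// std = gain / sqrt(fan_in) and bound = sqrt(3) * std.
double kaiming_uniform_bound(double a, double fan_in) {
    assert(fan_in > 0.0);
    const double std_dev = leaky_relu_gain(a) / std::sqrt(fan_in);
    return std::sqrt(3.0) * std_dev;
}
```

With a = sqrt(5) the gain is sqrt(1/3), so the bound simplifies to 1 / sqrt(fan_in). For a 3x3 convolution with 3 input channels, fan_in is 27 and the weights would be drawn from roughly U(-0.19, 0.19).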
I would like to try a custom initialization since I feel my init parameters are not optimal (too big). I see a “hockey stick” profile in the loss where there is a big jump in the first iteration and then small bumps here and there.
Would you be able to share some code showing how to apply a custom initialization to the weights?
thanks,
Sushmita