There have been quite a few posts on weight initialization, and there is still some confusion.
My aim is to re-initialize only the fc weights and biases several times during training using model.apply() (I know this sounds kinky, but I am doing it to investigate the training stochastics). I am not sure, however, whether this initialization should also be applied to the batch normalization layers. BTW, I am freezing the other layers after warming the model up for some epochs. Here is the code I have in mind:
from torchvision import models
model = models.resnet18(pretrained=True)
From the available initialization methods, I want to use Xavier's. This is what I came up with:
model.fc.weight = torch.nn.Parameter(torch.nn.init.xavier_uniform_(model.fc.weight))
model.fc.bias = torch.nn.Parameter(torch.nn.init.xavier_uniform_(model.fc.bias))
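Since I want to call this via model.apply(), the init function I have in mind looks something like this (a sketch; note that the nn.init functions work in place, so no re-wrapping in nn.Parameter should be needed, and I zero the bias here only because Xavier requires a tensor with at least 2 dimensions):

```python
import torch
from torch import nn

def reinit_fc(m):
    # Re-initialize Linear layers only; conv and BatchNorm layers are
    # left untouched (and are frozen anyway after the warm-up).
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)  # assumption: zero bias instead of Xavier

# during training, every time I want to re-initialize:
# model.apply(reinit_fc)
```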
Any comments or corrections on what I am doing? Especially with regard to batch normalization.
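In case the answer is that the BN layers should be reset as well, I assume something like this (using BatchNorm's built-in reset methods) would be the way to do it:

```python
import torch
from torch import nn

def reset_bn(m):
    # reset_parameters() re-initializes the affine weight/bias (to 1/0)
    # and also resets the running mean/var statistics.
    if isinstance(m, nn.BatchNorm2d):
        m.reset_parameters()

# model.apply(reset_bn)
```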
NB. I am getting this error for the bias:
ValueError: Fan in and fan out can not be computed for tensor with fewer than 2 dimensions
To get around this error, I had to duplicate the bias tensor along a new dimension, as follows:
x = torch.nn.init.xavier_uniform_(model.fc.bias.repeat(2, 1))
model.fc.bias = torch.nn.Parameter(x[0, :])
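A cleaner alternative I have been considering, if Xavier on the bias is really wanted: sample a temporary 2-D tensor with Xavier and copy its single row into the existing bias, so the repeat trick and the new Parameter are not needed (a sketch with a hypothetical stand-alone bias; in my code this would be model.fc.bias):

```python
import torch

# hypothetical stand-in for model.fc.bias: a 1-D Parameter of size 10
bias = torch.nn.Parameter(torch.empty(10))

# Initialize a temporary (1, n) tensor with Xavier, then copy its
# single row into the existing bias under no_grad.
with torch.no_grad():
    tmp = torch.nn.init.xavier_uniform_(torch.empty(1, bias.numel()))
    bias.copy_(tmp.squeeze(0))
```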