There have been quite a few posts on weight initialization, and there is still some confusion.
My aim is to re-initialize only the fc weights and biases several times during training using model.apply() (I know this sounds kinky, but I am doing it to investigate the training stochastics). I am not sure, however, whether this initialization should also be applied to the batch normalization layers. BTW, I am freezing the other layers after warming the model up for some epochs. Here is the code I have in mind:
from torchvision import models
model = models.resnet18(pretrained=True)
From the available initialization methods, I want to use Xavier's. This is what I came up with:
model.fc.weight = torch.nn.Parameter(torch.nn.init.xavier_uniform_(model.fc.weight))
model.fc.bias = torch.nn.Parameter(torch.nn.init.xavier_uniform_(model.fc.bias))
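Since I want to call this via model.apply(), the init function I have in mind looks something like this (a sketch; note that the nn.init functions work in place, so no re-wrapping in nn.Parameter should be needed, and I zero the bias here only because Xavier requires a tensor with at least 2 dimensions):

```python
import torch
from torch import nn

def reinit_fc(m):
    # Re-initialize Linear layers only; conv and BatchNorm layers are
    # left untouched (and are frozen anyway after the warm-up).
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)  # assumption: zero bias instead of Xavier

# during training, every time I want to re-initialize:
# model.apply(reinit_fc)
```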
Any comments or corrections on what I am doing? Especially with regard to batch normalization.
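In case the answer is that the BN layers should be reset as well, I assume something like this (using BatchNorm's built-in reset methods) would be the way to do it:

```python
import torch
from torch import nn

def reset_bn(m):
    # reset_parameters() re-initializes the affine weight/bias (to 1/0)
    # and also resets the running mean/var statistics.
    if isinstance(m, nn.BatchNorm2d):
        m.reset_parameters()

# model.apply(reset_bn)
```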
NB. I am getting this error for the bias:
ValueError: Fan in and fan out can not be computed for tensor with fewer than 2 dimensions
To get around this error, I had to duplicate the bias tensor along a new dimension, as follows:
x = torch.nn.init.xavier_uniform_(model.fc.bias.repeat(2, 1))
model.fc.bias = torch.nn.Parameter(x[0, :])
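A cleaner alternative I have been considering, if Xavier on the bias is really wanted: sample a temporary 2-D tensor with Xavier and copy its single row into the existing bias, so the repeat trick and the new Parameter are not needed (a sketch with a hypothetical stand-alone bias; in my code this would be model.fc.bias):

```python
import torch

# hypothetical stand-in for model.fc.bias: a 1-D Parameter of size 10
bias = torch.nn.Parameter(torch.empty(10))

# Initialize a temporary (1, n) tensor with Xavier, then copy its
# single row into the existing bias under no_grad.
with torch.no_grad():
    tmp = torch.nn.init.xavier_uniform_(torch.empty(1, bias.numel()))
    bias.copy_(tmp.squeeze(0))
```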