Xavier initialization on a pretrained network

I want to use the VGG16 pretrained network to classify a dataset of 1000 images into two classes. I also want the weights in the classifier part of the network to be initialized by Xavier initialization. How can I access the weights of the classifier part?

I have tried the following lines of code, but “model.classifier.parameters()” doesn’t seem to be the right way to reach the weights of the classifier part.

    import torch.nn as nn
    from torchvision import models

    model = models.vgg16(pretrained=True)
    num_features = model.classifier[6].in_features
    # Replace the final 1000-class layer with a 2-class layer
    features = list(model.classifier.children())[:-1]
    features.extend([nn.Linear(num_features, 2)])
    model.classifier = nn.Sequential(*features)
    # This call fails: xavier_normal expects a tensor, not the parameters() generator
    nn.init.xavier_normal(model.classifier.parameters())

Thanks!

    for p in model.classifier.parameters():
        nn.init.xavier_normal_(p)

(For PyTorch versions before 0.4, use nn.init.xavier_normal without the trailing _.)
Note that an alternative is to explicitly target only the weights, and initialize the bias to 0 or something similar:

    nn.init.xavier_normal_(features[-1].weight)
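For example, a minimal sketch of that weight-only approach over the rebuilt classifier (the zero bias and the isinstance check on nn.Linear are just one common convention, not the only option):

    import torch.nn as nn

    # Xavier-initialize only the weight matrices of the Linear layers
    # in the rebuilt classifier; set their biases to zero.
    for m in model.classifier.modules():
        if isinstance(m, nn.Linear):
            nn.init.xavier_normal_(m.weight)
            if m.bias is not None:
                nn.init.constant_(m.bias, 0.0)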

Best regards

Thomas


Thanks a lot Tom, I am not sure I understand the last part of your answer. The lines you suggested will initialize the weights in the classifier part according to Xavier, but not the bias part, which is not stored as part of model.classifier.parameters()? Is the bias stored in model.features[-1].weight?
When using Xavier, is it common to initialize the bias to 0? Or also with Xavier?

Hope you will be able to help me understand this part. Thanks a lot!

No, the bias is stored in .bias, but it is part of the parameters. My understanding is that Xavier is more commonly applied to weights than to biases, but I don’t have a ton of references to back up that claim.
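As a quick illustration (the exact parameter names depend on the indices in your rebuilt nn.Sequential, so treat them as an example):

    # Both weight and bias tensors are returned by .parameters();
    # named_parameters() makes the distinction visible.
    for name, p in model.classifier.named_parameters():
        print(name, tuple(p.shape))  # e.g. "0.weight", "0.bias", ..., "6.weight", "6.bias"

    # A single layer also exposes them as separate attributes:
    last = model.classifier[-1]
    print(last.weight.shape, last.bias.shape)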

Best regards

Thomas
