Question about loading a pretrained model

cxy94 · January 8, 2019, 5:18am

Hi guys,
I ran into a problem when loading the pretrained VGG model parameters provided by torchvision.
In the vgg16 provided by torchvision, all the conv layers are wrapped in self.features, and all the FC layers in self.classifier.
However, I didn’t wrap the conv layers the way torchvision does.
I wrapped all the conv layers in 5 stages called stage1, stage2, stage3, stage4 and stage5.
For example, the first conv layer in vgg16, ‘Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))’, is called vgg.features[0] in torchvision’s vgg, but in my net it is called vgg.stage1[0].
So .load_state_dict won’t work, because the names of the layers aren’t the same.
What should I do to load the pretrained parameters?
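
For illustration: load_state_dict matches parameters by key name, so the mismatch shows up when the two key sets are compared. The sketch below uses hand-written key lists for the first conv layer only (the stage1 names follow the description above). With real models, the keys would come from state_dict().keys().

```python
# Keys for the first conv layer only, written by hand for illustration.
torchvision_keys = ["features.0.weight", "features.0.bias"]
custom_keys = ["stage1.0.weight", "stage1.0.bias"]

# In its default strict mode, load_state_dict needs both key sets to match.
# Keys the custom net expects but the pretrained dict does not contain:
missing_keys = sorted(set(custom_keys) - set(torchvision_keys))
# Keys in the pretrained dict that the custom net does not know:
unexpected_keys = sorted(set(torchvision_keys) - set(custom_keys))

print("missing:", missing_keys)
print("unexpected:", unexpected_keys)
```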

cxy94 · January 8, 2019, 6:06am

Thanks to Alpha’s Topic, I got an answer.

import torchvision

# `models` is the custom module that defines the staged vgg16
# (stage1 ... stage5); it is not torchvision.models.
vgg_torch = torchvision.models.vgg16(pretrained=True)

vgg_my = models.vgg16(pretrained=False)
vgg_my_dict = vgg_my.state_dict()

# The first 26 entries are the weights and biases of the 13 conv layers in
# vgg_torch.features; the classifier entries after them are skipped.
# In Python 3, items() has to be wrapped in list() before it can be sliced.
for k, v in list(vgg_torch.state_dict().items())[:26]:
    _, idx, name = k.split('.')  # e.g. 'features.0.weight'
    idx = int(idx)
    if 0 <= idx <= 2:
        my_key = 'stage1.' + str(idx) + '.' + name
    elif 5 <= idx <= 7:
        my_key = 'stage2.' + str(idx - 5) + '.' + name
    elif 10 <= idx <= 14:
        my_key = 'stage3.' + str(idx - 10) + '.' + name
    elif 17 <= idx <= 21:
        my_key = 'stage4.' + str(idx - 17) + '.' + name
    elif 24 <= idx <= 28:
        my_key = 'stage5.' + str(idx - 24) + '.' + name
    else:
        continue
    vgg_my_dict[my_key] = v

vgg_my.load_state_dict(vgg_my_dict)

cxy94 · January 8, 2019, 6:18am

I used a straightforward way to convert each key in vgg_torch.state_dict() to the key of the same layer in vgg_my. You can choose a better way if you want.
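
One possible cleaner variant, as a sketch only: keep the index range and stage name of each block in a table and translate keys with a small helper. The names STAGES, remap_key and remap_state_dict are hypothetical, and the index ranges are copied from the conversion code earlier in the thread. The helpers work on plain strings and dicts, so they can be checked without loading any weights.

```python
# (first index in torchvision's vgg16.features, last index, stage name).
# Ranges copied from the conversion loop in the thread; names are illustrative.
STAGES = [
    (0, 2, "stage1"),
    (5, 7, "stage2"),
    (10, 14, "stage3"),
    (17, 21, "stage4"),
    (24, 28, "stage5"),
]


def remap_key(key):
    """Map 'features.<idx>.<name>' to '<stage>.<idx - start>.<name>'.

    Returns None for keys outside the listed ranges (e.g. classifier keys).
    """
    parts = key.split(".")
    if len(parts) != 3 or parts[0] != "features":
        return None
    idx, name = int(parts[1]), parts[2]
    for start, end, stage in STAGES:
        if start <= idx <= end:
            return "{}.{}.{}".format(stage, idx - start, name)
    return None


def remap_state_dict(state_dict):
    """Return a new dict with translated keys; unmapped entries are dropped."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = remap_key(key)
        if new_key is not None:
            remapped[new_key] = value
    return remapped


print(remap_key("features.0.weight"))
print(remap_key("features.28.bias"))
```

With the models from the post above, this could be used as vgg_my_dict.update(remap_state_dict(vgg_torch.state_dict())) followed by vgg_my.load_state_dict(vgg_my_dict).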