I came across some problems when I tried to fine-tune a VGG19 network on ImageNet. I load the model's parameters from "vgg19-dcbb9e9d.pth" by:
model.load_state_dict(torch.load("./model_parameters/vgg19-dcbb9e9d.pth"))
I have verified that the parameters were successfully loaded at the beginning of the training procedure (after loading the model, of course); I printed them to make sure. Yet the performance (cross-entropy loss) was almost the same as that of a network initialized from a normal distribution, around 6.9 or 7.0, which is roughly chance level for 1000 classes (ln 1000 ≈ 6.9)! I am confused about why the model was loaded correctly yet failed to perform well.
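For reference, the check I ran is roughly like the following sketch, which compares one parameter tensor of the model against the checkpoint file directly ("features.0.weight" is the torchvision state_dict key for the first conv layer):

import torch
import vgg  # my local module containing the vgg19 builder shown below

# load the raw checkpoint and the model separately
checkpoint = torch.load("./model_parameters/vgg19-dcbb9e9d.pth")
net = vgg.vgg19(pretrained=True)

# compare the first conv layer's weights against the checkpoint entry
same = torch.equal(net.state_dict()["features.0.weight"],
                   checkpoint["features.0.weight"])
print(same)  # True, so the weights really are in the model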
My code: 1) building the network ("vgg.vgg19" is used later):
def vgg19(pretrained=False, **kwargs):
    """VGG 19-layer model (configuration "E")

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = VGG(make_layers(cfg['E']), **kwargs)
    if pretrained:
        model.load_state_dict(torch.load("./model_parameters/vgg19-dcbb9e9d.pth"))
    return model
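A quick way to reproduce the starting loss I mention above is to evaluate the freshly loaded network on one batch before any optimizer step, roughly like this sketch (eval mode disables dropout; dataloader_train is the same loader used in the training loop below):

import torch.nn as nn
from torch.autograd import Variable

# sketch: loss of the untouched pretrained weights on a single batch
net = vgg.vgg19(pretrained=True).cuda()
net.eval()  # disable dropout so the measurement is fair

criterion = nn.CrossEntropyLoss()
data = next(iter(dataloader_train))
inputs = Variable(data["image"].cuda(), volatile=True)  # no grad needed
labels = Variable(data["label"].cuda()).view(-1)
print(criterion(net(inputs), labels).data[0])  # ~6.9, i.e. chance level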
2) the training code:
fine_tuning = True
net = vgg.vgg19(pretrained=fine_tuning)
net.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

for epoch in range(10):  # loop over the dataset multiple times
    running_loss = 0.0
    adjust_learning_rate(optimizer, epoch)
    for i, data in enumerate(dataloader_train, 0):
        # get the inputs
        inputs, labels = data["image"], data["label"]
        # wrap them in Variable
        inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
        labels = labels.view(-1)
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()  # change the gradient flow here
        optimizer.step()
        # print statistics
        print(loss.data[0])
        running_loss += loss.data[0]
        if i % 200 == 199:  # print every 200 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 200))
            running_loss = 0.0
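For completeness, adjust_learning_rate is not shown above. A standard step-decay helper, in the style of the official PyTorch ImageNet example, looks roughly like this (the base_lr default here is a sketch matching the optimizer setting above; my actual helper may differ):

# sketch of a typical step-decay helper, modeled on the official
# PyTorch ImageNet example; not necessarily my exact implementation
def adjust_learning_rate(optimizer, epoch, base_lr=0.001):
    """Decay the learning rate by 10x every 30 epochs."""
    lr = base_lr * (0.1 ** (epoch // 30))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr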
Thank you for reading my question! I would sincerely appreciate any help or inspiration.
Thanks~