I came across some problems when I tried to fine-tune a VGG19 network on ImageNet. I load the model's parameters from "vgg19-dcbb9e9d.pth" by:
model.load_state_dict(torch.load("./model_parameters/vgg19-dcbb9e9d.pth"))
I have verified that the parameters were successfully loaded at the beginning of the training procedure (after loading the model, of course); I printed them to make sure. Yet the performance (cross-entropy loss) was almost the same as that of a network initialized from a normal distribution, around 6.9 or 7.0, which is roughly chance level for 1000 classes (ln 1000 ≈ 6.9)! I am confused about why the model was loaded correctly yet failed to perform well.
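For reference, the check I ran is roughly like the following sketch, which compares one parameter tensor of the model against the checkpoint file directly ("features.0.weight" is the torchvision state_dict key for the first conv layer):

import torch
import vgg  # my local module containing the vgg19 builder shown below

# load the raw checkpoint and the model separately
checkpoint = torch.load("./model_parameters/vgg19-dcbb9e9d.pth")
net = vgg.vgg19(pretrained=True)

# compare the first conv layer's weights against the checkpoint entry
same = torch.equal(net.state_dict()["features.0.weight"],
                   checkpoint["features.0.weight"])
print(same)  # True, so the weights really are in the model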
My code: 1) building the network ("vgg.vgg19" is used later):
def vgg19(pretrained=False, **kwargs):
    """VGG 19-layer model (configuration "E")

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = VGG(make_layers(cfg['E']), **kwargs)
    if pretrained:
        model.load_state_dict(torch.load("./model_parameters/vgg19-dcbb9e9d.pth"))
    return model
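A quick way to reproduce the starting loss I mention above is to evaluate the freshly loaded network on one batch before any optimizer step, roughly like this sketch (eval mode disables dropout; dataloader_train is the same loader used in the training loop below):

import torch.nn as nn
from torch.autograd import Variable

# sketch: loss of the untouched pretrained weights on a single batch
net = vgg.vgg19(pretrained=True).cuda()
net.eval()  # disable dropout so the measurement is fair

criterion = nn.CrossEntropyLoss()
data = next(iter(dataloader_train))
inputs = Variable(data["image"].cuda(), volatile=True)  # no grad needed
labels = Variable(data["label"].cuda()).view(-1)
print(criterion(net(inputs), labels).data[0])  # ~6.9, i.e. chance level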
2) the training code:
fine_tuning = True
net = vgg.vgg19(pretrained=fine_tuning)
net.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

for epoch in range(10):  # loop over the dataset multiple times
    running_loss = 0.0
    adjust_learning_rate(optimizer, epoch)
    for i, data in enumerate(dataloader_train, 0):
        # get the inputs
        inputs, labels = data["image"], data["label"]
        # wrap them in Variable
        inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
        labels = labels.view(-1)
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()  # change the gradient flow here
        optimizer.step()
        # print statistics
        print(loss.data[0])
        running_loss += loss.data[0]
        if i % 200 == 199:  # print every 200 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 200))
            running_loss = 0.0
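For completeness, adjust_learning_rate is not shown above. A standard step-decay helper, in the style of the official PyTorch ImageNet example, looks roughly like this (the base_lr default here is a sketch matching the optimizer setting above; my actual helper may differ):

# sketch of a typical step-decay helper, modeled on the official
# PyTorch ImageNet example; not necessarily my exact implementation
def adjust_learning_rate(optimizer, epoch, base_lr=0.001):
    """Decay the learning rate by 10x every 30 epochs."""
    lr = base_lr * (0.1 ** (epoch // 30))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr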
Thank you for reading my question! I would sincerely appreciate any help or inspiration.
Thanks~