Sanity check for finetuning inception_v3

I am trying to finetune an inception_v3 model and I notice that training is quite unstable. I want to check whether my setup is correct.

Preprocessing
All the images used to train the models in the torchvision model zoo are normalized with:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

But because inception_v3's weights were ported from TensorFlow, the normalized images need to undergo another transformation (this is what transform_input=True does inside the model):

x = x.clone()
x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
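
For completeness, a minimal sketch of what the preprocessing before the model could look like under these assumptions (the 299x299 resize matches the input size inception_v3 was trained on; only the standard ImageNet normalization is applied here, since the TensorFlow-style rescaling is handled inside the model by transform_input=True):

from torchvision import transforms

# sketch: resize to inception_v3's 299x299 input size and apply the standard
# ImageNet normalization; the extra rescaling happens inside the model
preprocess = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])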

Model changes
The last layer of the model and the auxiliary classifier are replaced with new fully connected layers that match my output vector.

import torch.nn as nn
import torch.utils.model_zoo as model_zoo
from torchvision.models.inception import Inception3, InceptionAux, model_urls


class Custom(Inception3):
    def __init__(self, num_classes=28, aux_logits=False, transform_input=True):
        # build the stock architecture and load the pretrained weights
        Inception3.__init__(self, 1000, aux_logits, transform_input)
        self.load_state_dict(model_zoo.load_url(model_urls['inception_v3_google']))

        # swap the auxiliary classifier and the final layer for my output size
        if aux_logits:
            self.AuxLogits = InceptionAux(768, num_classes)

        self.fc = nn.Linear(2048, num_classes)
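
As a quick sanity check, a sketch of how the class can be exercised; note that aux_logits=True is my assumption here, because the loss below uses two outputs, which the model only returns in training mode with the auxiliary classifier enabled:

import torch

# sketch only: with aux_logits=True the model returns (final, aux) in train mode
model = Custom(num_classes=28, aux_logits=True, transform_input=True)
model.train()
y_pred_end, y_pred_middle = model(torch.randn(2, 3, 299, 299))
print(y_pred_end.shape, y_pred_middle.shape)  # expected: torch.Size([2, 28]) twice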

Loss
During training, the losses of the auxiliary classifier and the final classifier are summed into one loss:

y_pred_end, y_pred_middle = model(x)

cw = class_weigths_idx(idx, Y)
loss_1 = F.binary_cross_entropy(y_pred_end, y, cw)
loss_2 = F.binary_cross_entropy(y_pred_middle, y, cw)
loss = loss_1 + loss_2
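
Since the replaced fc layer is a plain nn.Linear and F.binary_cross_entropy expects probabilities, the predictions either need a sigmoid first or the logits variant of BCE can be used instead. A minimal sketch of the latter (class_weigths_idx is my own helper; everything else is unchanged):

import torch.nn.functional as F

# sketch: same combined loss, but computed directly on the raw fc outputs
# with the numerically more stable logits variant of BCE
y_pred_end, y_pred_middle = model(x)

cw = class_weigths_idx(idx, Y)
loss_1 = F.binary_cross_entropy_with_logits(y_pred_end, y, weight=cw)
loss_2 = F.binary_cross_entropy_with_logits(y_pred_middle, y, weight=cw)
loss = loss_1 + loss_2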

Learning rate
I’ve set a small learning rate for the pre-trained layers and a higher learning rate for the new fully connected layer.

lr1 = 1e-7
lr2 = 1e-5
lr3 = 1e-3

optimizer = torch.optim.Adam([
        {"params": model.Conv2d_1a_3x3.parameters(), "lr":lr1},
        {"params": model.Conv2d_2a_3x3.parameters(), "lr":lr1},
        {"params": model.Conv2d_2b_3x3.parameters(), "lr":lr1},
        {"params": model.Conv2d_3b_1x1.parameters(), "lr":lr1},
        {"params": model.Conv2d_4a_3x3.parameters(), "lr":lr1},
        {"params": model.Mixed_5b.parameters(), "lr":lr2},
        {"params": model.Mixed_5c.parameters(), "lr":lr2},
        {"params": model.Mixed_5d.parameters(), "lr":lr2},
        {"params": model.Mixed_6a.parameters(), "lr":lr2},
        {"params": model.Mixed_6b.parameters(), "lr":lr2},
        {"params": model.Mixed_6c.parameters(), "lr":lr2},
        {"params": model.Mixed_6d.parameters(), "lr":lr2},
        {"params": model.Mixed_6e.parameters(), "lr":lr2},
        {"params": model.Mixed_7a.parameters(), "lr":lr2},
        {"params": model.Mixed_7b.parameters(), "lr":lr2},
        {"params": model.Mixed_7c.parameters(), "lr":lr2},
        {"params": model.fc.parameters(), "lr":lr3},
    ])
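
Because this per-module list can easily get out of sync with the model (for example, the AuxLogits parameters are not in any group above), a small sketch of a check I could run to see which trainable parameters the optimizer does not cover:

# sketch: list parameters that are missing from the optimizer's param groups
covered = {id(p) for group in optimizer.param_groups for p in group["params"]}
missing = [name for name, p in model.named_parameters()
           if p.requires_grad and id(p) not in covered]
print(missing)  # e.g. AuxLogits.* if the auxiliary classifier is enabled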

Is this a proper configuration for finetuning the Inception network? I notice that my training is very unstable and the loss seems to increase instead of decrease, even though the learning rates seem quite low.