Weird behaviors when setting pre-trained weights on CNN networks

sjyk112 · September 15, 2021, 7:21am

Hi.

I’m training VGG16 on the Mini-ImageNet dataset, starting from pre-trained ImageNet parameters.
First, I used the load_state_dict() method to set the pre-trained parameters in the network.
With this approach, accuracy was 55% in the 1st epoch and kept improving as training continued.
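
For reference, here is a minimal sketch of the load_state_dict() approach. It uses a tiny stand-in network rather than VGG16, and the names make_net, pretrained and model are only illustrative assumptions, not my actual training code. As far as I understand, load_state_dict() by default copies the values into the existing parameters and buffers rather than replacing them.

```python
import torch
import torch.nn as nn

def make_net():
    # Hypothetical small stand-in for the VGG16 feature extractor.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(),
    )

torch.manual_seed(0)
pretrained = make_net()  # plays the role of the pre-trained model
model = make_net()       # freshly initialized network

weight_before = model[0].weight

# Copy every parameter and buffer from the source state dict into `model`.
result = model.load_state_dict(pretrained.state_dict())

# The values now match, and the Parameter object itself was kept.
values_match = torch.equal(model[0].weight, pretrained[0].weight)
same_object = model[0].weight is weight_before
```

The returned `result` lists `missing_keys` and `unexpected_keys`, which is a quick way to confirm that every entry was actually loaded.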

Second, I set the pre-trained parameters by assigning weight tensors directly to each module’s weight and bias attributes, as in the code below. The ‘tensor’ in the code block is just a placeholder for the corresponding pre-trained tensor.

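# 'tensor' is a placeholder for the matching pre-trained tensor of each attribute
for m in model.features.modules():
    if isinstance(m, nn.Conv2d):
        with torch.no_grad():
            # replaces the existing Parameter objects with new ones
            m.weight = nn.Parameter(tensor)
            m.bias = nn.Parameter(tensor)
    elif isinstance(m, nn.BatchNorm2d):
        with torch.no_grad():
            m.weight = nn.Parameter(tensor)
            m.bias = nn.Parameter(tensor)
            # running statistics are buffers, so plain tensors are assigned
            m.running_mean = tensor
            m.running_var = tensor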

With this approach, accuracy was 85% in the 1st epoch.
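
One thing that may be worth checking with the direct-assignment approach: assigning a new nn.Parameter replaces the parameter object on the module, so an optimizer that was created beforehand would still hold the old object. Whether that matters here depends on where the optimizer is created, which is not shown above. The sketch below is only an illustration on a single hypothetical layer, with SGD chosen arbitrarily.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3)

# Assumption for illustration: the optimizer is built BEFORE the reassignment.
optimizer = torch.optim.SGD(conv.parameters(), lr=0.1)

old_weight = conv.weight
with torch.no_grad():
    # Direct assignment: the module now owns a brand-new Parameter object.
    conv.weight = nn.Parameter(torch.zeros_like(old_weight))

# Collect the parameter objects the optimizer actually updates.
tracked = [p for group in optimizer.param_groups for p in group["params"]]
new_is_tracked = any(p is conv.weight for p in tracked)
old_is_tracked = any(p is old_weight for p in tracked)
```

If `new_is_tracked` is False in a real training script, the reassigned weights would not be updated by that optimizer.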

After setting the pre-trained parameters, I checked that all the assigned weights and biases are identical between the two methods.
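
A check of this kind could look roughly like the sketch below. The helper name state_dicts_match is hypothetical, and the two small BatchNorm layers only stand in for the two differently initialized networks. Comparing full state dicts also covers the BatchNorm buffers (running_mean, running_var, num_batches_tracked), not just weights and biases.

```python
import torch
import torch.nn as nn

def state_dicts_match(model_a, model_b):
    """Return True if both models have the same keys and identical tensors."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    if sd_a.keys() != sd_b.keys():
        return False
    return all(torch.equal(sd_a[k], sd_b[k]) for k in sd_a)

# Stand-ins for the networks initialized by the two methods.
net_a = nn.BatchNorm2d(4)
net_b = nn.BatchNorm2d(4)
net_b.load_state_dict(net_a.state_dict())
same_after_load = state_dicts_match(net_a, net_b)

# Changing only a buffer is enough to make the comparison fail.
net_b.running_mean = torch.ones(4)
same_after_change = state_dicts_match(net_a, net_b)
```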

I don’t understand this behavior, and I wonder if there is any difference between the two methods.

Thanks for your attention.