I’m training VGG16 on the Mini-ImageNet dataset, starting from pre-trained ImageNet parameters.
First, I used the load_state_dict() method to set the pre-trained parameters in the network.
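For reference, the first method has roughly the following shape. This is only a minimal sketch: `make_features` is a hypothetical tiny stand-in for the VGG16 feature extractor, and `pretrained` plays the role of the network holding the ImageNet weights.

```python
import torch
import torch.nn as nn


def make_features():
    # Hypothetical stand-in for model.features (not the real VGG16 layers).
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(inplace=True),
    )


torch.manual_seed(0)
pretrained = make_features()  # assumed source of the pre-trained values
model = make_features()       # network to be initialised

# Copies parameters AND buffers (running_mean, running_var,
# num_batches_tracked) in place into the existing tensors.
# With the default strict=True, a missing or unexpected key raises an error.
result = model.load_state_dict(pretrained.state_dict())
print(result.missing_keys, result.unexpected_keys)
```

Note that load_state_dict() copies values into the existing parameter objects rather than replacing them, so an optimizer created earlier still refers to the same tensors.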
With this method, accuracy was 55% after the 1st epoch, and the network continued to train from there.
Second, I set the pre-trained parameters by assigning the weight tensors directly to each module’s weight and bias attributes, as in the code below. The ‘tensor’ in the code block is just a placeholder.
    for m in model.features.modules():
        if isinstance(m, nn.Conv2d):
            with torch.no_grad():
                m.weight = nn.Parameter(tensor)
                m.bias = nn.Parameter(tensor)
        elif isinstance(m, nn.BatchNorm2d):
            with torch.no_grad():
                m.weight = nn.Parameter(tensor)
                m.bias = nn.Parameter(tensor)
                m.running_mean = tensor
                m.running_var = tensor
With this method, accuracy was 85% after the 1st epoch.
I checked that, after setting the pre-trained parameters, all the assigned weights and biases are identical between the two methods.
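One way such a comparison could be made is sketched below. It compares the full state_dict(), which covers the BatchNorm buffers (running_mean, running_var, num_batches_tracked) as well as the weights and biases. The names `make_features` and `diff_state` are hypothetical, and the small Sequential model is only a stand-in for VGG16.

```python
import torch
import torch.nn as nn


def make_features():
    # Hypothetical stand-in for model.features (not the real VGG16 layers).
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(inplace=True),
    )


def diff_state(a, b):
    """Return the state_dict keys whose tensors differ between a and b."""
    sa, sb = a.state_dict(), b.state_dict()
    mismatched = sorted(set(sa) ^ set(sb))  # keys present in only one model
    for k in sorted(set(sa) & set(sb)):
        if sa[k].shape != sb[k].shape or not torch.equal(sa[k], sb[k]):
            mismatched.append(k)
    return mismatched


torch.manual_seed(0)
pretrained = make_features()

# Method 1: load_state_dict()
model_a = make_features()
model_a.load_state_dict(pretrained.state_dict())

# Method 2: direct assignment, module by module
model_b = make_features()
for src, dst in zip(pretrained.modules(), model_b.modules()):
    if isinstance(dst, (nn.Conv2d, nn.BatchNorm2d)):
        dst.weight = nn.Parameter(src.weight.detach().clone())
        dst.bias = nn.Parameter(src.bias.detach().clone())
    if isinstance(dst, nn.BatchNorm2d):
        dst.running_mean = src.running_mean.clone()
        dst.running_var = src.running_var.clone()

print(diff_state(model_a, model_b))  # an empty list means the states match
```

If this list is empty for the real networks too, the remaining differences would have to come from outside the state_dict, for example an optimizer that was built before the parameters were replaced, or a different train()/eval() mode.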
I don’t understand this behavior, and I wonder if there is any difference between the two methods.
Thanks for your attention.