Hi.
I’m training VGG16 on the Mini-ImageNet dataset, starting from pre-trained ImageNet parameters.
First, I used the load_state_dict() method to load the pre-trained parameters into the network.
With this approach, accuracy was 55% after the 1st epoch, and the network kept training from there.
Second, I set the pre-trained parameters by assigning weight tensors directly to each module’s weight and bias attributes, as in the code below. The ‘tensor’ in the code is just a placeholder.
for m in model.features.modules():
    if isinstance(m, nn.Conv2d):
        with torch.no_grad():
            m.weight = nn.Parameter(tensor)
            m.bias = nn.Parameter(tensor)
    elif isinstance(m, nn.BatchNorm2d):
        with torch.no_grad():
            m.weight = nn.Parameter(tensor)
            m.bias = nn.Parameter(tensor)
            m.running_mean = tensor
            m.running_var = tensor
With this approach, accuracy was 85% after the 1st epoch.
After setting the pre-trained parameters, I checked that all the assigned weights and biases are identical between the two methods.
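A comparison of this kind could be sketched roughly as follows. This is only a minimal illustration on a tiny stand-in network, not the actual VGG16 check: make_net and mismatched_keys are hypothetical helpers, and the "pre-trained" source model is simulated with one forward pass so that its BatchNorm statistics differ from the defaults.

```python
import torch
import torch.nn as nn

def make_net():
    # Hypothetical tiny stand-in for the VGG16-BN feature extractor.
    return nn.Sequential(nn.Conv2d(3, 4, 3), nn.BatchNorm2d(4))

def mismatched_keys(model_a, model_b):
    # Compare every state_dict entry (parameters AND buffers) and return
    # the sorted names that are missing on one side or not exactly equal.
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    keys = sorted(set(sd_a) | set(sd_b))
    return [k for k in keys
            if k not in sd_a or k not in sd_b
            or not torch.equal(sd_a[k], sd_b[k])]

torch.manual_seed(0)
source = make_net()                    # plays the role of the pre-trained model
source(torch.randn(2, 3, 8, 8))        # one training-mode pass updates BN buffers

# Method 1: load_state_dict()
by_load = make_net()
by_load.load_state_dict(source.state_dict())

# Method 2: direct attribute assignment, mirroring the loop above
by_assign = make_net()
for m_src, m_dst in zip(source.modules(), by_assign.modules()):
    if isinstance(m_dst, (nn.Conv2d, nn.BatchNorm2d)):
        m_dst.weight = nn.Parameter(m_src.weight.detach().clone())
        m_dst.bias = nn.Parameter(m_src.bias.detach().clone())
    if isinstance(m_dst, nn.BatchNorm2d):
        m_dst.running_mean = m_src.running_mean.clone()
        m_dst.running_var = m_src.running_var.clone()

print(mismatched_keys(by_load, by_assign))
```

Comparing the full state_dict() rather than only weight and bias also covers BatchNorm buffers such as running_mean, running_var and num_batches_tracked, which a check of the weight and bias attributes alone would not look at.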
I don’t understand why the 1st-epoch accuracy differs this much (55% vs. 85%). Is there any difference between the two methods?
Thanks for your attention.