I’m loading ResNet-50 from torch.hub, cutting off the final layers (the last two nn modules), keeping the base, and building a custom head on top that is exactly the same as the original ResNet head (I recreated the identical head specifically to debug a size mismatch error).
But I am still hitting the size mismatch error, and I don’t understand how, because when I print out the layers they are exactly the same.
I’ve attached code to reproduce the result:
import torch
import torch.nn as nn

x = torch.randn(2, 3, 50, 50)
model = torch.hub.load('pytorch/vision:v0.5.0', 'resnet50', pretrained=True)
print(model)

# feature extraction layers (everything except avgpool and fc)
base_model_list = list(model.children())[:-2]
# add back the same final layers as the loaded resnet
layers = base_model_list + list(model.children())[-2:]
new_model = nn.Sequential(*layers)
print(new_model)

model(x)      # executes with no error
new_model(x)  # RuntimeError: size mismatch, m1: [4096 x 1], m2: [2048 x 1000]
Anyone know why? Is there a hidden operation that gets dropped when I split off the base? The layers look identical when printed.
I read the above comment after seeing a lot of people on the web suggest doing transfer learning with the nn.Sequential(*model.children()) approach. It was still unclear to me whether that approach is correct when you only want to fine-tune the last layer. After seeing https://github.com/pytorch/pytorch/issues/15129 it made sense that this is not the correct approach: you cannot express ResNet through its .children() alone. I just wanted to call that out explicitly to save others some time. I ended up simply mutating the last layer with resnet.fc = nn.Linear(num_ftrs, my_num_classes).
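For reference, a minimal sketch of that last-layer replacement, using the same torchvision ResNet-50 as above; my_num_classes is a hypothetical class count, and freezing the backbone is optional, shown only because the goal here was fine-tuning just the last layer:

import torch
import torch.nn as nn

model = torch.hub.load('pytorch/vision:v0.5.0', 'resnet50', pretrained=True)

# Optionally freeze the pretrained backbone so only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer in place. model.forward() still
# applies avgpool and torch.flatten(x, 1) before fc, so the shapes line up.
num_ftrs = model.fc.in_features   # 2048 for ResNet-50
my_num_classes = 10               # hypothetical number of target classes
model.fc = nn.Linear(num_ftrs, my_num_classes)

x = torch.randn(2, 3, 50, 50)
print(model(x).shape)             # torch.Size([2, 10])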