I have a model whose last layers look like this:

```python
out = self.relu(self.bn1(out))
out = F.avg_pool2d(out, 8)
out = out.view(-1, self.nChannels)
out = self.fc(out)
return out
```
The last layer's output has shape `torch.Size([18, 2048])`.
I have trained the model and have a `.pth` checkpoint. For a downstream task, I would like to do fine-tuning/transfer learning by adding a linear layer on top and then training.
What is the right way to do this?
Do I initialize the whole network first and then load the `state_dict`, or do I just load the entire model from the path?
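To clarify the two options I mean, here is a toy sketch with a stand-in module (the `nn.Linear` is just a placeholder for my `Net`; the file names are arbitrary):

```python
import torch
import torch.nn as nn

# Toy stand-in for my Net; in reality this is the full architecture
net = nn.Linear(4, 2)

# Option A: save/load only the state_dict, re-creating the architecture first
torch.save(net.state_dict(), "weights.pth")
net_a = nn.Linear(4, 2)                      # initialize the network first
net_a.load_state_dict(torch.load("weights.pth"))

# Option B: save/load the entire module object (pickles the class itself)
torch.save(net, "model.pth")
net_b = torch.load("model.pth", weights_only=False)  # newer PyTorch refuses full pickles by default

# Both recover the same weights
print(torch.equal(net_a.weight, net_b.weight))
```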
Essentially, I would like to have something like this for the transfer learning/downstream training:
```python
import torch
import torch.nn as nn

model = Net(args1, args2, args3)
model.load_state_dict(torch.load(PATH))  # load_state_dict works in place; don't reassign model
model.eval()

class final_network(nn.Module):
    def __init__(self, use_face=False, num_glimpses=1):
        super(final_network, self).__init__()
        self.pre_network = model
        self.final_task = nn.Sequential(
            nn.Linear(2048, 468),
        )

    def forward(self, x):
        feature = self.pre_network(x)
        feature = feature.view(feature.size(0), -1)
        out = self.final_task(feature)
        return out
```
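As a sanity check of the wiring, I mocked the pretrained part with a small stand-in backbone (the input size 32 is a placeholder; only the 2048-dim feature matches my real model):

```python
import torch
import torch.nn as nn

class PreNetwork(nn.Module):
    """Stand-in for my pretrained Net: anything that outputs [batch, 2048]."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(32, 2048)  # placeholder; real Net ends in fc -> 2048

    def forward(self, x):
        return self.fc(x)

class FinalNetwork(nn.Module):
    def __init__(self, pre_network):
        super().__init__()
        self.pre_network = pre_network
        self.final_task = nn.Linear(2048, 468)

    def forward(self, x):
        feature = self.pre_network(x)
        feature = feature.view(feature.size(0), -1)
        return self.final_task(feature)

model = FinalNetwork(PreNetwork())
out = model(torch.randn(18, 32))
print(out.shape)  # torch.Size([18, 468])
```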
Or do I save/load the entire model (checkpoint), as shown in Saving and loading models for inference in PyTorch — PyTorch Tutorials 1.12.1+cu102 documentation? I am a little confused, as all the info I found online is about popular network architectures.