I have a model whose last layers look like this:
...
out = self.relu(self.bn1(out))
out = F.avg_pool2d(out, 8)
out = out.view(-1, self.nChannels)
out = self.fc(out)
return out
The output of this last layer has torch.Size([18, 2048]) (batch size 18).
I have trained the model and saved a .pth checkpoint. I would now like to do fine-tuning/transfer learning for a downstream task by adding a Linear layer on top and then training. What is the right way to do this?
Do I initialize the whole network first and then load the model_state_dict, or do I just load the entire model from the path?
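To make the question concrete, here is a minimal, self-contained sketch of the state_dict approach as I understand it; the tiny Net below is just a stand-in for my real architecture (whose constructor takes args1, args2, args3), and the checkpoint is created on the fly only so the snippet runs:

```python
import os
import tempfile

import torch
import torch.nn as nn


# Tiny stand-in for my real Net (the real constructor takes args1, args2, args3).
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2048)

    def forward(self, x):
        return self.fc(x)


# Pretend this is the checkpoint from my earlier training run.
PATH = os.path.join(tempfile.mkdtemp(), "pretrained.pth")
trained = Net()
torch.save(trained.state_dict(), PATH)

# Re-create the architecture first, then load the weights in place.
# Note: load_state_dict modifies the model in place; its return value
# is a result object, not the model itself.
model = Net()
model.load_state_dict(torch.load(PATH))
model.eval()

# The loaded weights now match the trained ones.
print(torch.equal(model.fc.weight, trained.fc.weight))  # True
```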
Essentially, I would like to have something like this for the transfer learning/downstream training:
import torch.nn as nn
model = Net(args1, args2, args3)
model.load_state_dict(torch.load(PATH))  # load_state_dict works in place; don't reassign model
model.eval()
class final_network(nn.Module):
    def __init__(self, use_face=False, num_glimpses=1):
        super(final_network, self).__init__()
        self.pre_network = model
        self.final_task = nn.Sequential(
            nn.Linear(2048, 468),
        )

    def forward(self, x):
        feature = self.pre_network(x)
        feature = feature.view(feature.size(0), -1)
        out = self.final_task(feature)
        return out
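As a sanity check on the shapes, this is how I would wire the wrapper together end to end. DummyBackbone here is only a placeholder that emits [batch, 2048] features the way my real pretrained model does, I pass the backbone in explicitly rather than reading a global, and the freezing loop is only there because I assume I may want to train just the new head:

```python
import torch
import torch.nn as nn


# Placeholder backbone: it only needs to emit [batch, 2048] features,
# like the fc output of my real pretrained model.
class DummyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(32, 2048)

    def forward(self, x):
        return self.fc(x)


class final_network(nn.Module):
    def __init__(self, pre_network):
        super(final_network, self).__init__()
        self.pre_network = pre_network
        self.final_task = nn.Sequential(
            nn.Linear(2048, 468),
        )

    def forward(self, x):
        feature = self.pre_network(x)
        feature = feature.view(feature.size(0), -1)
        return self.final_task(feature)


net = final_network(DummyBackbone())

# Optionally freeze the pretrained part so only the new Linear head trains.
for p in net.pre_network.parameters():
    p.requires_grad = False

out = net(torch.randn(18, 32))
print(out.shape)  # torch.Size([18, 468])
```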
Or do I save and load the entire model (checkpoint), as described in "Saving and loading models for inference in PyTorch" (PyTorch Tutorials 1.12.1+cu102 documentation)? I am a little confused, as most of the information I found online is for popular network architectures.
Many thanks.