I created a fine-tuned model based on resnet18:

```python
import torch
import torch.nn as nn

resnet18 = torch.hub.load('pytorch/vision:v0.4.2', 'resnet18', pretrained=True)
resnet18_base = nn.Sequential(*list(resnet18.children())[:-2])

head_layers = [some layers.. ]
resnet_head = nn.Sequential(*head_layers)

model = nn.Sequential(resnet18_base, resnet_head)
```
I trained it and it works well; now I need to save it and load it elsewhere.
Should I use the code above to rebuild the model object (and load a state dict into it), rather than a class with an `__init__` and a `forward`?
I'm wondering about best practices in cases such as mine.
I would say it depends on your future use cases.
If you think this code snippet with `nn.Sequential` blocks will work in your scripts, just go for it.
On the other hand, if you want to change something in the forward pass, e.g. add some additional residual connections, or change or skip some layers, I would rather implement an `nn.Module` subclass to get the full flexibility and hackability.
If I choose to implement a module, how can I reuse the loaded weights from the pretrained ResNet? Is there a way to use the weights without the parameter names matching?
I would create the custom module, load the resnet inside it, and write the `forward` method as you wish.
Once this module works as you want it to, save its `state_dict` and reload that one.
This approach will make it easier to use your custom module without having to map the original resnet parameter names to your custom layers.