Fine Tuning a model in PyTorch


I’ve got a small question regarding fine-tuning a model, i.e. how can I download a pre-trained model like VGG and then use it as the base for any new layers built on top of it? In Caffe there was a ‘model zoo’; does such a thing exist in PyTorch?

If not, how do we go about it?

This section will help you (yes, we have a big model zoo).

Also, this thread

Hey thanks,

I went through that thread and got a pretty good idea of how to fine-tune things. However, is there a way to manually remove a layer from the pre-trained network? From what I’ve learnt, even if I set `requires_grad = False` on the layers I don’t want in my graph, they will still have the pre-trained weights in them. So essentially I want to do something like this (in pseudo-code):

vgg = models.vgg16(pretrained=True)
# remove the last two pooling layers
vgg.layer1 = nn.Conv2d(…, dilation=2)
Any advice?

Hey! Not sure if you’re still stuck on this, but I just wrote a short tutorial on fine-tuning ResNet in PyTorch. Here -

Hope this helps!

That’s a nice tutorial. However, fine-tuning in PyTorch is really about understanding what the computational graph does: to fine-tune, gradients must not be backpropagated into the pre-trained weights. So if you’re fine-tuning off, say, VGG, you can do something like this:

from itertools import chain

import torch.nn as nn
import torch.optim
from torchvision import models

# net is your model; in_feat_1, out_feat_1, out_feat_2 and lr are placeholders.
vgg_base = models.vgg16(pretrained=True)
net.feat = nn.Sequential(*vgg_base.features.children())  # reuse pre-trained features
net.clf = nn.Linear(in_feat_1, out_feat_1)               # new, randomly initialised layers
net.op = nn.Linear(out_feat_1, out_feat_2)

# freeze the pre-trained features so no gradients flow into them
for param in net.feat.parameters():
    param.requires_grad = False

# optimise only the parameters of the new layers
optimizer = torch.optim.SGD(chain(net.clf.parameters(), net.op.parameters()),
                            lr=lr, momentum=0.9)

I hope adding this to your tutorial helps.

Usually, how many layers to freeze is a matter of choice in fine-tuning. Most people freeze most of the layers because training all of them slows the system down; ideally, no layers would be frozen. Which is why I left it like that on purpose!
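For anyone deciding where to draw that line, the pattern generalises: freeze the first N layers and hand the optimiser only what’s left. A sketch with a toy backbone (the split point and layer sizes are made up):

```python
import torch
import torch.nn as nn

# Toy stand-in for a pre-trained backbone plus a fresh classifier head.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(32, 10)

# Freeze only the first conv; everything after it gets fine-tuned.
for param in backbone[0].parameters():
    param.requires_grad = False

# Give the optimiser only the parameters that still require gradients.
trainable = [p for p in list(backbone.parameters()) + list(head.parameters())
             if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)

print(len(trainable))  # 4: second conv's weight+bias, head's weight+bias
```

Moving the freeze point later (or removing it) trades compute for flexibility, which is exactly the choice described above.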