Fine-tuning torchvision models

Hi everyone,

I am using the pretrained models from torchvision, and as I understand it there are 3 cases: feature extraction, fine-tuning a pretrained model, and training the model from scratch.

First case, to extract features from images:

  • set model=resnet18(pretrained=True)
  • set param.requires_grad = False
  • and optimise only the Fully Connected layer, the last one.

Second case, to fine-tune a pretrained model:

  • set model=resnet18(pretrained=True)
  • set param.requires_grad = True
  • and optimise Convolutional, Pooling and Fully Connected layers

Third case, to train the model from scratch:

  • set model=resnet18(pretrained=False)
  • set param.requires_grad = True
  • and optimise Convolutional, Pooling and Fully Connected layers

From my understanding, when you optimise the parameters of a CNN, you do it for the Convolutional, Pooling and Fully Connected layers. I cannot see the difference between the 2nd and 3rd cases. To me, in both cases you optimise all layers.

Thank you for your time.

The number of layers you train when fine-tuning depends on the amount of data you have and on how similar the new dataset is to the one used for pretraining (ImageNet in the case of torchvision).
Have a look at the fine-tuning notes of CS231n for a good overview.

Thank you for your response.

I read the article about transfer learning and understood how it works. My question was how case 2 and case 3 differ.

Maybe I expressed my question in an unclear way. In other words, how is it possible to use a pretrained model and still optimise the parameters of all layers by setting “param.requires_grad = True”? If you optimise the parameters of all layers, you are training your model from scratch, so you would set “pretrained=False”.

You are fine-tuning all parameters using the pretrained weights as the initial parameters.
The difference is that you won’t initialize them randomly using a torch.nn.init method, but start from the already pretrained parameters.
If you are using a similar dataset to the one used for pretraining, the model parameters will be already quite good. Changing all parameters “a bit” will fine-tune the model so that it hopefully works better for your new dataset.


As always, a great answer! Now it is clear.

Thanks for your time and have a great day!