Faster R-CNN transfer learning strategy

Hi everyone,

I’m trying to train a torchvision Faster R-CNN object detection model using transfer learning.
My dataset consists of tree species in overhead imagery, so it is very different from the COCO dataset the pretrained models were trained on.

I am looking for recommendations/advice on the transfer learning strategy to adopt.
For my first attempts (on a relatively small dataset) I tried freezing the backbone and training only the head + RPN, but the results are not very good.

Should I instead make all layers trainable from the start, or train different components separately (e.g., first only the head, then only the RPN, and finally head + RPN + backbone with FPN)?
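
To make the question concrete, here is a minimal sketch of the freezing strategy I described (assuming torchvision's fasterrcnn_resnet50_fpn; num_classes and the optimizer settings are placeholders, not my exact values):

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a Faster R-CNN pretrained on COCO (older torchvision uses pretrained=True)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box predictor for my own classes (num_classes includes background)
num_classes = 5  # placeholder
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Freeze the backbone (ResNet-50 + FPN); only the RPN and ROI heads train
for p in model.backbone.parameters():
    p.requires_grad = False

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.005, momentum=0.9, weight_decay=5e-4)
```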

Thanks for any advice,
Loïc

Generally, you can retrain more parameters the more data you have, especially if it comes from another domain.
If you are dealing with a small dataset, training too many parameters might lead to overfitting.
Unfortunately, my best advice is that you would have to try it out and see what works best for your use case.
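
As a rough sketch (assuming torchvision's fasterrcnn_resnet50_fpn; the layer-name prefixes match its BackboneWithFPN module, and how many stages to unfreeze is a placeholder to tune), "retraining more parameters" could look like unfreezing the last backbone stage in addition to the heads:

```python
import torchvision

# Option 1: the constructor argument (0 = freeze all ResNet stages, 5 = train all)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights="DEFAULT", trainable_backbone_layers=1
)

# Option 2: freeze manually, e.g. keep only the last ResNet stage + FPN trainable
for name, p in model.backbone.named_parameters():
    p.requires_grad = name.startswith(("body.layer4", "fpn"))
```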

@ptrblck
How do I retrain more parameters? This might be a simple thing to do, but please forgive me, as I am new to this field.
In fact, I have the same problem. I have a very small dataset (100 recurrence plots) and I used this nice tutorial (https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html) to do transfer learning. Using mobilenet (for example), 100 epochs, and a dropout of 0.t, I get 100% training accuracy but only 70% validation accuracy.

Any other suggestions to avoid overfitting?
Thank you

If you want to train more parameters, you should freeze fewer parameters of the complete model.
I assume you are currently freezing all parameters except the last classifier?
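
A minimal sketch of what unfreezing more could look like (assuming torchvision's mobilenet_v2, since you mentioned mobilenet; how many feature blocks to unfreeze is a placeholder):

```python
import torchvision

model = torchvision.models.mobilenet_v2(weights="DEFAULT")

# Freeze everything first
for p in model.parameters():
    p.requires_grad = False

# Then unfreeze the classifier plus the last few feature blocks
for p in model.classifier.parameters():
    p.requires_grad = True
for p in model.features[-3:].parameters():
    p.requires_grad = True
```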

However, since you are dealing with a small dataset, training more parameters might even result in more overfitting.
You could try applying more aggressive data augmentation or, ideally, collect more samples.
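
A sketch of a more aggressive augmentation pipeline (the specific transforms and magnitudes are placeholders; for recurrence plots you would want to check which transformations actually preserve the label):

```python
import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomResizedCrop(224, scale=(0.6, 1.0)),  # random crops at varying scales
    T.RandomHorizontalFlip(),
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    T.RandomRotation(15),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                std=[0.229, 0.224, 0.225]),
])
```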