Training a model good enough for transfer learning

Hi everyone,

This is not related to PyTorch in particular, but since there are lots of experts and practitioners here, I would like to get your opinion on it.

I am working on transfer learning for a robotic application. Right now, we want to train a deep learning model on our current dataset (the source task), and then use the pre-trained model to bootstrap the learning of a new task (the target task). I think everyone is familiar with the 2nd part (as when using a pretrained VGG or Inception for transfer learning). The 1st part is what worries me.

The thing is:

  • I can’t find clear guidelines on how to train the model in the 1st part (the source model that has to be good enough for transfer learning).
  • How do I know how much data is needed to get a model good enough to be used for transfer learning?

I already have some data for the source task. If we decide to go for deep learning, we will have to collect more. While this is feasible, it is quite expensive in terms of time and money. I need a way to estimate how much data we need (if the number is astronomical, maybe deep learning is not the right choice), and whether this whole idea is realistic.
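One common way to get a rough estimate is a learning curve: train on growing subsets of the data already collected, record the validation error at each size, fit a curve, and extrapolate. The sketch below assumes the error follows a power law, error ≈ a · n^(-b), which is only an approximation and may not hold for this task. The function names and the numbers are hypothetical, and the synthetic errors stand in for real measurements.

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit error ~ a * n**(-b) by least squares in log-log space."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
    return float(np.exp(intercept)), float(-slope)

def samples_needed(a, b, target_error):
    """Invert error = a * n**(-b) to get the n that reaches target_error."""
    return (a / target_error) ** (1.0 / b)

# Hypothetical subset sizes; in practice each error would come from
# training on that many samples and evaluating on a held-out set.
sizes = np.array([500.0, 1000.0, 2000.0, 4000.0])
errors = 2.0 * sizes ** -0.5  # synthetic placeholder, not real results

a, b = fit_power_law(sizes, errors)
n_needed = samples_needed(a, b, target_error=0.01)
print(f"a={a:.3f}, b={b:.3f}, estimated samples needed: {n_needed:.0f}")
```

An extrapolation like this only describes the source task. It says nothing direct about how well the learned features will transfer, so it is best read as a feasibility check on the data collection budget.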

I guess the challenge here is this: every ‘transferable’ model I know of starts from a massive dataset that already exists (top-down), whereas I am starting bottom-up (how much data do we need to do the job, and is the job feasible with deep learning at all).

I would very much appreciate your input on this.

Best Regards,
Omar