Transfer Learning is really slow

Hi guys,
I am working on traffic sign classification with the German Traffic Sign Dataset (~38k training images and ~12k test images).
I implemented a model from scratch and I want to compare it to state-of-the-art models.
I followed the transfer learning tutorial from the PyTorch tutorials and tried the pre-trained models.
I am freezing all layers except the last one.
However, the models (ResNet, DenseNet, Inception) are really slow during training (one epoch takes more than 30 minutes on GPU).
From my understanding, training these models should be fast (I am working on a Google Colab GPU).
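Roughly, this is my setup (a minimal sketch, using resnet18 just as an example):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pre-trained backbone (resnet18 is used here just as an example)
model = models.resnet18(pretrained=True)

# Freeze the whole backbone
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully-connected layer; GTSRB has 43 classes
model.fc = nn.Linear(model.fc.in_features, 43)  # the new layer is trainable by default

model = model.to("cuda")

# Only the parameters of the new head are given to the optimizer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
```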
Any help on how to train faster?


Well, the word “slow” is relative. If you have a very big dataset, each epoch will require more iterations (and probably, depending on the variety of the data, fewer epochs to converge), whereas if your dataset is small, you will have fewer iterations per epoch.

You can try other optimizers, learning rates, etc. to converge faster.
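For example, something like this (just a sketch; the hyperparameters and `train_one_epoch` are placeholders for your own loop):

```python
import torch

# Only the new head is trainable, so only its parameters go to the optimizer
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# Drop the learning rate by a factor of 10 every 5 epochs (values are illustrative)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

for epoch in range(20):
    train_one_epoch(model, optimizer)  # your usual training loop over the DataLoader
    scheduler.step()
```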

Ok, I see.
Because most layers are frozen and only a couple are being trained, I expected the pre-trained model to train faster than my from-scratch model.
Thanks for your answer. I will try an LR scheduler.

Even if you freeze layers, you still have to compute the full forward pass to get the input to the non-frozen layers. You save some time because you don’t compute the full backward pass (no gradients are needed for the frozen parameters), but that is the only saving.

What you can do (since everything before the head is frozen) is pre-compute the output of the last frozen layer once, since it stays the same from epoch to epoch, and then train only the head on those cached features. It can require a lot of memory or disk space, though, since there is no compression format like JPEG for feature tensors.
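Something along these lines (a rough sketch; it assumes a frozen resnet18-style backbone with its head removed and your existing `train_loader`):

```python
import torch

@torch.no_grad()
def precompute_features(backbone, loader, device="cuda"):
    """Run the frozen backbone once over the whole dataset and cache its outputs."""
    backbone.eval()
    feats, labels = [], []
    for images, targets in loader:
        out = backbone(images.to(device))    # forward pass through the frozen layers
        feats.append(out.flatten(1).cpu())   # (N, 512) pooled features for resnet18
        labels.append(targets)
    return torch.cat(feats), torch.cat(labels)

# Strip the classifier so the backbone returns pooled features
backbone = torch.nn.Sequential(*list(model.children())[:-1]).to("cuda")

train_feats, train_labels = precompute_features(backbone, train_loader)

# From now on, each epoch only trains the small head on the cached tensors
cached_train = torch.utils.data.TensorDataset(train_feats, train_labels)
```

Since the expensive part (the backbone forward pass) happens only once, every subsequent epoch is just a small linear layer over cached tensors, which is much faster.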

Have you been able to solve the issue? Since we don’t compute the backward pass for the frozen layers (only for the last one), shouldn’t it take less time compared to training all the layers?