Time per epoch same for pretrained model and training from scratch

I’m using ResNet18 for a classification task.
I tried both using the pretrained model and training from scratch, and one thing that bothers me is that the time per epoch is more or less the same in both cases. On my GPU (GTX 1060), an epoch takes roughly 19 minutes with the pretrained model and around 23 minutes when training from scratch. For the pretrained model, I just re-initialize the first convolutional layer and the FC layer.
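For reference, the relevant part of my setup looks roughly like this (a minimal sketch; the class count is a placeholder):

```python
import torch.nn as nn
from torchvision import models

# Start from the ImageNet-pretrained ResNet18.
model = models.resnet18(pretrained=True)

# Re-initialize the first convolutional layer (same shape as the original one).
model.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Re-initialize the classifier head for my number of classes.
num_classes = 10  # placeholder
model.fc = nn.Linear(model.fc.in_features, num_classes)
```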
Is this expected?

If you train all layers, it’s obviously expected, since you won’t save any computations.
However, if you freeze some layers, you can save part of the backward computation, since the gradients don’t need to flow back past the first layer that still requires them.

In your use case, it seems you are retraining the input layer, thus the backward pass has to go through the complete model up to the input layer, even if the intermediate layers do not require gradients.
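For example, if everything except a newly added classifier head is frozen, the backward pass can stop right after that head (a minimal sketch; the learning rate and class count are placeholders):

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)

# Freeze the backbone: autograd won't compute gradients for these parameters.
for param in model.parameters():
    param.requires_grad = False

# A freshly created head is trainable by default, so backward only needs to reach it.
model.fc = nn.Linear(model.fc.in_features, 10)

# Optimize only the parameters that still require gradients.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-2
)
```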


Well, yes that makes sense.
Unfortunately my images have 6 channels, so the only way I found was to replace the input layer with one that accepts 6 channels. I don’t think it would make sense to initialize it with the ResNet weights, since my images are not RGB.
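For reference, this is roughly how I replace the input layer (a sketch; the rest of the model keeps its pretrained weights):

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)

# New 6-channel input layer, randomly initialized instead of reusing the RGB weights.
model.conv1 = nn.Conv2d(6, 64, kernel_size=7, stride=2, padding=3, bias=False)
```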
Thank you.
