Getting different feature vectors from frozen layers after training

Alexey_Demyanchuk · February 10, 2021, 1:14pm

Hi, @Francisco_Yackel As far as I understand, the most probable explanation of this difference is BatchNorm layers of your backbone. Even then you make requieres_grad = False on backbone layers, forward pass through the network will update running statistics of BatchNorm layers. So, you have to apply eval() method on the BatchNorm layers as well, this will make sure running statistics will not update during training process. This thread could be off great help to you if you want more clarification on the topic.

I would also like to notice, when your fine-tune pretrained model you will probably have a bit different dataset and perhaps would like to actually update those running statistics, so BatchNorm layers could serve as intended (normalizing intermediate representations) for your data as well. Anyway, you can always try both approaches and see what works best for your task at hand.

Cheers