and thanks in advance for all the support you provide in this channel.
I have rewritten the Bottleneck block of torchvision's resnet50 using exactly the same layers I see when I print the resnet50 architecture, and I replaced each original bottleneck with mine. This resnet50 is the backbone of the Faster R-CNN I am using for object detection.
Since the original resnet50 was used pretrained, I copied the pretrained weights (including biases and running_mean/running_var wherever present) from a pretrained backbone into mine, layer by layer, in the following way:
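(For reference, a minimal sketch of what I mean by copying layer by layer. The `Orig`/`Mine` modules below are stand-ins I made up for this example, not my actual Bottleneck; the point is that when the submodule names match, a strict `load_state_dict` copies everything, including the BatchNorm buffers.)

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: Orig mimics a fragment of a torchvision Bottleneck,
# Mine is the rewritten version with the SAME submodule names.
class Orig(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(8)

class Mine(nn.Module):
    def __init__(self):
        super().__init__()
        # Same attribute names -> same state_dict keys -> strict load works.
        self.conv1 = nn.Conv2d(3, 8, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(8)

src, dst = Orig(), Mine()

# strict=True raises on any missing or unexpected key, which catches a
# rewritten block whose submodules are named differently from the original.
dst.load_state_dict(src.state_dict(), strict=True)

# Every tensor, including bn1.running_mean, bn1.running_var and
# bn1.num_batches_tracked, must now match exactly.
for k, v in src.state_dict().items():
    assert torch.equal(v, dst.state_dict()[k]), k
```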
What I expected was identical behaviour during training and similar final results in terms of recall, specificity, and F1 score, but that is not the case: training proceeds very slowly and the loss decreases much less than with the original resnet50 backbone.
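(One sanity check I can describe, in case it helps frame the question: after copying, the two backbones should produce bit-identical outputs on the same input when both are in eval mode. Again a sketch with made-up stand-in modules, not my real code.)

```python
import torch
import torch.nn as nn

# Stand-ins for the original and rewritten backbones; replace with the real
# torchvision resnet50 and the custom one after copying weights.
orig = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
mine = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
mine.load_state_dict(orig.state_dict())

# eval() so BatchNorm uses the copied running stats rather than batch stats;
# in train mode even identical weights give different outputs.
orig.eval()
mine.eval()

x = torch.randn(2, 3, 16, 16)
with torch.no_grad():
    same = torch.equal(orig(x), mine(x))
assert same  # any mismatch points to a layer that was not copied correctly
```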
Based on your experience, can you explain what is missing or wrong, or, even better, provide a working snippet of code that safely copies pretrained weights?
Thanks a lot for your attention,