Pretrained model taking longer inference time after fine-tuning

I am training a segmentation model from https://github.com/qubvel/segmentation_models.pytorch. When I first load the model with pretrained ImageNet weights, the inference time on CPU is around 1 second. After I load my fine-tuned weights, the inference time jumps to about 10 seconds.

I checked the dtypes of both models, and both are float32. So what is going on? Why are the fine-tuned weights taking much longer? Are the ImageNet weights pre-quantised or something?
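For reference, this is roughly how I'm timing things (a minimal sketch; the Unet/resnet34 combination, the input size, and the checkpoint path are placeholders, not necessarily my exact setup):

```python
import time
import torch
import segmentation_models_pytorch as smp

# Assumed architecture/encoder -- substitute whatever is actually used.
model = smp.Unet("resnet34", encoder_weights="imagenet")
model.eval()

# Optionally swap in the fine-tuned weights (hypothetical path):
# model.load_state_dict(torch.load("finetuned.pth", map_location="cpu"))

x = torch.randn(1, 3, 256, 256)  # dummy CPU input

with torch.no_grad():
    start = time.perf_counter()
    _ = model(x)
    print(f"inference took {time.perf_counter() - start:.2f}s")
```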

Could you use torch.set_flush_denormal(True) to flush the denormals and see if your performance increases again?
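Something like this, called once before running inference (a sketch; the tiny model and input are only placeholders to show where the call goes):

```python
import torch
import torch.nn as nn

# Ask the CPU to flush denormal floats to zero (supported on x86 with SSE3);
# returns True if the flag could actually be set on this machine.
supported = torch.set_flush_denormal(True)
print("flush denormal supported:", supported)

# Placeholder model/input just to illustrate the call order:
model = nn.Conv2d(3, 8, kernel_size=3).eval()
x = torch.randn(1, 3, 64, 64)
with torch.no_grad():
    y = model(x)
```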

It did become a tiny bit faster, but there is still a big gap between the two models. I manually checked the weights, and the printed values of both only go to 4 decimal places… Any other suggestions?
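Printed values are rounded, so they won't reveal denormals. Here is how I tried to count subnormal (denormal) float32 values in a state dict (a sketch; the checkpoint path and the smp.Unet call are just assumptions about my setup):

```python
import torch

def denormal_fraction(state_dict):
    """Return the fraction of float parameters that are subnormal (denormal)."""
    tiny = torch.finfo(torch.float32).tiny  # smallest *normal* float32
    total, denorm = 0, 0
    for name, t in state_dict.items():
        if not torch.is_floating_point(t):
            continue
        t = t.float().abs()
        total += t.numel()
        denorm += ((t > 0) & (t < tiny)).sum().item()
    return denorm / max(total, 1)

# Hypothetical usage comparing the two checkpoints:
# import segmentation_models_pytorch as smp
# pretrained = smp.Unet("resnet34", encoder_weights="imagenet").state_dict()
# finetuned = torch.load("finetuned.pth", map_location="cpu")
# print(denormal_fraction(pretrained), denormal_fraction(finetuned))
```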

@ptrblck please reply we are trying to deploy something :sweat_smile:

Unfortunately, I don’t know what else could slow down the CPU. :confused:
I assume that you are using exactly the same setup and just load the weights?

Yes, it's happening on Colab and on my coworker's PC as well.