I am training a segmentation model from https://github.com/qubvel/segmentation_models.pytorch. When I load the model with pretrained ImageNet weights, inference on CPU takes around 1 second. After I load my fine-tuned weights, the inference time jumps to about 10 seconds.
I checked the dtype of both, and both are float32. So what is going on? Why do the fine-tuned weights take so much longer? Are the ImageNet weights pre-quantized or something?
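For reference, this is roughly how I am measuring the times (a minimal sketch; the helper name and input shape are placeholders, not the actual deployment code):

```python
import time
import torch

def time_inference(model: torch.nn.Module, x: torch.Tensor, runs: int = 5) -> float:
    # Average wall-clock seconds per forward pass, eval mode, no autograd
    model.eval()
    with torch.no_grad():
        model(x)  # warm-up pass so one-time setup cost isn't measured
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
    return (time.perf_counter() - start) / runs

# hypothetical usage with a segmentation model:
# t = time_inference(model, torch.randn(1, 3, 256, 256))
```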
Could you use torch.set_flush_denormal(True) to flush the denormals and see if your performance increases again?
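I.e. something like this before running inference (note the call returns False on CPUs that don't support flushing denormals, so it's worth checking the return value):

```python
import torch

# Flush denormal (subnormal) floats to zero on the CPU.
# Returns True if the hardware supports it (e.g. x86 with SSE3).
supported = torch.set_flush_denormal(True)
print(supported)
```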
It did become a tiny bit faster, but there is still a big lag between the two models. I manually checked the weights, and the weights of both seem to be shown to only 4 decimal places… Any other suggestions?
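Note that printing a tensor only shows a few decimal places, so you can't spot denormals by eye. A sketch of how you could check the actual fraction of denormal values in a weight tensor (the helper name is my own; float32 subnormals are nonzero values smaller in magnitude than the smallest normal, about 1.18e-38):

```python
import torch

def denormal_fraction(t: torch.Tensor) -> float:
    # Smallest positive *normal* float32; anything nonzero below it is subnormal
    tiny = torch.finfo(torch.float32).tiny
    mask = (t != 0) & (t.abs() < tiny)
    return mask.float().mean().item()

# toy example: two of the four entries are subnormal
w = torch.tensor([1.0, 0.0, 1e-40, 3e-39], dtype=torch.float32)
print(denormal_fraction(w))  # 0.5
```

You could loop this over `model.state_dict().values()` for both checkpoints and compare.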
@ptrblck please reply we are trying to deploy something
Unfortunately, I don’t know what else could slow down the CPU.
I assume that you are using exactly the same setup and just load the weights?
Yes, it's happening on Colab and on my coworker's PC as well.