I trained and saved a model whose parameters are, by default, float32. After loading the saved model and calling model.half(), I was able to reduce the parameter precision to float16 (half). However, the number of parameters and MACs reported by profile(model, inputs=(input,)) from the thop library are identical for the float32 and float16 versions. Could someone explain why? Is my understanding of quantization wrong?
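
For reference, here is a minimal sketch of what I am doing (nn.Linear stands in for my actual model, and dummy_input is just a placeholder input):

```python
import copy

import torch
import torch.nn as nn
from thop import profile

# Stand-in for my actual model; the effect is the same for any architecture.
model_fp32 = nn.Linear(128, 64)
dummy_input = torch.randn(1, 128)

# Profile the float32 model.
macs_fp32, params_fp32 = profile(model_fp32, inputs=(dummy_input,))

# Convert a copy to float16 and profile again (deepcopy so thop's
# counters from the first call don't carry over to the second).
model_fp16 = copy.deepcopy(model_fp32).half()
# Note: fp16 matmul may not be supported on CPU in older PyTorch builds;
# move the model and input to GPU with .cuda() if this raises an error.
macs_fp16, params_fp16 = profile(model_fp16, inputs=(dummy_input.half(),))

print(f"fp32: {macs_fp32:.0f} MACs, {params_fp32:.0f} params")
print(f"fp16: {macs_fp16:.0f} MACs, {params_fp16:.0f} params")
# Both calls report identical counts, even though the fp16 parameters
# take half the memory per element.
```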