Model parameters and MACs

Dear All,

        I trained and saved a model whose parameters are, by default, float32. After loading the saved model and calling model.half(), I was able to reduce the model's precision to float16 (half). But the number of parameters and MACs obtained using profile(model, inputs = (,)) from the thop framework remains the same for both float32 and float16. Could you please explain this? Is my understanding of quantization wrong?

Transforming the dtype does not change the number of operations or parameters, so the reported counts sound reasonable; only the precision (and thus the memory per parameter) changes. Internal algorithms could change, of course.
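To illustrate this with plain arithmetic (not thop itself, and using a hypothetical 512→256 linear layer): parameter and MAC counts depend only on the layer's shape, while the dtype only changes how many bytes each parameter occupies.

```python
# Parameter and MAC counts for a fully connected layer depend only on
# its shape; the dtype only affects the bytes each parameter occupies.
def linear_stats(in_features, out_features, bytes_per_param):
    params = in_features * out_features + out_features  # weights + bias
    macs = in_features * out_features                   # one MAC per weight
    memory_bytes = params * bytes_per_param
    return params, macs, memory_bytes

# float32 (4 bytes) vs float16 (2 bytes) for a 512 -> 256 layer
p32, m32, mem32 = linear_stats(512, 256, 4)
p16, m16, mem16 = linear_stats(512, 256, 2)

assert p32 == p16 and m32 == m16  # counts are unchanged by the dtype
assert mem16 * 2 == mem32         # memory footprint halves
```

This is why thop reports identical numbers for both precisions: it counts operations and parameters, not bytes.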

Thanks for the response, sir. Could you please suggest a possible direction for cutting down the MACs and parameters?

You could run experiments reducing the actual layer sizes and check how that affects the accuracy of your model.
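A minimal sketch of this idea, again with plain arithmetic and a hypothetical 512→256 linear layer: halving the layer's output width halves both the parameter count and the MAC count, which is the trade-off you would then weigh against accuracy.

```python
# Shrinking a layer's width directly reduces both parameters and MACs.
def linear_stats(in_features, out_features):
    params = in_features * out_features + out_features  # weights + bias
    macs = in_features * out_features                   # one MAC per weight
    return params, macs

full_params, full_macs = linear_stats(512, 256)
slim_params, slim_macs = linear_stats(512, 128)  # hypothetical halved width

assert slim_macs * 2 == full_macs      # MACs scale with the width
assert slim_params * 2 == full_params  # params scale the same way here
```

In a real model you would sweep a few widths, profile each variant (e.g. with thop), and plot MACs/parameters against validation accuracy to pick an operating point.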
