Quantization of EfficientNet Models

anonymous1 · June 27, 2021, 9:10am

Hi,
I applied quantization to EfficientNet models by following the post-training static quantization method described in the PyTorch blogs, but I was only able to reduce the model size by 5 MB.
Also, I wasn’t able to perform the layer fusion step on the prebuilt layers of this model while quantizing with the existing PyTorch techniques. How should I approach this problem? Or is there an alternative method to reduce the size of the model without affecting its accuracy much?
Can someone help me with this?

Thanks in advance!

jerryzh168 · November 5, 2021, 10:23pm

It might be related to fusion. Why can’t you do fusion?
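
For reference, here is a minimal sketch of eager-mode fusion followed by post-training static quantization. It uses a hypothetical toy Conv-BN-ReLU block (`TinyConvBlock`), not EfficientNet itself. The module names passed to `fuse_modules` (`conv`, `bn`, `relu`) and the helper `state_dict_size_bytes` are assumptions made for illustration. In a real model, the names depend on how the network is defined, and only supported patterns such as Conv+BN or Conv+BN+ReLU can be fused this way. The random calibration tensors stand in for real calibration data.

```python
import io

import torch
import torch.nn as nn


class TinyConvBlock(nn.Module):
    """Hypothetical toy block standing in for one stage of a real network."""

    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> int8 at the input
        self.conv = nn.Conv2d(16, 64, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(64)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> float at the output

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.bn(self.conv(x)))
        return self.dequant(x)


def state_dict_size_bytes(model):
    """Serialize the state dict to memory and return its size in bytes."""
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    return buf.getbuffer().nbytes


# Fusion for post-training quantization expects the model in eval mode.
model = TinyConvBlock().eval()
fp32_size = state_dict_size_bytes(model)

# 1) Fuse Conv + BN + ReLU into a single module; bn and relu become nn.Identity.
fused = torch.quantization.fuse_modules(model, [["conv", "bn", "relu"]])

# 2) Attach a default qconfig for the quantized engine available on this machine.
backend = torch.backends.quantized.engine
fused.qconfig = torch.quantization.get_default_qconfig(backend)

# 3) Insert observers, then calibrate (random data here; use real samples in practice).
prepared = torch.quantization.prepare(fused)
with torch.no_grad():
    for _ in range(4):
        prepared(torch.randn(1, 16, 32, 32))

# 4) Convert the observed modules to their int8 counterparts.
quantized = torch.quantization.convert(prepared)
int8_size = state_dict_size_bytes(quantized)

with torch.no_grad():
    out = quantized(torch.randn(1, 16, 32, 32))

print("fp32 state dict bytes:", fp32_size)
print("int8 state dict bytes:", int8_size)
print("fused conv module type:", type(quantized.conv).__name__)
```

Comparing the two serialized sizes, and printing the converted model to see which modules actually became quantized, is one way to check whether fusion and quantization were applied to most layers or only to a few.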