Float16 dynamic quantization has no model size benefit

Hello everyone. I recently used dynamic quantization to quantize a model. When I use torch.quantization.quantize_dynamic(model, dtype=torch.qint8), the model shrinks from 39 MB to 30 MB, but with torch.quantization.quantize_dynamic(model, dtype=torch.float16) the model size does not change at all. Does anybody know why? Or am I doing float16 quantization the wrong way?
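For reference, this is roughly how I'm comparing sizes. The toy model and the `saved_size_mb` helper here are just illustrative stand-ins for my actual 39 MB model:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Toy model standing in for the real one (hypothetical sizes).
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))

def saved_size_mb(m):
    # Serialize the state_dict to a temp file and report its on-disk size.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
    torch.save(m.state_dict(), path)
    size = os.path.getsize(path) / 1e6
    os.remove(path)
    return size

# int8 dynamic quantization: Linear weights stored as 8-bit -> smaller file.
int8_model = torch.quantization.quantize_dynamic(model, dtype=torch.qint8)

# fp16 dynamic quantization: file size stays roughly the same (the issue).
fp16_model = torch.quantization.quantize_dynamic(model, dtype=torch.float16)

print(saved_size_mb(model), saved_size_mb(int8_model), saved_size_mb(fp16_model))
```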

I’d appreciate it if anybody could help me! Thanks in advance!

Hi @huoge - for int8 dynamic quantization we quantize the weights to 8 bits, so you see the expected size reduction.
For fp16 quantization, the weight values are cast to fp16 (taking saturation into account), but the tensor dtype is still float32, so the storage does not shrink. This has to do with the type expected by the FBGEMM backend when performing the GEMM operation.
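A minimal sketch of what that saturate-and-cast step amounts to (assuming the saturation behavior described above; `FP16_MAX` is just the largest finite float16 value):

```python
import torch

FP16_MAX = 65504.0  # largest finite float16 value

# Example fp32 weight values; 70000 exceeds the float16 range.
w = torch.tensor([70000.0, 3.3, -0.5])

# Saturate to the float16 range, round through fp16, then return to fp32:
# the values lose precision, but the tensor dtype (and per-element storage,
# 4 bytes) is still float32 -- hence no size reduction on disk.
w_sat = w.clamp(-FP16_MAX, FP16_MAX).to(torch.float16).to(torch.float32)

print(w_sat.dtype)  # torch.float32
```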