Is it possible to quantize to less than 8-bit?

Ricky_96 · April 26, 2021, 3:09pm

Hey all. I’ve been looking into quantization recently for my final university project. From what I’ve seen, PyTorch apparently supports quantization only down to 8 bits. Is there any way to quantize my neural network to a lower precision (e.g. 4-bit or 2-bit), or is that not possible? Any response would be appreciated.

jerryzh168 · April 27, 2021, 12:59am

It’s not supported yet. I responded here: Quantization aware training lower than 8-bits?
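
For anyone who just wants to estimate how much accuracy a lower bit width would cost, one common workaround is to simulate it: round values onto a 4-bit or 2-bit grid and map them back to floats. Below is a minimal sketch of that idea in plain Python. It is not a PyTorch API; the function name `fake_quantize` and its `num_bits` parameter are made up for illustration, and it assumes simple per-tensor affine (scale plus zero point) quantization.

```python
def fake_quantize(values, num_bits=4):
    """Simulate uniform affine quantization (hypothetical helper, not a PyTorch API).

    Returns the integer codes, the dequantized floats, and the scale used.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include zero so that 0.0 is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 if every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Quantize: scale, shift by the zero point, and clamp to the integer range.
    quantized = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    # Dequantize: map the integer codes back to floats.
    dequantized = [(q - zero_point) * scale for q in quantized]
    return quantized, dequantized, scale


weights = [-1.0, -0.25, 0.0, 0.75, 2.0]
quantized, dequantized, scale = fake_quantize(weights, num_bits=4)
for w, q, d in zip(weights, quantized, dequantized):
    print(f"{w:+.2f} -> code {q:2d} -> {d:+.2f}")
```

Note that this only simulates the rounding error. The values are still stored as ordinary Python numbers, so there is no memory or speed benefit from it.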