Sub-8-bit quantization

I am trying to simulate sub-8-bit quantization. Currently I am only doing post-training quantization, by creating a custom observer that is identical to the existing HistogramObserver except that the qmin and qmax values are changed to match the new bit width.
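For reference, here is roughly what my observer looks like (a minimal sketch; the class name `SubByteHistogramObserver` and the `bits` argument are mine, and I'm assuming a PyTorch version where the observer range is stored as `quant_min`/`quant_max` attributes; in older releases qmin/qmax are computed internally, so the override point may differ):

```python
import torch
from torch.quantization import HistogramObserver

class SubByteHistogramObserver(HistogramObserver):
    """Identical to HistogramObserver, but with the quantization
    range clamped to `bits` bits instead of the default 8."""

    def __init__(self, bits=4, **kwargs):
        super().__init__(**kwargs)
        # Override the 8-bit range set up by the parent class.
        # For unsigned (quint8-style) quantization: [0, 2**bits - 1].
        self.quant_min = 0
        self.quant_max = 2 ** bits - 1
```

(In recent PyTorch versions `HistogramObserver` also accepts `quant_min`/`quant_max` directly via `with_args`, so the subclass may not even be necessary.)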
Running this on ResNet-50 with 4-bit activations and 8-bit weights, the top-1 accuracy is around 10%, which is significantly lower than the result reported in the whitepaper by Krishnamoorthi (36%). Are there other changes that need to be made to simulate this quantization, or does this come down to some kind of TF vs. PyTorch difference?
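The rest of my flow is the standard eager-mode PTQ recipe, roughly like the sketch below (this assumes the hypothetical `SubByteHistogramObserver` above and a `calibration_loader` DataLoader I haven't shown):

```python
import torch
import torchvision
from torch.quantization import QConfig, MinMaxObserver, prepare, convert

# Quantizable torchvision ResNet-50, with conv/bn/relu fused for eager-mode PTQ.
model = torchvision.models.quantization.resnet50(pretrained=True, quantize=False).eval()
model.fuse_model()

# 4-bit activations via the custom observer; stock 8-bit symmetric weights.
model.qconfig = QConfig(
    activation=SubByteHistogramObserver.with_args(bits=4, dtype=torch.quint8),
    weight=MinMaxObserver.with_args(dtype=torch.qint8,
                                    qscheme=torch.per_tensor_symmetric),
)

prepared = prepare(model)
with torch.no_grad():
    for images, _ in calibration_loader:  # assumed calibration DataLoader
        prepared(images)  # collect activation histograms
quantized = convert(prepared)
```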

cc @raghuramank100 @hx89