I’m trying to restrict my model’s values to a more limited range. I set quant_max and quant_min to the new values as arguments to FakeQuantize, but when I actually print int_repr() of the values, they are still in 0–255 (or -128 to 127).
Right now the observer uses a fixed range depending on dtype: https://github.com/pytorch/pytorch/blob/master/torch/quantization/observer.py#L153-L162, so the quant_min/quant_max you pass to FakeQuantize are not reflected in the observed range. Feel free to add quant_min and quant_max support to the observer.
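To illustrate why the observer's range matters: fake quantization rounds to an integer grid and clamps against quant_min/quant_max, so the integer representation can only shrink below 0–255 if that reduced range is actually applied. The sketch below is not the PyTorch implementation, just a minimal plain-Python model of the round-and-clamp step (the function name and parameters mirror the discussion but are otherwise hypothetical):

```python
# Minimal sketch of per-tensor affine fake quantization with a
# configurable integer range. NOT PyTorch's implementation; it only
# shows why the observer must honor quant_min/quant_max for the
# int_repr() values to stay inside a sub-8-bit range.

def fake_quantize(x, scale, zero_point, quant_min, quant_max):
    """Quantize x to an integer in [quant_min, quant_max], then dequantize.

    Returns (int_repr, dequantized_value).
    """
    q = round(x / scale) + zero_point
    q = max(quant_min, min(quant_max, q))   # clamp to the reduced range
    return q, (q - zero_point) * scale

# With a 4-bit range [0, 15], a large input saturates at 15 rather
# than at the 8-bit maximum of 255:
int_repr, dq = fake_quantize(10.0, scale=0.1, zero_point=0,
                             quant_min=0, quant_max=15)
```

If the observer instead hard-codes the 0–255 range from the dtype, the clamp above never sees the narrower bounds, which matches the behavior reported in the question.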
cc @raghuramank100 for sub-8-bit observer support.
Yes, we have a PR in the works supporting sub-8-bit quantization at https://github.com/pytorch/pytorch/pull/33743; expect it to land by the end of next week.