I have been trying to perform static quantization of a scene text detection model, reducing its precision from float32 to int8.
I followed the steps in the official tutorial: fused the Conv, BatchNorm, and ReLU layers; converted the model to int8; and replaced torch.cat with nn.quantized.FloatFunctional().
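For reference, this is roughly the workflow I'm following, sketched on a minimal stand-in module (the ConvBlock here is hypothetical, not my actual detection model):

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq

class ConvBlock(nn.Module):
    """Toy stand-in for one conv-bn-relu block plus a concat."""
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()       # fp32 -> int8 at the input
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()
        # FloatFunctional so the concat runs in the quantized domain
        self.cat = nn.quantized.FloatFunctional()
        self.dequant = tq.DeQuantStub()   # int8 -> fp32 at the output

    def forward(self, x):
        x = self.quant(x)
        y = self.relu(self.bn(self.conv(x)))
        y = self.cat.cat([y, y], dim=1)
        return self.dequant(y)

model = ConvBlock().eval()
model.qconfig = tq.get_default_qconfig("fbgemm")
# 1. Fuse conv + bn + relu into a single module
tq.fuse_modules(model, [["conv", "bn", "relu"]], inplace=True)
# 2. Insert observers, then calibrate with representative data
tq.prepare(model, inplace=True)
with torch.no_grad():
    model(torch.randn(1, 3, 32, 32))
# 3. Convert the calibrated model to int8
tq.convert(model, inplace=True)

out = model(torch.randn(1, 3, 32, 32))
print(out.shape)  # concat doubles the channels: [1, 16, 32, 32]
```

The calibration data here is random noise just to make the sketch self-contained; in the real pipeline I calibrate with actual images.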
The original model works well, but the predictions from the quantized model are really bad:
Prediction from FP32 model -
Prediction from INT8 model -
I'm not sure where I'm going wrong during quantization and would appreciate any suggestions.
The code is available here: https://github.com/raghavgurbaxani/experiments/blob/master/try_static_quant.py