Object Detection Quantization in PyTorch

How does quantization for object detection models differ from quantization for classification models?
Since detection models also have to handle bounding box coordinates (for multiple objects per input), I assume there must be some scaling trick in the quantization.
Are there any reference implementations?

We have usually quantized only the backbone of detection models and left the rest in fp32, and that has given good speedups. @Zafar has tried quantizing the remaining parts as well, but the accuracy is usually bad.
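
As a rough illustration of the "quantize the backbone, keep the rest in fp32" approach, here is a minimal eager-mode sketch. `TinyBackbone` and `TinyDetector` are hypothetical toy modules made up for this example, not a real detection model, and the calibration data is random. The sketch assumes the `torch.ao.quantization` eager-mode API and a build with a quantized backend available.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qconfig, prepare, convert,
)

class TinyBackbone(nn.Module):  # hypothetical stand-in for a real backbone
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # fp32 -> int8 at the backbone input
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()  # int8 -> fp32 before the head

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

class TinyDetector(nn.Module):  # hypothetical toy detector
    def __init__(self):
        super().__init__()
        self.backbone = TinyBackbone()
        self.head = nn.Conv2d(8, 4, 1)  # box regression head, stays fp32

    def forward(self, x):
        return self.head(self.backbone(x))

model = TinyDetector().eval()

# Attach a qconfig to the backbone only; modules without one are left alone.
backend = torch.backends.quantized.engine
model.backbone.qconfig = get_default_qconfig(backend)

prepare(model, inplace=True)          # insert observers into the backbone
with torch.no_grad():
    for _ in range(4):                # calibration with random stand-in data
        model(torch.rand(1, 3, 32, 32))
convert(model, inplace=True)          # swap backbone modules for int8 versions

with torch.no_grad():
    out = model(torch.rand(1, 3, 32, 32))
print(type(model.backbone.conv).__name__, type(model.head).__name__, out.dtype)
```

The `DeQuantStub` at the end of the backbone is what lets the head keep operating on ordinary fp32 tensors, so the box outputs are never stored in int8.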


If we use the MinMax observer to calibrate the floating-point model for quantization, how are the bounding box coordinates quantized? Is it done the same way as for the feature extraction layers?

Yes, the MinMax observer operates the same way when you use it on bounding box coordinates. It calculates the scale and zero-point of the tensor from its min and max values.
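
To make the min/max arithmetic concrete, here is a plain-Python sketch of per-tensor affine quantization to unsigned 8-bit. It is not PyTorch's implementation: the function names are made up for this example, the box values are invented, and extending the range to include zero is an assumption (a common convention for affine quantization).

```python
def minmax_qparams(values, quant_min=0, quant_max=255):
    """Derive (scale, zero_point) from the observed min and max."""
    lo = min(min(values), 0.0)   # assumption: range always includes zero
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (quant_max - quant_min)
    if scale == 0.0:             # all-zero input: avoid dividing by zero
        scale = 1.0
    zero_point = quant_min - round(lo / scale)
    zero_point = max(quant_min, min(quant_max, zero_point))
    return scale, zero_point

def quantize(x, scale, zero_point, quant_min=0, quant_max=255):
    q = round(x / scale) + zero_point
    return max(quant_min, min(quant_max, q))

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

# Made-up box coordinates in pixels, spanning 0 to 510.
boxes = [0.0, 127.5, 255.0, 510.0]
scale, zero_point = minmax_qparams(boxes)
print(scale, zero_point)  # 2.0 0

restored = dequantize(quantize(127.5, scale, zero_point), scale, zero_point)
print(restored)           # 128.0
```

In this example the coordinates are treated exactly like any other tensor: 256 levels are spread over the observed range, so each step is 2 pixels wide. That coarse step may be one reason quantizing the box outputs tends to hurt accuracy.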
