From PyTorch official quantization tutorial:
In the following setting, we used torch.per_tensor_symmetric. However, the zero point reported for QuantizedConv2d
is 63 instead of 0. At first, I thought this was because the kernel size is 1 x 1,
and that a zero point of 0 could not be used for symmetric quantization in that case.
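To see where a value like 63 could come from, here is a sketch of how I understand an affine observer derives its quantization parameters. This is my own approximation of the formula, not the actual PyTorch source:

```python
def affine_qparams(min_val, max_val, qmin=0, qmax=127):
    """Sketch of affine (non-symmetric) parameter computation.

    quint8 with reduce_range=True uses the range [0, 127].
    """
    # The observed range is extended to include 0 so that a real
    # value of 0 is exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = round(qmin - min_val / scale)
    # Clamp the zero point into the quantized range.
    return scale, int(min(max(zero_point, qmin), qmax))

# A hypothetical activation range roughly symmetric around zero
# puts the zero point near the middle of [0, 127], i.e. around 63.
print(affine_qparams(-11.2, 11.4))
```

With an activation range that is roughly symmetric around zero, this formula lands the zero point near the middle of the quantized range rather than at 0.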
QConfig(activation=functools.partial(<class 'torch.quantization.observer.MinMaxObserver'>, reduce_range=True), weight=functools.partial(<class 'torch.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric))
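For readability, the QConfig printed above can be written out with with_args; this is a sketch assuming the torch.quantization namespace used in the tutorial (newer releases expose the same classes under torch.ao.quantization):

```python
import torch
from torch.quantization import QConfig
from torch.quantization.observer import MinMaxObserver

my_qconfig = QConfig(
    # Activation observer: reduce_range=True, default qscheme.
    activation=MinMaxObserver.with_args(reduce_range=True),
    # Weight observer: signed int8 with a symmetric qscheme.
    weight=MinMaxObserver.with_args(dtype=torch.qint8,
                                    qscheme=torch.per_tensor_symmetric),
)
```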
Post Training Quantization Prepare: Inserting Observers
Inverted Residual Block:After observer insertion
Sequential(
  (0): ConvBNReLU(
    (0): ConvReLU2d(
      (0): Conv2d(
        32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32
        (activation_post_process): MinMaxObserver(min_val=inf, max_val=-inf)
      )
      (1): ReLU(
        (activation_post_process): MinMaxObserver(min_val=inf, max_val=-inf)
      )
    )
    (1): Identity()
    (2): Identity()
  )
  (1): Conv2d(
    32, 16, kernel_size=(1, 1), stride=(1, 1)
    (activation_post_process): MinMaxObserver(min_val=inf, max_val=-inf)
  )
  (2): Identity()
)
..........Post Training Quantization: Calibration done
Post Training Quantization: Convert done
Inverted Residual Block: After fusion and quantization, note fused modules:
Sequential(
  (0): ConvBNReLU(
    (0): QuantizedConvReLU2d(32, 32, kernel_size=(3, 3), stride=(1, 1), scale=0.1516050398349762, zero_point=0, padding=(1, 1), groups=32)
    (1): Identity()
    (2): Identity()
  )
  (1): QuantizedConv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), scale=0.17719413340091705, zero_point=63)
  (2): Identity()
)
Size of model after quantization
Size (MB): 3.631847
..........Evaluation accuracy on 300 images, 66.67
Later, in some of my own experiments, I found that some QuantizedConv2d
modules with a 3 x 3 kernel
can also have a non-zero zero point under symmetric quantization, so the kernel size does not seem to be the cause. How should the zero point be understood in this context? Thank you.
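For reference, this is the contrast I have in mind for the symmetric case, again a sketch of the formula as I understand it rather than the PyTorch source:

```python
def symmetric_qparams(min_val, max_val, qmin=-128, qmax=127):
    """Sketch of per_tensor_symmetric parameter computation for qint8.

    The range is centered on 0, so the zero point is pinned to 0
    (my assumption of the rule).
    """
    # Use the largest magnitude on either side of zero.
    bound = max(abs(min_val), abs(max_val))
    scale = bound / ((qmax - qmin) / 2)
    return scale, 0  # zero point is always 0

print(symmetric_qparams(-11.2, 11.4))
```

Under this reading, a symmetric scheme should never produce a zero point of 63, which is what prompts my question.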