Hello
I am trying to quantize the stacked hourglass model used for 2D pose estimation with static post-training quantization in eager mode. However, after quantization the accuracy drops by almost 20 percent. I am debugging by following the PyTorch Numeric Suite tutorial, and it looks to me like there is a problem with the batch normalization layers, but I am not sure.
This is the code I use for quantization (I know I am not fusing the layers):
```python
import torch

def static(model, dataloader):
    model.eval()
    model.qconfig = torch.ao.quantization.get_default_qconfig('fbgemm')
    torch.ao.quantization.prepare(model, inplace=True)
    # calibration pass: let the observers record activation ranges
    with torch.no_grad():
        for inputs, labels, masks in dataloader:
            inputs = inputs.to(device, dtype=torch.float32)
            model(inputs)
    torch.ao.quantization.convert(model, inplace=True)
    return model
```
and this is part of the model after quantization:
```
(0): SeBottleneck(
  (bn1): QuantizedBatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv1): QuantizedConv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), scale=0.5101176500320435, zero_point=77)
  (bn2): QuantizedBatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv2): QuantizedConv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), scale=1.8481943607330322, zero_point=83, padding=(1, 1))
  (bn3): QuantizedBatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv3): QuantizedConv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), scale=0.13069623708724976, zero_point=92)
  (se): SeBlock(
    (avg_pool): AdaptiveAvgPool2d(output_size=1)
    (fc): Sequential(
      (0): QuantizedLinear(in_features=128, out_features=8, scale=0.05465518683195114, zero_point=33, qscheme=torch.per_channel_affine)
      (1): ReLU(inplace=True)
      (2): QuantizedLinear(in_features=8, out_features=128, scale=0.057319797575473785, zero_point=127, qscheme=torch.per_channel_affine)
      (3): Sigmoid()
    )
    (mul): QFunctional(
      scale=0.021492326632142067, zero_point=83
      (activation_post_process): Identity()
    )
  )
  (downsample): Sequential(
    (0): QuantizedConv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), scale=0.025261757895350456, zero_point=71)
  )
  (add): QFunctional(
    scale=0.030607327818870544, zero_point=62
    (activation_post_process): Identity()
  )
)
```
When I compare the outputs of the quantized modules against the outputs of the original modules using `compare_model_stub()`, and evaluate the error with the `compute_error()` function from the PyTorch Numeric Suite tutorial, I see a big gap between the batch normalization layers and all the other layers of the network.
These are the numbers for some of the batch normalization layers:
```
hg.0.hg.2.1.3.bn2.stats tensor(1.2797)
hg.0.hg.2.1.3.bn3.stats tensor(0.1870)
hg.0.hg.2.2.0.bn1.stats tensor(3.4588)
hg.0.hg.2.2.0.bn2.stats tensor(1.0052)
hg.0.hg.2.2.0.bn3.stats tensor(0.8627)
hg.0.hg.2.2.1.bn1.stats tensor(3.1735)
hg.0.hg.2.2.1.bn2.stats tensor(0.8259)
hg.0.hg.2.2.1.bn3.stats tensor(1.1860)
hg.0.hg.2.2.2.bn1.stats tensor(3.0059)
hg.0.hg.2.2.2.bn2.stats tensor(0.7214)
```
and these are the numbers for some of the convolutional layers:

```
hg.1.hg.3.1.1.conv2.stats tensor(33.3547)
hg.1.hg.3.1.1.conv3.stats tensor(28.7391)
hg.1.hg.3.1.2.conv1.stats tensor(31.8503)
hg.1.hg.3.1.2.conv2.stats tensor(32.0367)
hg.1.hg.3.1.2.conv3.stats tensor(30.5304)
hg.1.hg.3.1.3.conv1.stats tensor(32.4438)
hg.1.hg.3.1.3.conv2.stats tensor(34.6818)
hg.1.hg.3.1.3.conv3.stats tensor(30.7412)
hg.1.hg.3.2.0.conv1.stats tensor(32.4388)
```
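For context, this is the `compute_error()` helper from the Numeric Suite tutorial as I understand it: it reports the signal-to-quantization-noise ratio (SQNR) in dB, so higher means the quantized output is closer to the float output:

```python
import torch

def compute_error(x, y):
    """SQNR in dB between float reference x and quantized output y (higher = better)."""
    Ps = torch.norm(x)       # magnitude of the reference signal
    Pn = torch.norm(x - y)   # magnitude of the quantization error
    return 20 * torch.log10(Ps / Pn)

# e.g. a uniform 10% relative error corresponds to 20 dB
x = torch.ones(100)
print(compute_error(x, x * 0.9))  # ≈ 20 dB
```

Reading the tables with this metric, the ~30 dB on the conv layers looks like ordinary int8 behavior to me, while 1-3 dB on the bn layers would mean their quantized outputs barely resemble the float ones.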
Is there a problem with the batch normalization layers, or is it normal for the numbers to be this low? What could be the problem?
Thank you