I have a pruned, quantized MobileNet v2 model,
and I am now trying to simulate its inference from scratch.
Here is a brief sample of my code
(not the actual code, but very similar;
`m` is the quantized layer):
```python
# scan through feature map _xin
for _kout in range(_w_output_ch):              # output channels
    for _bgn_y in range(_xin_h):
        for _bgn_x in range(_xin_w):
            _all_channels = 0.0
            # scan through kernel
            for _kin in range(_w_input_ch):
                _ftmp = 0.0
                for _x in range(_w_kernel):
                    for _y in range(_w_kernel):
                        fx = torch.dequantize(xin).numpy()[args.img_in_batch][_kin][_bgn_y + _y][_bgn_x + _x]
                        fw = torch.dequantize(m.weight()).numpy()[_kout][_kin][_y][_x]
                        _ftmp += fw * fx
                _all_channels += _ftmp
            # round() here is Python 3's built-in, i.e. banker's rounding
            out[_kout][_bgn_y][_bgn_x] = round((_all_channels + m.bias()[_kout]) / m.scale) + m.zero_point
```
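(For context, the mismatch counts I quote below come from an element-wise diff of my `out` tensor against PyTorch's quantized output; a sketch of that check with placeholder data, not my actual arrays:)

```python
import numpy as np

# Placeholder integer feature maps standing in for my simulated output
# and PyTorch's quantized output (same shape, same dtype).
out = np.array([[10, 11], [12, 13]], dtype=np.int32)  # my simulation
ref = np.array([[10, 11], [12, 14]], dtype=np.int32)  # PyTorch's result

diff = out - ref
print(np.count_nonzero(diff))   # number of mismatched points
print(np.abs(diff).max())       # largest error (1 in my case)
```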
I assumed the results would exactly match PyTorch's quantized model.
However, they are only 99.999975% the same:
out of the 40 million feature-map values, only 1 or 2 points are off by 1.
After some investigation, I found that the mismatched points all fall on exact .5 values
and are randomly distributed.
It seems like a rounding issue, so I have tried different rounding methods, but all in vain.
(I'm currently using banker's rounding, i.e. round half to even, which Python 3's `round()` uses.)
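To illustrate the two rounding methods I tried, and why I suspect the ties themselves may be the problem (a minimal sketch, not my simulation code):

```python
import math

# Python 3's built-in round() uses banker's rounding (round half to even):
print([round(v) for v in [0.5, 1.5, 2.5, 3.5]])             # [0, 2, 2, 4]

# Round-half-up for positive values, the other method I tried:
print([math.floor(v + 0.5) for v in [0.5, 1.5, 2.5, 3.5]])  # [1, 2, 3, 4]

# The catch: a float accumulation that should land exactly on .5 may not,
# depending on summation order, so the tie-break rule never even applies:
print((0.1 + 0.2) + 0.3)   # 0.6000000000000001
print(0.1 + (0.2 + 0.3))   # 0.6
```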
Any help would be appreciated!