Quantized Conv Pseudo Code

What is the pseudocode for a quantized conv1d? I haven't been able to work out the requantization and bias portions.

At the bottom is a conv1d model with a 1x1 filter and a single-element input, if that helps.

In the forums, I've seen the requantization scale parameter defined as

requant_scale = input_scale * weight_scale / output_scale

Even with all zero points set to 0, the following equation is close but not exactly correct:

output = (input * weight * requant_scale + bias_quant) * model.conv.scale
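For reference, here is my current best guess at the full integer recipe, pieced together from forum posts. The function name and the placement of the bias and rounding steps are my own assumptions, and that placement is exactly the part I'd like confirmed:

```python
def quant_conv1d_1x1_ref(x_q, x_zp, s_x, w_q, w_zp, s_w, bias, s_y, y_zp):
    """My best guess at the integer reference for a 1x1 quantized conv1d.

    Assumptions: the bias is quantized to int32 with scale s_x * s_w and
    added to the accumulator BEFORE requantization; the requantized result
    is rounded and clamped to the quint8 range.
    """
    acc = (x_q - x_zp) * (w_q - w_zp)       # int32 accumulator, zp-corrected
    bias_q = round(bias / (s_x * s_w))      # bias in accumulator units (guess)
    requant_scale = (s_x * s_w) / s_y
    y_q = round((acc + bias_q) * requant_scale) + y_zp
    y_q = max(0, min(255, y_q))             # clamp to quint8
    return (y_q - y_zp) * s_y               # dequantize
```

Note that Python's `round` is round-half-to-even; I'm not sure which rounding mode the quantized backend actually uses, which may account for off-by-one differences in the quantized output.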

Simple Model:

M(
  (quant): Quantize(scale=tensor([0.0160]), zero_point=tensor([127]), dtype=torch.quint8)
  (conv): QuantizedConv1d(1, 1, kernel_size=(1, 1), stride=(1, 1), scale=0.0021364481654018164, zero_point=0)
  (dequant): DeQuantize()
)
input=tensor([[[-2.0260]]])
input_quant=tensor([[[0]]], dtype=torch.uint8)
weight_quant=tensor([[[-128]]], dtype=torch.int8)
weight=tensor([[[-0.6016]]], size=(1, 1, 1), dtype=torch.qint8,
       quantization_scheme=torch.per_channel_affine,
       scale=tensor([0.0047], dtype=torch.float64), zero_point=tensor([0]),
       axis=0)
bias=tensor([-0.9427], requires_grad=True)
output=tensor([[[0.2756]]])
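For concreteness, plugging the printed values into the zero-point-corrected integer computation (with the bias quantized at scale `s_x * s_w`) lands near the observed 0.2756 but not exactly on it; I assume the gap is because the printed scales are rounded for display:

```python
s_x, x_zp, x_q = 0.0160, 127, 0             # quant stub (printed values)
s_w, w_zp, w_q = 0.0047, 0, -128            # per-channel weight
s_y, y_zp = 0.0021364481654018164, 0        # conv output params
bias = -0.9427

acc = (x_q - x_zp) * (w_q - w_zp)           # -> 16256
bias_q = round(bias / (s_x * s_w))          # -> -12536
y_q = round((acc + bias_q) * (s_x * s_w) / s_y) + y_zp
y = (y_q - y_zp) * s_y
print(y)                                    # ~0.28, vs the observed 0.2756
```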

@supriyar @jianyuhuang