Is bias quantized when I run pt2e quantization?

Hi there, I came across your thread while searching for the kernel implementation of qconv2d with oneDNN. Anyway, you seem to be where I was a few weeks ago. XNNPACK's default dtypes are defined in pytorch/torch/ao/quantization/quantizer/xnnpack_quantizer.py at main · pytorch/pytorch · GitHub

Specifically, the bias is float32.

BTW: the accumulation is int32.
The FBGEMM backend quantizes the bias as follows:

bias_q = round(bias / (input_scale * weight_scale))

z_int = z + bias_q

where z is the int32 accumulation of the quantized input-weight products.
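A minimal sketch of that bias quantization step, purely illustrative (the function name and example values are my own, not from FBGEMM's actual API):

```python
def quantize_bias(bias: float, input_scale: float, weight_scale: float) -> int:
    # The bias is quantized with scale = input_scale * weight_scale,
    # which is the effective scale of the int32 accumulator, so the
    # quantized bias can be added directly to the accumulated sum.
    return int(round(bias / (input_scale * weight_scale)))

# Example: a float bias of 0.05 with input_scale=0.02 and weight_scale=0.01
# has an accumulator scale of 0.0002, so bias_q = 0.05 / 0.0002 = 250.
bias_q = quantize_bias(0.05, 0.02, 0.01)
print(bias_q)  # 250
```

The key point is that the bias shares the accumulator's scale; no separate bias scale or zero point is needed.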

You can see more details in this thread.

Hope that helps!
