Hi there, I came across your thread while searching for the kernel implementation of qconv2d with oneDNN. Anyway, you seem to be where I was a few weeks ago. XNNPACK's default dtypes are defined in pytorch/torch/ao/quantization/quantizer/xnnpack_quantizer.py at main · pytorch/pytorch · GitHub
Specifically, the bias is float32.
BTW, accumulation is also int32.
The FBGEMM backend quantizes the bias as follows:
bias_q = round(bias / (input_scale * weight_scale))
z_int = z + bias_q
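In case it's useful, here's a minimal sketch of that bias handling in plain Python. The variable names and the example values are made up for illustration; this is not the actual FBGEMM kernel code, just the arithmetic it describes:

```python
# Illustrative sketch (not actual FBGEMM code): fold a float32 bias into
# the int32 accumulator of a quantized conv/linear op.

input_scale = 0.02    # example scale of the quantized input activation
weight_scale = 0.005  # example scale of the quantized weight

bias_fp32 = [0.1, -0.05, 0.3]  # per-channel float32 bias (made-up values)

# Quantize the bias with the product of input and weight scales, so it
# lives in the same fixed-point domain as the int32 accumulator.
bias_q = [round(b / (input_scale * weight_scale)) for b in bias_fp32]

# z: raw int32 accumulator from the int8 x int8 dot products (made-up values)
z = [12000, -3000, 500]

# Add the quantized bias in the int32 domain, before requantization back to int8
z_int = [zi + bi for zi, bi in zip(z, bias_q)]

print(bias_q)  # [1000, -500, 3000]
print(z_int)   # [13000, -3500, 3500]
```

The key point is that the bias never stays float inside the kernel: dividing by input_scale * weight_scale puts it on the same integer grid as the accumulator, so the add is pure int32 arithmetic.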
You can see more details in this thread.
Hope that helps!