Conv2d_unpack and conv2d_prepack behavior

Hey I have a question about


Why is the returned weight in int8 while the bias is in fp32? How can I convert the bias to fixed point?

Also, I wanted to understand how the torch.ops.quantized.conv2d_prepack function is structured and how packed_params is created.


In eager mode quantization the bias is kept in fp32 and is quantized on the fly when the quantized FC/Conv is computed. It’s returned in fp32 because that’s how it’s passed in to the operator as well. The reason for keeping the bias in fp32 is that the input scale isn’t available until the operator executes, so we can’t quantize the bias before then.

To convert the bias to a quantized format, quantize it with scale = input_scale * weight_scale and zero_point = 0. See this code for converting bias with act_times_weight scale.
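As a rough illustration of that conversion, here is a minimal pure-Python sketch. The helper names (quantize_bias, dequantize_bias), the int32 target range, and the example values are my own assumptions for illustration; this is not the actual PyTorch/FBGEMM code.

```python
# Hypothetical helpers, not part of PyTorch. They only illustrate
# bias_q = round(bias / (input_scale * weight_scale)) with zero_point = 0.

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1  # assumed int32 storage for the bias


def quantize_bias(bias_fp32, input_scale, weight_scales):
    """Quantize an fp32 bias, one value per output channel.

    weight_scales holds one scale per output channel (per-channel weights)
    or a single value (per-tensor weights).
    """
    if len(weight_scales) == 1:
        weight_scales = list(weight_scales) * len(bias_fp32)
    quantized = []
    for b, w_scale in zip(bias_fp32, weight_scales):
        bias_scale = input_scale * w_scale  # act_times_weight scale
        q = round(b / bias_scale)           # zero_point is 0, so no offset
        quantized.append(max(INT32_MIN, min(INT32_MAX, q)))
    return quantized


def dequantize_bias(bias_q, input_scale, weight_scales):
    """Map the integer bias back to fp32 to check the round trip."""
    if len(weight_scales) == 1:
        weight_scales = list(weight_scales) * len(bias_q)
    return [q * input_scale * w for q, w in zip(bias_q, weight_scales)]


# Example values chosen as powers of two so the arithmetic is exact.
bias_q = quantize_bias([1.0, -0.75], input_scale=0.5, weight_scales=[0.25, 0.125])
bias_back = dequantize_bias(bias_q, input_scale=0.5, weight_scales=[0.25, 0.125])
```

With zero_point = 0 and this scale, the integer bias is in the same units as the int32 accumulator of the int8 input-weight products, so it can be added before requantizing the output.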

Check out the code in file for the prepack function. If USE_FBGEMM is true, the fbgemm_conv_prepack function is called to do the prepacking.