In eager mode quantization, bias is kept in fp32 and is quantized dynamically inside the quantized FC/Conv operator at compute time. It is also returned in fp32, because that is the format in which it is passed to the operator. Bias stays in fp32 because quantizing it requires the input scale, which is not available until the operator executes, so the bias cannot be quantized ahead of time.
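To make the dependency on the input scale concrete, here is a minimal pure-Python sketch. It is an illustration, not the actual operator implementation. It assumes symmetric quantization (all zero points are 0) and a bias scale equal to `input_scale * weight_scale`, which is a common convention for int32 accumulators. The names `quantize_bias` and `ToyQuantizedLinear` are hypothetical.

```python
def quantize_bias(bias_fp32, input_scale, weight_scale):
    """Quantize an fp32 bias into integer accumulator units.

    Assumption: bias_scale = input_scale * weight_scale, so the bias
    cannot be quantized before the input scale is known.
    """
    bias_scale = input_scale * weight_scale
    return [round(b / bias_scale) for b in bias_fp32], bias_scale


class ToyQuantizedLinear:
    """Hypothetical stand-in for a quantized FC layer (zero points assumed 0)."""

    def __init__(self, weight_q, weight_scale, bias_fp32):
        self.weight_q = weight_q          # integer weights, one row per output
        self.weight_scale = weight_scale
        self.bias_fp32 = bias_fp32        # stored in fp32, as passed in

    def bias(self):
        # Returned in the same fp32 form in which it was given.
        return self.bias_fp32

    def __call__(self, x_q, input_scale):
        # The input scale only becomes available here, at call time,
        # so the bias is quantized on the fly.
        bias_q, bias_scale = quantize_bias(
            self.bias_fp32, input_scale, self.weight_scale
        )
        acc = [
            sum(w * x for w, x in zip(row, x_q)) + b
            for row, b in zip(self.weight_q, bias_q)
        ]
        # Dequantize the integer accumulators back to real values.
        return [a * bias_scale for a in acc]


layer = ToyQuantizedLinear([[1, 2], [3, -1]], 0.25, [1.0, -0.5])
print(layer.bias())           # [1.0, -0.5]
print(layer([4, 2], 0.5))     # [2.0, 0.75]
```

With an input scale of 0.5 the quantized input `[4, 2]` represents `[2.0, 1.0]`, and the result matches the fp32 computation. Calling the same layer with a different input scale yields a different quantized bias, which is why the fp32 copy is the one that is stored.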