I am using FX quantization with a custom backend and custom layers. Quantization with qint8 is working well. However, when I tried to quantize the model using qint32, the layer was not quantized during the convert_fx step.
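For reference, here is a simplified sketch of the setup. The real run uses custom observers, a custom BackendConfig, and my own layers; the toy nn.Linear model, shapes, and observer choices below are only placeholders:

import torch
import torch.nn as nn
from torch.ao.quantization import (
    MinMaxObserver,
    PerChannelMinMaxObserver,
    QConfig,
    QConfigMapping,
)
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

# Placeholder qconfig: quint8 activations, qint32 weights. The real config
# uses custom observers and a custom BackendConfig, omitted here.
qconfig_int32 = QConfig(
    activation=MinMaxObserver.with_args(dtype=torch.quint8),
    weight=PerChannelMinMaxObserver.with_args(
        dtype=torch.qint32, qscheme=torch.per_channel_symmetric
    ),
)

# Toy model standing in for the custom layer.
model = nn.Sequential(nn.Linear(16, 8)).eval()
example_inputs = (torch.randn(1, 16),)

qconfig_mapping = QConfigMapping().set_global(qconfig_int32)
prepared = prepare_fx(model, qconfig_mapping, example_inputs)
prepared(*example_inputs)  # calibration pass
quantized = convert_fx(prepared)  # with qint32 weights the layer comes back unquantized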
Upon investigation, I found that the issue is due to the absence of torch.qint32 in the torch.ao.quantization.utils.weight_is_quantized function:
def weight_is_quantized(qconfig):
    """ Given a qconfig, decide if the weight needs to be
    quantized or not
    """
    return weight_dtype(qconfig) in [
        torch.quint8,
        torch.qint8,
        torch.float16,
        torch.quint4x2,
        torch.uint8,
        torch.int8,
        torch.int16,
        # torch.qint32  <<< this is missing
    ]
Is there a specific reason why qint32 is not included as a target for weight quantization?
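In the meantime, a local workaround along these lines seems possible (a sketch only, not fully validated; note that if the FX convert pass binds weight_is_quantized by name at import time, the same patch would also have to be applied on the module that consumes it):

import torch
import torch.ao.quantization.utils as quant_utils

_orig_weight_is_quantized = quant_utils.weight_is_quantized

def _patched_weight_is_quantized(qconfig):
    # Accept qint32 weights in addition to the dtypes already on the list.
    return (
        _orig_weight_is_quantized(qconfig)
        or quant_utils.weight_dtype(qconfig) == torch.qint32
    )

quant_utils.weight_is_quantized = _patched_weight_is_quantized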