I'm trying to quantize my model following the Backend/Hardware Support documentation, and I get this error: "RuntimeError: x86 is not a valid value for quantized engine".
I followed these instructions:
# set the qconfig for PTQ
# Note: the old 'fbgemm' is still available but 'x86' is the recommended default on x86 CPUs
qconfig = torch.ao.quantization.get_default_qconfig('x86')
# or, set the qconfig for QAT
qconfig = torch.ao.quantization.get_default_qat_qconfig('x86')
# set the qengine to control weight packing
torch.backends.quantized.engine = 'x86'
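For reference, the set of engines a given PyTorch build accepts can be inspected before assigning one; on older builds 'x86' is absent and only 'fbgemm' (plus 'none') is listed for x86 CPUs, so assigning 'x86' raises exactly this RuntimeError. A small check, assuming a CPU build of PyTorch on an x86 machine:

```python
import torch

# List the quantized engines this build supports; assigning any string
# outside this list to torch.backends.quantized.engine raises
# "RuntimeError: ... is not a valid value for quantized engine".
print(torch.backends.quantized.supported_engines)

# Fall back to 'fbgemm' when 'x86' is missing (e.g. on older builds).
backend = 'x86' if 'x86' in torch.backends.quantized.supported_engines else 'fbgemm'
torch.backends.quantized.engine = backend
```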
Using this model:
import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub

class QuantizedSiameseNetwork(nn.Module):
    """
    Siamese network for image similarity estimation.
    The network is composed of two identical networks, one for each input.
    """
    def __init__(self, embedding_size=2, use_quant=False):
        super().__init__()
        self.backbone = Backbone(embedding_size)  # Backbone is defined elsewhere in my code
        self.use_quant = use_quant
        if self.use_quant:
            self.quant_m = QuantStub()
            self.dequant_m = DeQuantStub()
And this is how I tried to apply it:
model_fp32 = QuantizedSiameseNetwork(embedding_size=2, use_quant=True).to('cpu')
backend = "x86"
model_fp32.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend  # this is the line that raises the RuntimeError
model_static_quantized = torch.quantization.prepare(model_fp32, inplace=False)
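For completeness, here is a minimal end-to-end static PTQ sketch of what I'm aiming for, using a tiny stand-in module instead of my Backbone (which I've omitted), and falling back to 'fbgemm' when 'x86' isn't available on the installed build:

```python
import torch
import torch.nn as nn

# Tiny stand-in for the real network (my Backbone is assumed elsewhere).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.fc = nn.Linear(4, 2)
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)   # observe/quantize the input
        x = self.fc(x)
        return self.dequant(x)  # back to float at the output

# Use 'x86' only if this build supports it, else 'fbgemm'.
backend = 'x86' if 'x86' in torch.backends.quantized.supported_engines else 'fbgemm'
torch.backends.quantized.engine = backend

model = TinyNet().eval()
model.qconfig = torch.ao.quantization.get_default_qconfig(backend)
prepared = torch.ao.quantization.prepare(model, inplace=False)

# Calibrate with representative data so the observers record activation ranges.
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(1, 4))

quantized = torch.ao.quantization.convert(prepared, inplace=False)
print(type(quantized.fc))  # the Linear should now be a quantized module
```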