Hello
I think I've run into a problem when selecting the quantization backend. From the docs I know PyTorch selects "x86" by default, but I've been studying the "fbgemm" code recently, so I changed the qconfig to "fbgemm". However, the code still uses oneDNN as the backend, and I don't understand why.
My environment: Mac M1
PyTorch version: 2.0.0
Here is my code:
import torch

# Select the fbgemm qconfig for the fused fp32 model
model_fp32_fused.qconfig = torch.ao.quantization.get_default_qconfig('fbgemm')
# Insert observers
model_fp32_prepared = torch.ao.quantization.prepare(model_fp32_fused, inplace=True)
# Convert the observed model to int8
model_int8 = torch.ao.quantization.convert(model_fp32_prepared)
model_int8.eval()
inputs = torch.randn(1, 1, 59, 13)
output = model_int8(inputs)
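For reference, here is a minimal check of the active engine. I'm assuming that torch.backends.quantized.engine, rather than the backend string passed to get_default_qconfig, is what actually selects the kernels that run, so I may be verifying the wrong thing:

import torch

# Quantized engines compiled into this build of PyTorch
print(torch.backends.quantized.supported_engines)

# Engine currently selected for quantized ops
print(torch.backends.quantized.engine)

# If fbgemm were available on this machine, I assume switching to it would be:
# torch.backends.quantized.engine = 'fbgemm'

Should I be setting torch.backends.quantized.engine explicitly like this instead of (or in addition to) the qconfig?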