Traceback (most recent call last):
File "C:\PycharmProjects\MyCNN\mymodel.py", line 219, in <module>
torch.backends.quantized.engine = 'qnnpack'
File "C:\Anaconda3\envs\deeplearning\lib\site-packages\torch\backends\quantized\__init__.py", line 29, in __set__
torch._C._set_qengine(_get_qengine_id(val))
RuntimeError: quantized engine QNNPACK is not supported
If I set backend='fbgemm', it works. But fbgemm is the backend for x86 servers, which will not be compatible with an Android environment, correct?
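One way to avoid the RuntimeError above is to check which engines the current PyTorch build actually supports before assigning one. This is a minimal sketch; the helper `pick_qengine` is illustrative, not part of the PyTorch API:

```python
import torch

def pick_qengine(supported, preferred="qnnpack", fallback="fbgemm"):
    """Return the preferred engine if this build supports it, else the fallback."""
    return preferred if preferred in supported else fallback

# supported_engines lists the backends compiled into this build of PyTorch
engine = pick_qengine(torch.backends.quantized.supported_engines)
torch.backends.quantized.engine = engine
```

On a Windows x86 build this will typically fall back to 'fbgemm'; on a build with QNNPACK compiled in it selects 'qnnpack'.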
This probably means that the machine you are running quantization on does not support QNNPACK. Could you share what machine and environment you are using, and which PyTorch version?
Are you building PyTorch for Windows on your machine, or are you cross-compiling for Android? I see that you want to deploy on Android, and QNNPACK is definitely supported there. But for the OSS build on Windows I am not sure. I will check and get back to you.
“Since these libraries are architecture-dependent, static quantization must be performed on a machine with the same architecture as your deployment target. If you are using FBGEMM, you must perform the calibration pass on an x86 CPU (usually not a problem); if you are using QNNPACK, calibration needs to happen on an ARM CPU (this is quite a bit harder).”
That quote does not seem to be accurate. In general, you can use QNNPACK on x86; this is widely used at Meta, where models are calibrated on Linux machines with QNNPACK for inference on ARM. The issues you are hitting are probably specific to your environment.
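To illustrate that reply, here is a minimal eager-mode static-quantization sketch that calibrates with the QNNPACK qconfig on whatever CPU you have, guarding the engine switch so it also runs on builds without QNNPACK. `TinyNet` and the toy shapes are purely illustrative:

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq

class TinyNet(nn.Module):
    """Minimal float model wrapped with quant/dequant stubs for static quantization."""
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.fc = nn.Linear(4, 2)
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet().eval()
# QNNPACK-style qconfig for the eventual ARM deployment target.
model.qconfig = tq.get_default_qconfig("qnnpack")
# Switch the engine only if this build actually supports QNNPACK;
# otherwise calibration proceeds under the default x86 engine.
if "qnnpack" in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = "qnnpack"

prepared = tq.prepare(model)        # insert observers
with torch.no_grad():
    prepared(torch.randn(8, 4))     # calibration pass with representative data
quantized = tq.convert(prepared)    # produce the quantized model
```

The resulting `quantized` model can then be scripted and deployed; whether it runs on the Android side depends on the mobile build having QNNPACK, not on the machine where calibration happened.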