I recently wrote a UNet-like model, and I am trying to deploy it to an Android device (ARMv8.2). The model therefore needs to be quantized first, and I am considering QAT. After spending some time on research, I found that I cannot run this command on my Windows machine, which has an Intel CPU:
model_fp32.qconfig = torch.quantization.get_default_qat_qconfig('qnnpack')
So my question is: where should I run this command? Am I correct that any board with an ARM CPU, such as a Jetson Nano or a Coral Dev Board, can run it? Or is there another approach to quantizing my model for this purpose? Can cross-compiling do the job for QAT with qnnpack?
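For reference, here is a minimal sketch of the QAT setup I am attempting. The tiny Sequential model is just a placeholder for my actual UNet, and I am assuming the eager-mode quantization API; as I understand it, the prepare step only inserts fake-quantization modules (pure fp32 math), so in principle it should run on any machine:

```python
import torch
import torch.nn as nn

# Placeholder for the real UNet: a tiny conv model with quant boundaries.
model_fp32 = nn.Sequential(
    torch.quantization.QuantStub(),    # marks the fp32 -> int8 boundary
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    torch.quantization.DeQuantStub(),  # marks the int8 -> fp32 boundary
)
model_fp32.train()  # prepare_qat requires training mode

# QAT qconfig targeting the qnnpack (ARM) backend.
model_fp32.qconfig = torch.quantization.get_default_qat_qconfig('qnnpack')

# Insert fake-quantization observers; this is still fp32 simulation,
# so it does not need the qnnpack engine to be available locally.
model_prepared = torch.quantization.prepare_qat(model_fp32)

# ... fine-tune model_prepared on training data here ...
# Afterwards: model_prepared.eval()
#             model_int8 = torch.quantization.convert(model_prepared)
```

My uncertainty is whether the later convert step, and the resulting int8 model, can be produced on an x86 box or whether that part genuinely needs ARM hardware.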