Which Machine for QAT using QNNPACK backend?

dalseeroh · November 26, 2021, 5:31am

Hello folks,

I wrote a UNet-like model recently, and I am trying to deploy it to Android device(ARMv8.2). Therefore, the model needs to be quantized beforehand, and I am thinking of QAT. After spending some time for research, I found that I cannot run this command on my Windows machine, which has Intel CPU.

model_fp32.qconfig = torch.quantization.get_default_qconfig('qnnpack')

So my question is, where should I run this command? Jetson Nano, Coral Dev board, or whatever boards that have arm cpu can run this, correct? Or if there is any other approach to quantize my model to fit the purpose? Can the cross-compiling do the job for QAT with qnnpack?

marksaroufim · February 10, 2022, 10:29pm

You need to quantize the model on the device you want the model to eventually run. I’d start with Dynamic quantization, it’s easier to get working than QAT