What is the best way to handle different training and inference backends? In this blog it says:
> static quantization must be performed on a machine with the same architecture as your deployment target. If you are using FBGEMM, you must perform the calibration pass on an x86 CPU; if you are using QNNPACK, calibration needs to happen on an ARM CPU.
But I can’t find anything about this in the official tutorial. How accurate is this statement? Is it true for both options (post-training calibration and quantization-aware training), or only for the calibration-based one?
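For context, here is a minimal sketch of the calibration flow I'm asking about, using the eager-mode post-training static quantization API. The model, input shapes, and the `"fbgemm"` choice are just illustrative (on older PyTorch versions the same functions live under `torch.quantization` instead of `torch.ao.quantization`):

```python
import torch
import torch.nn as nn

# Toy model with explicit quant/dequant stubs, as in the eager-mode tutorial.
class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = M().eval()

# This is the backend choice the blog is talking about:
# "fbgemm" targets x86 CPUs, "qnnpack" targets ARM CPUs.
backend = "fbgemm"
model.qconfig = torch.ao.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend

# Attach observers, then run the calibration pass with representative data
# so the observers can record activation ranges.
prepared = torch.ao.quantization.prepare(model)
with torch.no_grad():
    for _ in range(10):
        prepared(torch.randn(1, 3, 32, 32))

# Convert to a quantized model for inference.
quantized = torch.ao.quantization.convert(prepared)
```

My question is whether the calibration loop above (and the analogous training loop in QAT) actually has to run on a CPU of the same architecture as the deployment target, or whether only the final `convert`/inference step is backend-sensitive.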