How to set quantization-aware training scaling factors?

You'll need to implement your own fake quantize module (pytorch/fake_quantize.py at master · pytorch/pytorch · GitHub) to restrict the scaling factor to a power of two. We actually had an intern recently implement additive powers of two: pytorch/fake_quantize.py at master · pytorch/pytorch · GitHub.
The code for using it in the flow can be found in pytorch/apot_fx_graph_mode_qat.py at master · pytorch/pytorch · GitHub.
Paper: [1909.13144] Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks
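
If you only need plain power-of-two scales (rather than the full APoT scheme), one option is to subclass `FakeQuantize` and round whatever scale the observer produces to the nearest power of two. A minimal sketch, assuming eager-mode QAT; `PowerOfTwoFakeQuantize` and the qconfig below are illustrative names, not the APoT implementation linked above:

```python
import torch
from torch.ao.quantization import QConfig, default_weight_fake_quant
from torch.ao.quantization.fake_quantize import FakeQuantize
from torch.ao.quantization.observer import MovingAverageMinMaxObserver


class PowerOfTwoFakeQuantize(FakeQuantize):
    """Fake quantize module whose scale is snapped to the nearest power of two.

    The observer tracks min/max as usual; we only round the resulting scale to
    2**round(log2(scale)) so hardware can implement the rescale as a shift.
    The zero_point is left unchanged here for simplicity.
    """

    def calculate_qparams(self):
        scale, zero_point = super().calculate_qparams()
        scale = torch.pow(2.0, torch.round(torch.log2(scale)))
        return scale, zero_point


# usage sketch: a QAT qconfig that uses the power-of-two fake quant for
# activations and the stock fake quant for weights
pot_act_fake_quant = PowerOfTwoFakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=255,
    dtype=torch.quint8,
    qscheme=torch.per_tensor_affine,
)
pot_qconfig = QConfig(activation=pot_act_fake_quant, weight=default_weight_fake_quant)
```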

For converting the model to an 8-bit FPGA backend, I think you'll need to follow the reference flow, which is only available in FX graph mode quantization right now; please take a look at rfcs/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md at master · pytorch/rfcs · GitHub. You will get a model with q/dq/fp32 ops that represents a quantized model, and you can then lower that model to the FPGA (I guess you'd need to expose the ops implemented on the FPGA in PyTorch?). The lowering code for the native PyTorch backends (fbgemm/qnnpack) can be found in pytorch/_lower_to_native_backend.py at master · pytorch/pytorch · GitHub. A rough end-to-end sketch of the FX graph mode QAT + reference flow is below.
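
This sketch only assumes the public FX graph mode APIs; exact names depend on the PyTorch version (older releases use `convert_fx(..., is_reference=True)` instead of `convert_to_reference_fx`), and `SmallNet` is just a placeholder model:

```python
import torch
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_to_reference_fx


class SmallNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))


model = SmallNet().train()
example_inputs = (torch.randn(1, 3, 32, 32),)

# insert fake quantize modules for QAT; a custom QConfigMapping built from
# the power-of-two fake quant above could be swapped in here
qconfig_mapping = get_default_qat_qconfig_mapping("qnnpack")
prepared = prepare_qat_fx(model, qconfig_mapping, example_inputs)

# ... run the QAT training loop on `prepared` here ...

# produce the reference quantized model: quantized ops are expressed as
# quantize/dequantize + fp32 op patterns that a custom backend
# (e.g. an FPGA toolchain) can pattern-match and lower
prepared.eval()
reference_model = convert_to_reference_fx(prepared)
print(reference_model.graph)
```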
