Hi! I’m starting to study the implementation of quantized models on FPGAs. I’m still learning, so I’d like to know whether I can use PyTorch quantization for this.
That is, I’d like to train a simple CNN in PyTorch, quantize it to integers, and save the quantized weights and biases to a file, so I can later load them into the same CNN implemented manually on the FPGA.
I believe I need to apply Post-Training Static Quantization to the trained model, as shown on this page, but I’m not quite sure what to do with the weights and biases after the PyTorch quantization step. When I inspect the layer weights after quantization, they still appear as floats, but now they also carry scale and zero_point values. How can I use this information to obtain the weights and biases as actual integer values for a future manual implementation of the model on the FPGA?
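For context, here is a minimal, self-contained sketch of the flow I’m following. The tiny model, layer names, and random calibration data are just placeholders for my actual setup, not my real code:

```python
import torch
import torch.nn as nn

# Stand-in for my actual CNN; QuantStub/DeQuantStub mark where tensors
# enter and leave the quantized region, as in the PTSQ tutorial.
class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.conv = nn.Conv2d(1, 4, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = TinyCNN().eval()
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
prepared = torch.ao.quantization.prepare(model)
prepared(torch.randn(8, 1, 8, 8))        # calibration pass (random data here)
quantized = torch.ao.quantization.convert(prepared)

w = quantized.conv.weight()   # quantized tensor; printing it shows
                              # float-looking values plus scale/zero_point
print(w.dtype)                # torch.qint8
print(w.int_repr())           # the raw int8 values stored underneath
# The fbgemm default quantizes conv weights per output channel:
print(w.q_per_channel_scales())
print(w.q_per_channel_zero_points())
```

I found that `int_repr()` prints integers, but I’m not sure whether exporting those plus the scale/zero_point values is the right approach for the FPGA, or how the (still-float) biases fit in.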
Also, I would really appreciate any tips or suggestions for this kind of hardware implementation.
Thank you very much.