Hi, I am using x86_inductor_quantizer following the given tutorial. I want to implement this on an FPGA. However, I am having trouble finding the x86 kernel C++/Python implementation of these quantized qconv ops, which are supposed to be defined in torch.ops.qconv; going through /ATen/native/quantized/library.cpp wasn't much help.
Here is my understanding of the whole quantized conv execution for x86_inductor:
Data types:
weights: qint8 [-128 to 127]
input/output: quint8 [0 to 255]
Bias: float32, converted to bias_q (int32 vector)
Multiplication: int32
Accumulation: int32
Bias addition: int32
input scale: float32
input zero point: int32
per-channel scales: float32 vector
output scale: float32
output zero point: int32
I used symmetric quantization, so all zero points are actually zero.
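To make the above concrete, here is a minimal NumPy sketch of the arithmetic as I understand it. The function name, the naive loop structure, and the use of np.round are my own choices for illustration only; the real x86/oneDNN kernel is vectorized and may round differently.

```python
import numpy as np

def qconv2d_reference(x_q, x_scale, x_zp,      # quint8 input, float32 scale, int32 zero point
                      w_q, w_scales,           # qint8 weights, per-channel float32 scales
                      bias_q,                  # int32 bias vector (see bias_q note below)
                      out_scale, out_zp,       # float32 scale, int32 zero point of the output
                      stride=1, padding=0):
    """Naive reference for my understanding of the quantized conv arithmetic."""
    N, Cin, H, W = x_q.shape
    Cout, _, KH, KW = w_q.shape
    Ho = (H + 2 * padding - KH) // stride + 1
    Wo = (W + 2 * padding - KW) // stride + 1

    # Pad with the input zero point (0 here, since quantization is symmetric).
    x_pad = np.full((N, Cin, H + 2 * padding, W + 2 * padding), x_zp, dtype=np.int32)
    x_pad[:, :, padding:padding + H, padding:padding + W] = x_q.astype(np.int32)

    out = np.empty((N, Cout, Ho, Wo), dtype=np.uint8)
    for n in range(N):
        for co in range(Cout):
            for ho in range(Ho):
                for wo in range(Wo):
                    h0, w0 = ho * stride, wo * stride
                    patch = x_pad[n, :, h0:h0 + KH, w0:w0 + KW] - x_zp
                    # int32 multiply-accumulate, then int32 bias addition.
                    acc = int((patch * w_q[co].astype(np.int32)).sum()) + int(bias_q[co])
                    # Requantize with one float32 multiplier per output channel.
                    m = x_scale * w_scales[co] / out_scale
                    y = int(np.round(acc * m)) + out_zp
                    out[n, co, ho, wo] = np.clip(y, 0, 255)  # clamp to the quint8 range
    return out
```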
Here is a Python version in this gist. It includes my understanding of bias_q from this thread.
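Concretely, the bias quantization I took from that thread looks like this (my reading of it, not verified against the kernel; bias_fp32, x_scale, and w_scales are as in the sketch above):

```python
# Scale the float32 bias into the int32 accumulator domain, using
# input_scale * per_channel_weight_scale as the bias scale (zero point 0).
bias_q = np.round(bias_fp32 / (x_scale * w_scales)).astype(np.int32)
```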
Can someone please help me sort this out, either by helping me verify the gist (i.e., ideas on how I can make sure it mimics the x86 kernel implementation so I can write FPGA code for it), or by pointing me to the kernel implementation itself?
I want to use x86 rather than XNNPACK because it produces [0 to 255] outputs, which makes my life easier since my activation function is ReLU.
Hi @jerryzh168, I made some progress. I am getting almost the same result as one would get by executing the graph module.
However, with bias_quantized = True, the manual quantized execution outputs differ from the graph module outputs per output pixel by: sum: 0.1534857451915741, mean: 0.00213174638338387, max: 0.006238460540771484, min: 3.343820571899414e-05.
Can you please have a look at this gist and see if there is a mistake, or if there is something in the GEMM that might be causing this difference? It works with PyTorch 2.4.0+cpu.
I set the quantized bias to True, as this is what you mentioned earlier in another thread.
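For reference, the difference numbers above are the sum/mean/max/min of the per-output-pixel absolute differences, computed roughly like this (manual_out and graph_out are placeholder names for the two dequantized float32 outputs):

```python
import torch

# Elementwise absolute difference between the two float32 output tensors.
diff = (manual_out - graph_out).abs()
print("sum:", diff.sum().item(), "mean:", diff.mean().item(),
      "max:", diff.max().item(), "min:", diff.min().item())
```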