PyTorch quantization converts the INT32 MAC (multiply-accumulate) value into 8-bit in the backend. How can we access this compute-layer INT32 value prior to conversion?
Unfortunately, the INT32 value for fbgemm/qnnpack is not accessible from outside. Why do you need it?
I need to impose some limits on the accumulated values. How would that be feasible in the quantized model format?
Can you describe the whole flow? Are you trying to impose the limit at the kernel level, or when people train the model?
If you need a kernel that imposes limits on the INT32 value, then I think the best thing to do is to reimplement the kernel (possibly by calling the fbgemm implementations if you need high performance: GitHub - pytorch/FBGEMM: FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/).
Actually, you might be able to modify the operator implementation a little and implement your own version of ops like quantized::conv: pytorch/qconv.cpp at master · pytorch/pytorch · GitHub
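Before going as far as a custom kernel, one way to prototype the effect of an accumulator limit is to emulate the INT32 accumulation outside the quantized backend. The sketch below is not the fbgemm code path; it is a minimal NumPy emulation of how a quantized linear op accumulates zero-point-corrected INT8 products into INT32, with a hypothetical clamp applied before requantization. The function name, the `acc_min`/`acc_max` parameters, and the choice of a plain clamp are all my assumptions for illustration.

```python
import numpy as np

def quantized_linear_with_clamp(x_q, w_q, x_zp, w_zp, acc_min, acc_max):
    """Emulate the INT32 accumulator of a quantized linear op.

    x_q: (batch, in_features) uint8 activations
    w_q: (out_features, in_features) uint8 weights
    x_zp, w_zp: zero points of activations and weights
    acc_min, acc_max: hypothetical limits imposed on the accumulator
    """
    # Zero-point-corrected INT32 accumulation, as a quantized backend
    # would compute it internally before requantizing to 8-bit.
    acc = (x_q.astype(np.int32) - x_zp) @ (w_q.astype(np.int32) - w_zp).T
    # The step that is normally inaccessible: limit the accumulator
    # before any requantization takes place.
    return np.clip(acc, acc_min, acc_max)

# Example: one input row, one output channel.
x_q = np.array([[200, 100]], dtype=np.uint8)   # corrected values: 72, -28
w_q = np.array([[130, 120]], dtype=np.uint8)   # corrected values: 2, -8
out = quantized_linear_with_clamp(x_q, w_q, 128, 128, -100, 100)
# unclamped accumulator: 72*2 + (-28)*(-8) = 368, clamped to 100
print(out)
```

If the emulation confirms the limit does what you want, the same clamp would then be ported into a custom operator (e.g. a modified qconv.cpp), applied to the INT32 buffer just before the requantization step.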