Custom operation with quantization
I want to run an MLP model on my accelerator.
My plan is to add a custom JIT operation that behaves like the nn.Linear layer but runs faster by using my accelerator.
However, the accelerator only supports int8 operations, so I need some quantization as well.
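For concreteness, here is a minimal sketch of the kind of model I mean, with dynamic int8 quantization of the Linear layers (layer sizes are just placeholders) -- this is the int8 path I'd like to route to my custom operation:

```python
import torch
import torch.nn as nn

# Placeholder MLP standing in for the model I want to accelerate.
class MLP(nn.Module):
    def __init__(self, in_dim=16, hidden=32, out_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

model = MLP().eval()

# Dynamic int8 quantization replaces each nn.Linear with a
# quantized equivalent; I'd like my accelerator op to take
# the place of these quantized Linear kernels.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(4, 16)
out = qmodel(x)
print(out.shape)  # torch.Size([4, 8])
```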

How can I add a custom operation that works with torch.quantization?

Thanks in advance!