Custom operation with quantization
I want to run an MLP model on my accelerator.
My plan is to add a custom JIT operation that behaves like the nn.Linear layer but runs faster by using my accelerator.
However, the accelerator only supports int8 operations, so I need some quantization as well.
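For concreteness, here is a minimal sketch of the kind of model I mean, with dynamic int8 quantization of the Linear layers (layer sizes are just placeholders) -- this is the int8 path I'd like to route to my custom operation:

```python
import torch
import torch.nn as nn

# Placeholder MLP standing in for the model I want to accelerate.
class MLP(nn.Module):
    def __init__(self, in_dim=16, hidden=32, out_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

model = MLP().eval()

# Dynamic int8 quantization replaces each nn.Linear with a
# quantized equivalent; I'd like my accelerator op to take
# the place of these quantized Linear kernels.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(4, 16)
out = qmodel(x)
print(out.shape)  # torch.Size([4, 8])
```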

How can I add a custom operation that works with torch.quantization?

Thanks in advance!