Custom operation with quantization
I want to run an MLP model on my accelerator.
My plan is to add a custom JIT operation that behaves like the nn.Linear layer but runs faster by using my accelerator.
However, the accelerator only supports int8 operations, so I need some quantization as well.
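For concreteness, here is a minimal sketch of the kind of model I mean, with dynamic int8 quantization of the Linear layers (layer sizes are just placeholders) -- this is the int8 path I'd like to route to my custom operation:

```python
import torch
import torch.nn as nn

# Placeholder MLP standing in for the model I want to accelerate.
class MLP(nn.Module):
    def __init__(self, in_dim=16, hidden=32, out_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

model = MLP().eval()

# Dynamic int8 quantization replaces each nn.Linear with a
# quantized equivalent; I'd like my accelerator op to take
# the place of these quantized Linear kernels.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(4, 16)
out = qmodel(x)
print(out.shape)  # torch.Size([4, 8])
```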

How can I add a custom operation that works with torch.quantization?

Thanks in advance!