I’m trying to quantise pre-trained models from torchvision, and I’ve hit an obstacle I’m struggling to get past. I’ve been able to fuse layers and replace ReLUs as needed; I’ve then used QuantWrapper to add the quant and dequant stubs around the forward function, and from there I can prepare and convert using the quantisation tools.
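For context, my workflow is roughly the following minimal sketch, with a toy module standing in for a torchvision network (the model and shapes here are made up for illustration; this assumes the fbgemm backend is available):

```python
import torch
import torch.nn as nn

# Toy float model standing in for a torchvision network (hypothetical example)
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

model = TinyNet().eval()
# Fuse conv+relu, then wrap so QuantStub/DeQuantStub surround forward()
fused = torch.quantization.fuse_modules(model, [["conv", "relu"]])
wrapped = torch.quantization.QuantWrapper(fused)
wrapped.qconfig = torch.quantization.get_default_qconfig("fbgemm")
# Insert observers, run a calibration pass, then convert to quantised modules
prepared = torch.quantization.prepare(wrapped)
prepared(torch.randn(1, 3, 32, 32))  # calibration with dummy data
quantized = torch.quantization.convert(prepared)
out = quantized(torch.randn(1, 3, 32, 32))  # float in, float out via the stubs
```

This works fine for models whose forward pass is built only from modules.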
The problem I’m at now is that I get an error every time I try to run a model that uses an operation that needs to be quantised. For example, InceptionV3 uses some such operations during the forward pass, and the error looks like:
```
RuntimeError: Didn't find kernel to dispatch for operator 'aten::mul'. Tried to look up kernel for dispatch key 'QuantizedCPUTensorID'.
```
I have been able to run these networks by editing the model to replace these operations with nn.quantized.FloatFunctional() equivalents. Is there any way I can get past this without editing each network individually?
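For reference, the manual edit I’m describing looks roughly like this (ScaledBlock is a hypothetical module I made up for illustration, not code from torchvision):

```python
import torch
import torch.nn as nn

class ScaledBlock(nn.Module):
    def __init__(self):
        super().__init__()
        # FloatFunctional carries an observer, so this mul can be quantised;
        # a bare '*' dispatches to aten::mul, which has no quantised kernel here
        self.mul = nn.quantized.FloatFunctional()

    def forward(self, x, y):
        # instead of: return x * y
        return self.mul.mul(x, y)

block = ScaledBlock()
out = block(torch.randn(4), torch.randn(4))
```

In float mode this behaves exactly like `x * y`; after prepare/convert, the FloatFunctional is swapped for a quantised implementation.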
Keeping an eye on the release notes (https://github.com/pytorch/pytorch/releases) is one way to stay up to date on this. Another, more involved, way is to watch quantization-related diffs.
It’s hard to say what’s wrong here without looking at the code.