Hello all, hope you are having a great day.
I quantized a model using graph-mode post-training static quantization, and everything seemed to go smoothly.
However, upon loading the newly quantized model and trying to do a forward pass, I get this error:
Evaluating data/angles.txt...
0%| | 0/6000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/mnt/internet/shishosama/embeder_moder_training/graph_quantizer_static.py", line 119, in <module>
lfw_test(jit_model)
File "/mnt/internet/shishosama/embeder_moder_training/lfw_eval.py", line 350, in lfw_test
evaluate(model)
File "/mnt/internet/shishosama/embeder_moder_training/lfw_eval.py", line 111, in evaluate
output = model(imgs)
File "/root/anaconda3/envs/shishosama/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/__torch__/models_new/___torch_mangle_1853.py", line 23, in forward
input_2_quant = torch.quantize_per_tensor(input, 0.037445519119501114, 57, 13)
_0 = getattr(self, "quantized._jit_pass_packed_weight_0")
_1 = ops.quantized.conv2d_relu(input_2_quant, _0, 0.0094706285744905472, 0)
~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_6_dequant = torch.dequantize(_1)
input0 = torch.feature_dropout(_6_dequant, 0., False)
Traceback of TorchScript, original code (most recent call last):
graph(%a_quant, %packed_params, %r_scale, %r_zero_point, %r_dtype, %stride, %padding, %dilation, %groups):
%r_quant = quantized::conv2d_relu(%a_quant, %packed_params, %r_scale, %r_zero_point)
~~~~~~~~~ <--- HERE
return (%r_quant)
RuntimeError: Could not run 'quantized::conv2d_relu.new' with arguments from the 'QuantizedCUDA' backend. 'quantized::conv2d_relu.new' is only available for these backends: [QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, Tracer, Autocast, Batched, VmapMode].
QuantizedCPU: registered at /pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:858 [kernel]
BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
AutogradOther: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback]
AutogradCPU: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback]
AutogradCUDA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback]
AutogradXLA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback]
Tracer: fallthrough registered at /pytorch/torch/csrc/jit/frontend/tracer.cpp:967 [backend fallback]
Autocast: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:254 [backend fallback]
Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:511 [backend fallback]
VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
What am I missing here? Previously, my dynamically quantized models didn't have this issue (they were also quantized using graph mode),
so I'm not sure what's happening. I also get this exact error when I try to do a forward pass with the model quantized in eager mode (here is its own thread).
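From the dispatch error, my reading is that the input tensor ends up quantized on the CUDA device, while quantized::conv2d_relu only has a QuantizedCPU kernel registered. Here is a minimal sketch of what I mean (the scale and zero point are copied from the traceback; the input shape is illustrative, not my exact code):

```python
import torch

# Quantized kernels such as quantized::conv2d_relu are registered for the
# QuantizedCPU backend only. Quantizing a CPU tensor produces a QuantizedCPU
# tensor that those kernels can consume; the same call on a CUDA tensor would
# produce a QuantizedCUDA tensor and trigger the dispatch error above.
x = torch.randn(1, 3, 112, 112)  # CPU tensor; shape is illustrative
xq = torch.quantize_per_tensor(
    x, scale=0.037445519119501114, zero_point=57, dtype=torch.quint8
)
print(xq.is_quantized)       # True
print(xq.device)             # cpu -- dispatches to QuantizedCPU kernels
```

So presumably the scripted model and its inputs both need to stay on the CPU before the forward pass, unless there is something else going on that I'm not seeing.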
In case the quantized model is of some use, here it is: https://gofile.io/d/zyDEaY
Any help is greatly appreciated.