After being shown how to deploy models on mobile, I decided to try a quantized TorchScript model.
I have an EfficientNet backbone that was quantized with the QAT tooling and the qnnpack qconfig. After quantization I converted it with torch.jit.script and saved it for later deployment. I also have a scripted version of the same model without quantization.
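For reference, this is roughly the export pipeline I used (a minimal sketch; `build_efficientnet_backbone` and the fine-tuning loop are placeholders for my actual code):

```python
import torch
import torch.quantization

# Placeholder for however the backbone is actually constructed
model = build_efficientnet_backbone()
model.train()

# QAT with the qnnpack backend config (ARM mobile targets)
model.qconfig = torch.quantization.get_default_qat_qconfig('qnnpack')
torch.quantization.prepare_qat(model, inplace=True)

# ... QAT fine-tuning loop runs here with fake-quant observers enabled ...

model.eval()
quantized = torch.quantization.convert(model, inplace=False)

# Script and save for mobile deployment
torch.jit.script(quantized).save('ts_models/quant_backbone.pt')

# The non-quantized variant is exported the same way
torch.jit.script(build_efficientnet_backbone().eval()).save('ts_models/orig_backbone.pt')
```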
I tried both models on mobile with speed_benchmark_torch. The non-quantized one runs fine, but the quantized one fails with the following error:
```
HWHMA:/data/local/Detectron2Mobile # ./speed_benchmark_torch --model=./ts_models/orig_backbone.pt --input_dims="1,3,512,768" --input_type=float --warmup=10 --iter=10
Starting benchmark.
Running warmup runs.
Main runs.
Main run finished. Milliseconds per iter: 2146.44. Iters per second: 0.465887

HWHMA:/data/local/Detectron2Mobile # ./speed_benchmark_torch --model=./ts_models/quant_backbone.pt --input_dims="1,3,512,768" --input_type=float --warmup=10 --iter=10
terminating with uncaught exception of type torch::jit::ErrorReport:
Unknown builtin op: quantized::batch_norm.
Here are some suggestions:
	quantized::batch_norm2d
	quantized::batch_norm3d

The original call is:
...<calls>...
Serialized File "code/__torch__/torch/nn/quantized/modules/batchnorm.py", line 14
    _1 = self.running_mean
    _2 = self.bias
    input = ops.quantized.batch_norm(argument_1, self.weight, _2, _1, _0, 1.0000000000000001e-05, 0.44537684321403503, 129)
            ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    return input
Aborted
```
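One way to confirm which quantized batch-norm variants a given PyTorch build registers is to look the ops up in the Python environment used for export (a sketch; op lookup raises RuntimeError when the op is absent):

```python
import torch

for name in ('batch_norm', 'batch_norm2d', 'batch_norm3d'):
    try:
        getattr(torch.ops.quantized, name)  # raises if the op is not registered
        print(f'quantized::{name} is registered')
    except RuntimeError:
        print(f'quantized::{name} is NOT registered')
```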
So the mobile runtime has no quantized::batch_norm op, only quantized::batch_norm2d and quantized::batch_norm3d. In the working (non-quantized) case, the scripted batch-norm code looks like this:
```python
def forward(self, argument_1: Tensor) -> Tensor:
    _0 = self.running_var
    _1 = self.running_mean
    _2 = self.bias
    input = torch.batch_norm(argument_1, self.weight, _2, _1, _0, False, 0.10000000000000001, 1.0000000000000001e-05, True)
    return input
```
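To compare, the scripted code of the quantized model can be dumped the same way, by loading it back and printing each submodule's code (a sketch, assuming the file path from the benchmark run):

```python
import torch

quant = torch.jit.load('ts_models/quant_backbone.pt')

# Walk the submodules and print the TorchScript of each scripted forward
# to locate the batch-norm call
for name, submodule in quant.named_modules():
    try:
        code = submodule.code  # raises if the submodule has no scripted forward
    except RuntimeError:
        continue
    print(f'=== {name} ===')
    print(code)
```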
What is happening in the failing case? Is the wrong op being substituted when the quantized model is scripted, or is there an issue with the set of ops available in the mobile build?