I have trained a U2-Net model.
I traced it into a TorchScript model to run it on mobile.
It all works: the results match those on PC.
But the model takes almost 200 MB of storage, so I decided to quantize it using static quantization.
I added QuantStub and DeQuantStub to the model and call them at the start of the forward() method (QuantStub) and at the end (DeQuantStub).
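For reference, a minimal sketch of how I placed the stubs (the module and layers below are hypothetical placeholders, not the actual U2NET_full code):

import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> quantized at model entry
        self.conv = nn.Conv2d(3, 64, 3, padding=1)       # placeholder layer
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # quantized -> float at model exit

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)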
Then trace() failed with the following error:
... return torch.add(src_x, x2)
RuntimeError: Could not run 'aten::add.Tensor' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::add.Tensor' is only available for these backends: [CPU, CUDA, MkldnnCPU, SparseCPU, SparseCUDA, Meta, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradNestedTensor, UNKNOWN_TENSOR_TYPE_ID, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
As already discussed here, I added a FloatFunctional object to the module class:
self.ff = FloatFunctional()
and replaced the addition
return torch.add(src_x, x2)
with
return self.ff.add(src_x, x2)
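A minimal sketch of the resulting pattern (the module below is hypothetical, not the actual U2NET_full code):

import torch.nn as nn
from torch.nn.quantized import FloatFunctional

class AddBlock(nn.Module):
    def __init__(self):
        super().__init__()
        # FloatFunctional carries its own observer, so convert() can
        # swap the add for a quantized add with the right scale/zero-point
        self.ff = FloatFunctional()

    def forward(self, src_x, x2):
        return self.ff.add(src_x, x2)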
After that, trace() passed with no errors and the resulting model shrank from nearly 200 MB to 45 MB.
I replaced the old model on mobile with this new one, and:
- its output is very bad;
- inference time increased from ~7 sec to ~18 sec.
Code for tracing:
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile
from lib import U2NET_full

model_select = 'checkpoints/checkpoint.pth'
checkpoint = torch.load(model_select)
model = U2NET_full()
model = model.to('cpu')
if 'model' in checkpoint:
    model.load_state_dict(checkpoint['model'])
else:
    model.load_state_dict(checkpoint)
model.eval()

input = torch.rand(1, 3, 448, 448)

backend = "qnnpack"
model.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend
model_static_quantized = torch.quantization.prepare(model, inplace=False)
model_static_quantized = torch.quantization.convert(model_static_quantized, inplace=False)

torchscript_model = torch.jit.trace(model_static_quantized, input)
optimized_torchscript_model = optimize_for_mobile(torchscript_model)
optimized_torchscript_model.save("optimized_torchscript_model.pt")
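One thing I notice: my understanding is that the usual eager-mode static quantization recipe runs representative data through the prepared model between prepare() and convert(), so the observers can record activation ranges, and I am not doing that above. A sketch of that flow (calibration_loader is a hypothetical iterable of sample inputs, not something in my code):

model.qconfig = torch.quantization.get_default_qconfig("qnnpack")
torch.backends.quantized.engine = "qnnpack"
prepared = torch.quantization.prepare(model, inplace=False)
with torch.no_grad():
    for images in calibration_loader:  # representative inputs for the observers
        prepared(images)
quantized = torch.quantization.convert(prepared, inplace=False)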
Can you suggest any ways to fix this?