I have trained a U2Net model.
I traced it into a TorchScript model to run on mobile.
It all works: the results are the same as on PC.
But the model takes almost 200 MB of storage, so I decided to quantize it using static quantization.
I added QuantStub and DeQuantStub objects to the model and call them at the start of the forward() method (quant) and at the end (dequant).
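Roughly like this (a minimal sketch of the stub placement on a toy module, not my actual U2Net code):

```python
import torch
import torch.nn as nn

class QuantWrapper(nn.Module):
    """Minimal illustration of QuantStub/DeQuantStub placement (not the real U2NET_full)."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> quantized at the input
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # quantized -> fp32 at the output

    def forward(self, x):
        x = self.quant(x)       # first thing in forward()
        x = self.relu(self.conv(x))
        return self.dequant(x)  # last thing in forward()
```

Before convert() the stubs act as pass-throughs, so the module still runs in float mode.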
Then I got the following error:
... return torch.add(src_x, x2)
RuntimeError: Could not run 'aten::add.Tensor' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit Internal Login for possible resolutions. 'aten::add.Tensor' is only available for these backends: [CPU, CUDA, MkldnnCPU, SparseCPU, SparseCUDA, Meta, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradNestedTensor, UNKNOWN_TENSOR_TYPE_ID, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
As already discussed here, I added a FloatFunctional object to the module class:
self.ff = FloatFunctional()
and replaced my addition instruction
return torch.add(src_x, x2)
with
return self.ff.add(src_x, x2)
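In context the block now looks roughly like this (a simplified sketch; src_x and x2 stand for the two feature maps being summed in my code):

```python
import torch
import torch.nn as nn
from torch.nn.quantized import FloatFunctional

class AddBlock(nn.Module):
    """Simplified stand-in for the U2Net block that performs the addition."""
    def __init__(self):
        super().__init__()
        # FloatFunctional carries its own observer, so after convert() this add
        # is dispatched to the quantized kernel instead of aten::add.Tensor.
        self.ff = FloatFunctional()

    def forward(self, src_x, x2):
        # was: return torch.add(src_x, x2)
        return self.ff.add(src_x, x2)
```

In float mode FloatFunctional.add simply calls torch.add, so behavior before quantization is unchanged.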
After that, trace() passed with no errors and the resulting model shrank from nearly 200 MB to 45 MB.
I replaced my old model on mobile with this new one, but:
- its output is very bad;
- inference time increased from ~7 sec to ~18 sec.
Code for tracing:
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile
from lib import U2NET_full
model_select = 'checkpoints/checkpoint.pth'
checkpoint = torch.load(model_select)
model = U2NET_full()
model = model.to('cpu')
if 'model' in checkpoint:
    model.load_state_dict(checkpoint['model'])
else:
    model.load_state_dict(checkpoint)
model.eval()
input = torch.rand(1, 3, 448, 448)
backend = "qnnpack"
model.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend
model_static_quantized = torch.quantization.prepare(model, inplace=False)
model_static_quantized = torch.quantization.convert(model_static_quantized, inplace=False)
torchscript_model = torch.jit.trace(model_static_quantized, input)
optimized_torchscript_model = optimize_for_mobile(torchscript_model)
optimized_torchscript_model.save("optimized_torchscript_model.pt")
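For reference, my understanding from the docs is that static quantization expects a calibration pass between prepare() and convert(), which my script above does not do. Here is a minimal self-contained sketch of that flow on a toy module (TinyNet and calibration_inputs are made up for illustration, not from my code):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Tiny stand-in for U2NET_full, just to show the quantization flow."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 4, 3, padding=1)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

model = TinyNet().eval()
# Prefer qnnpack (mobile); fall back to fbgemm on builds that lack it.
backend = "qnnpack" if "qnnpack" in torch.backends.quantized.supported_engines else "fbgemm"
model.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend

prepared = torch.quantization.prepare(model, inplace=False)

# Calibration: feed representative inputs through the prepared model so the
# observers inserted by prepare() can record activation ranges.
calibration_inputs = [torch.rand(1, 3, 16, 16) for _ in range(8)]  # placeholder data
with torch.no_grad():
    for x in calibration_inputs:
        prepared(x)

quantized = torch.quantization.convert(prepared, inplace=False)
out = quantized(torch.rand(1, 3, 16, 16))  # runs through the quantized conv kernel
```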
Could you suggest any ways to fix this?