I just compiled the real-time speech enhancement model from https://github.com/facebookresearch/denoiser
with TorchScript and wanted to test whether it can run in real time on Android devices.
Following the official tutorial, I used this code for compilation:
from denoiser.pretrained import dns48
from denoiser.demucs import DemucsStreamer
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile
model = dns48()
model.eval()
streamer = DemucsStreamer(model)
streamer.eval()
streamer.qconfig = torch.quantization.get_default_qconfig("qnnpack")
torch.quantization.prepare(streamer, inplace=True)
streamer = torch.quantization.convert(streamer, inplace=True)
torchscript_model = torch.jit.script(streamer)
optimized_model = optimize_for_mobile(torchscript_model)
optimized_model._save_for_lite_interpreter("denoiser_dns48_quantized.ptl")
Of course, I had to slightly modify the denoiser code (adding some type hints, etc.) to make it compile. Without quantization it runs, but it is far too slow. After adding the quantization step, I got this error:
com.facebook.jni.CppException: Could not run 'quantized::conv1d' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'quantized::conv1d' is only available for these backends: [QuantizedCPU, BackendSelect, Functionalize, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy].
I am not sure if my backend falls into QuantizedCPU, or if it is really not supported. Is there anything I can do?
So what this is saying is that the quantized conv1d is somehow getting an FP32 tensor as input instead of a quantized one. You might want to take a look at the quantized TorchScript graph (do m = torch.jit.load(...), print(m.graph)) and see what the input to conv1d is.
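To illustrate what that inspection looks like, here is a minimal sketch on a toy scripted module (a hypothetical Conv1d stand-in, not the actual denoiser; note that for the real model you would likely load a regular TorchScript .pt file, since torch.jit.load may not read a lite-interpreter .ptl directly):

```python
import torch
import torch.nn as nn

# toy stand-in for the streamer, just to show the inspection workflow
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 4, kernel_size=3)

    def forward(self, x):
        return self.conv(x)

scripted = torch.jit.script(Toy().eval())

# .graph only shows prim::CallMethod calls into submodules; .inlined_graph
# expands those calls so the underlying aten/quantized ops become visible
ir = str(scripted.inlined_graph)
print(ir)

# for a quantized model, look at what feeds conv1d: a healthy graph has
# quantize_per_tensor producing its input, a broken one passes a plain
# float tensor straight in
print("conv1d" in ir)  # True
```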
To be honest, I have no clue what’s going on here, the model is pretty complex.
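For what it's worth, one common cause of exactly this error is converting a model whose forward never quantizes its input: eager-mode static quantization requires QuantStub/DeQuantStub at the float/quantized boundary, plus a calibration pass between prepare and convert. A minimal self-contained sketch on a toy Conv1d module (not the actual DemucsStreamer, which would need the stubs threaded through its streaming logic):

```python
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> quint8
        self.conv = nn.Conv1d(1, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # quint8 -> float

    def forward(self, x):
        x = self.quant(x)   # without this, quantized::conv1d sees FP32 input
        x = self.relu(self.conv(x))
        return self.dequant(x)

# fall back to fbgemm when this build of PyTorch lacks the qnnpack engine
engine = "qnnpack" if "qnnpack" in torch.backends.quantized.supported_engines else "fbgemm"
torch.backends.quantized.engine = engine

m = ToyDenoiser().eval()
m.qconfig = torch.quantization.get_default_qconfig(engine)
torch.quantization.prepare(m, inplace=True)
with torch.no_grad():
    for _ in range(8):                 # calibration pass collects activation ranges
        m(torch.randn(1, 1, 160))
torch.quantization.convert(m, inplace=True)

out = m(torch.randn(1, 1, 160))        # conv is now a quantized Conv1d
print(type(m.conv).__module__, out.shape)
```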
I have also found this list of supported operations at https://github.com/pytorch/pytorch, which does suggest that operator support on Android is actually extremely limited.