Mobile deployment best practice?

seungjun · September 14, 2020, 7:58am

Hi, my goal is to deploy CNNs (or Conv RNNs) on Android devices with Snapdragon processors, e.g., Pixel 4 (Snapdragon 855).
I want to make my models run as fast as possible on the GPU or DSP.

So far, I have found two ways to deploy PyTorch models.

PyTorch mobile

This is the method PyTorch supports natively, and I like it.
But mobile GPU execution is not supported yet.
I see several PRs related to Vulkan APIs here
Will mobile GPU/DSP be supported in the near future?

I found this article from Qualcomm saying that they will support PyTorch, but I have not found any follow-up.

PyTorch -> ONNX

PyTorch models can be exported to ONNX format.
From ONNX, I can convert the models into DLC format with the SNPE SDK.
But it will be more difficult to create an app around models in that format.

Are there any other ways I have missed? What would be the best way to deploy PyTorch models on mobile devices?
I used TensorFlow Lite before but would prefer to use PyTorch if possible, as I mainly use PyTorch.

vferrer · September 17, 2020, 11:21am

I don’t know when PyTorch Mobile will support GPU execution. I am looking forward to it.

Meanwhile, I run on CPU. I follow this recipe to squeeze out as much performance as possible. I found that PyTorch is faster than TF Lite, and as fast as TF Lite when TF Lite uses its experimental XNNPACK backend.
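
The linked recipe is not reproduced here. As one illustration of CPU-side tuning, PyTorch ships `torch.utils.mobile_optimizer.optimize_for_mobile`, which rewrites a TorchScript module for mobile CPU inference. The sketch below assumes a hypothetical `TinyCNN` model and a 224x224 input, and it may differ from what the recipe actually does.

```python
import torch
import torch.nn as nn
from torch.utils.mobile_optimizer import optimize_for_mobile


class TinyCNN(nn.Module):
    """Hypothetical stand-in for the real model."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, 1)

    def forward(self, x):
        x = self.pool(self.relu(self.bn(self.conv(x))))
        return self.fc(torch.flatten(x, 1))


# The module must be in eval mode before tracing and optimizing.
model = TinyCNN().eval()
example = torch.randn(1, 3, 224, 224)  # assumed input shape

# Convert to TorchScript, then apply the mobile CPU optimization passes.
traced = torch.jit.trace(model, example)
optimized = optimize_for_mobile(traced)

# Sanity check: the optimized module should match the original numerically.
with torch.no_grad():
    reference = model(example)
    result = optimized(example)

print("outputs match:", torch.allclose(reference, result, atol=1e-3))
```

The optimized module is still a TorchScript module, so it can be written out with `optimized.save(...)` for the app to load. Any speedup has to be measured on the target device.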

seungjun · September 17, 2020, 11:46am

Thanks for sharing your experience!
I typically work on image regression models that are quite computationally heavy.

While waiting for official mobile GPU support, I guess your suggestion is the most practical way for the time being.

By the way, I found that GPU and NNAPI acceleration might be possible if we use ONNX Runtime.

I will try both ways.