Slow inference with U-2-Net on iOS

I’m using U-2-Net on mobile.
I converted the model to TorchScript and used it successfully on Android. Inference takes about 1 second.
Now, if I use the same model on iOS, inference takes about half a minute. I disable autograd and run the model in inference mode.
Any idea what could cause such a big performance difference using the same model?

(I use the 1.6.0 nightly build on Android and the 1.5.0 production build on iOS.)
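For reference, here is a minimal sketch of the inference setup described above, with autograd disabled via `torch.no_grad()`. The `TinyNet` module and the input shape are placeholders standing in for the actual U-2-Net model:

```python
import torch
import torch.nn as nn

# Placeholder module; in practice this would be the U-2-Net model
# (or a TorchScript model loaded with torch.jit.load).
class TinyNet(nn.Module):
    def forward(self, x):
        return torch.sigmoid(x)

model = torch.jit.script(TinyNet().eval())

with torch.no_grad():  # disable autograd bookkeeping during inference
    x = torch.rand(1, 3, 320, 320)  # example input shape
    y = model(x)

assert not y.requires_grad  # no gradient tracking on the output
```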

The Android 1.6.0 nightly is using XNNPACK, whereas iOS 1.5.0 is not. We’ll be enabling XNNPACK on iOS in the 1.6.0 release.

Ok, thank you! Meanwhile we managed to speed up the iOS inference. The thing is that on Android we scripted the model by calling jit.script(), and we tried to use that exact same model on iOS, which was slow. I tried another approach for converting to TorchScript by tracing the model’s execution with jit.trace, and for some reason the traced model’s performance on iOS was on par with the Android inference. Could this also be related to XNNPACK somehow?
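The two conversion approaches mentioned above can be sketched as follows. `TinyNet` is a stand-in for the real U-2-Net model; `jit.script` compiles the module from its Python source, while `jit.trace` records the operations executed for one example input:

```python
import torch
import torch.nn as nn

# Stand-in for the real U-2-Net model.
class TinyNet(nn.Module):
    def forward(self, x):
        return torch.relu(x) * 2.0

net = TinyNet().eval()
example = torch.rand(1, 3, 320, 320)

scripted = torch.jit.script(net)        # compiles from Python source
traced = torch.jit.trace(net, example)  # records ops for this example input

# Both variants produce the same result on the example input;
# either can be saved with .save(...) and shipped to the mobile app.
assert torch.allclose(scripted(example), traced(example))
```

Note that tracing bakes in the control flow taken for the example input, so it only behaves identically to scripting when the model has no data-dependent branches.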

Cool, it’s nice to hear the issue has been resolved. Short answer: no. XNNPACK is our new set of computation kernels and has nothing to do with scripting. Yeah, tracing is supposed to be faster.
