Threading of model in PyTorch Android

I am trying to deploy my model on an Android device. The model should run fast on Android, but inference is slow because the model is large.
Is there an option to control threading when declaring the model interpreter on Android, as there is in TensorFlow Lite?

Hello @mohit7
At the moment we do not have this setting, but we are thinking about adding an option to control it.
Currently the threading is determined by the device: the number of threads is roughly the number of big cores on the device.
More details about the number of threads per device can be found in the code of the function caffe2::ThreadPool::defaultThreadPool().

Please let us know if you run into threading issues on a particular device.

Hello @mohit7

We just exposed control over the global number of threads used by PyTorch Android; it has landed in master:

method org.pytorch.Module#setNumThreads(int numThreads)

(https://github.com/pytorch/pytorch/blob/master/android/pytorch_android/src/main/java/org/pytorch/Module.java#L57)
The latest Android nightlies already include it: https://github.com/pytorch/pytorch/tree/master/android#nightly (you might need the Gradle argument --refresh-dependencies if you are already using them)

Module module = Module.load(moduleFileAbsoluteFilePath);
module.setNumThreads(1);

This is new functionality, please report if you find any issues with it.

Thanks @IvanKobzarev, it worked.
But you should consider putting an explicit restriction on the number of threads a user can set, because beyond a certain limit the run-time increases instead of decreasing.

One more question:
Just like in desktop PyTorch, can we pass a batch of images in PyTorch Android?

Thanks,
Mohit Ranawat

Yes, performance degrades when the number of threads exceeds the number of CPU cores, because of extra thread switches and thread contention.
But additional capping may introduce some non-transparency in this API, so we will think about it.
Our plan is to revise the default thread pool size so that it is optimal for inference time by default on as many devices as possible.
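
In the meantime, a caller-side cap is a simple workaround. Just a sketch (availableProcessors() reports the number of logical cores, not only the big ones, and requestedNumThreads stands for whatever value your app would otherwise pass):

int cores = Runtime.getRuntime().availableProcessors();
module.setNumThreads(Math.min(requestedNumThreads, cores));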

For vision models it is the same: the input shape is N_images * N_channels (e.g. 3) * IMAGE_HEIGHT * IMAGE_WIDTH.
So if you prepare a Tensor with N_images > 1, it should work just as in the desktop version.

But org.pytorch.torchvision.TensorImageUtils currently has API only for preparing tensors with N_images == 1, so it needs some additional code to prepare a tensor with N_images > 1.
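
If it helps, here is a rough, untested sketch of how a batched tensor could be prepared by reusing the existing single-image helper (it assumes all bitmaps have the same width and height; the helper name bitmapsToFloat32BatchTensor is just illustrative and is not part of the current API):

import android.graphics.Bitmap;
import org.pytorch.Tensor;
import org.pytorch.torchvision.TensorImageUtils;

static Tensor bitmapsToFloat32BatchTensor(Bitmap[] bitmaps, float[] normMeanRGB, float[] normStdRGB) {
  final int n = bitmaps.length;
  final int h = bitmaps[0].getHeight();
  final int w = bitmaps[0].getWidth();
  // Each image occupies 3 * h * w floats in CHW layout
  final int imageSize = 3 * h * w;
  final float[] batchData = new float[n * imageSize];
  for (int i = 0; i < n; i++) {
    // Existing helper produces a 1 x 3 x h x w tensor for a single bitmap
    Tensor t = TensorImageUtils.bitmapToFloat32Tensor(bitmaps[i], normMeanRGB, normStdRGB);
    // Copy this image's data into its slot in the batch buffer
    System.arraycopy(t.getDataAsFloatArray(), 0, batchData, i * imageSize, imageSize);
  }
  return Tensor.fromBlob(batchData, new long[] {n, 3, h, w});
}

The resulting tensor can then be passed to module.forward(IValue.from(batchTensor)).toTensor() the same way as a single-image tensor.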

Do you think it would be useful to have some helper methods in the TensorImageUtils API to prepare tensors for image batches?