Is it possible to do operator profiling within the naked binary for PyTorch Mobile in Android?

Hello all,
The naked binary for benchmarking PyTorch Mobile models (Pytorch Mobile Performance Recipes — PyTorch Tutorials 1.10.1+cu102 documentation) is a good way to measure model performance without deploying a real app.

I was wondering whether it can also do operator-level profiling, like the PyTorch Profiler (Introducing PyTorch Profiler - the new and improved performance tool | PyTorch)?

That way we could find out which operator is the bottleneck of the whole model.
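For reference, the kind of per-operator breakdown I mean is what torch.profiler produces on desktop. A minimal sketch with a toy model (the model and shapes are just for illustration):

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Toy model just for illustration
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
x = torch.randn(1, 3, 32, 32)

with profile(activities=[ProfilerActivity.CPU]) as prof:
    model(x)

# Per-operator table, sorted by total CPU time on the host
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

This is exactly the view (time spent per aten:: operator) that would be useful to get out of the mobile benchmark binary as well.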

Thank you!

I think we do have op profiling. cc @kimishpatel.

Thank you @Linbin for the suggestion!

Hi @kimishpatel, could you give me some hints on how to do operator profiling in the naked PyTorch Mobile benchmark binary, similar to what the PyTorch Profiler provides?

I also tried some other solutions, such as the Facebook AI Performance Evaluation Platform, which provides operator profiling for Caffe2 models, but it seems to be out of date with respect to the current PyTorch repo.

Thank you in advance for your time and efforts!

Hi @Linbin and @kimishpatel
Thank you for your help!
Is this the source file for the generated speed_benchmark_torch binary?
If so, it seems that op profiling is not supported yet.
Do you have any plans to support op profiling? Or could you share some hints on how to do it myself?

Thank you!

Are you running with BUILD_LITE_INTERPRETER? Do you know?

Hi @kimishpatel,

Thank you for your reply!

The following are the steps I used to build it:

cd pytorch

export ANDROID_ABI=arm64-v8a  # I am using a Pixel 3
export ANDROID_NDK=/path/to/Android/Sdk/ndk/21.3.6528147/

rm -rf build_android

./scripts/build_android.sh \
-DCMAKE_PREFIX_PATH=$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())') \
-DPYTHON_EXECUTABLE=$(python -c 'import sys; print(sys.executable)')

Is BUILD_LITE_INTERPRETER the key to enabling operator profiling? Or are any other steps needed?
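For context, this is how I export the model I feed to the binary. A lite-interpreter build (BUILD_LITE_INTERPRETER=1) expects a model saved in the lite-interpreter format; a minimal sketch (the toy model and file name are just placeholders):

```python
import torch

# Toy model just for illustration
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
example = torch.randn(1, 3, 32, 32)

# Trace to TorchScript, then save in the lite-interpreter format (.ptl)
traced = torch.jit.trace(model, example)
traced._save_for_lite_interpreter("model.ptl")
```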

Thank you!