Successful PyTorch Mobile Build, Strange Error at Runtime

I converted my trained model to TorchScript using torch.jit.trace. After loading the saved model into my Android project, I managed to run inference on my arm64-v8a device. Like everything else in my life, this did not go smoothly, and I got the following error:

    Traceback of TorchScript, original code (most recent call last):
    /home/parano/Desktop/ParaStab/src/models/CalibModelPinHole.py(182): forward
    /home/parano/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(860): _slow_forward
    /home/parano/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(887): _call_impl
    /home/parano/.local/lib/python3.8/site-packages/torch/jit/_trace.py(934): trace_module
    /home/parano/.local/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
    /home/parano/Desktop/ParaStab/src/ConvertJIT.py(29): <module>
    **RuntimeError: inverse: LAPACK library not found in compilation**
    
        at android.app.ActivityThread.performResumeActivity(ActivityThread.java:4789)
        at android.app.ActivityThread.handleResumeActivity(ActivityThread.java:4832)
        at android.app.servertransaction.ResumeActivityItem.execute(ResumeActivityItem.java:52)
        at android.app.servertransaction.TransactionExecutor.executeLifecycleState(TransactionExecutor.java:190)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:105)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2386)

Unfortunately, part of my model uses a matrix inverse operation, which is what the RuntimeError points to. I found out that I need to build PyTorch with the OpenBLAS/LAPACK flag enabled.
Here is the command I used to start the build:

  USE_LAPACK=1 BUILD_LITE_INTERPRETER=0 BUILD_MOBILE_AUTOGRAD=ON BUILD_JNI=ON  ./scripts/build_pytorch_android.sh arm64-v8a

And here is CMake's summary:

-- 
-- ******** Summary ********
-- General:
--   CMake version         : 3.16.3
--   CMake command         : /usr/bin/cmake
--   System                : Android
--   C++ compiler          : /home/parano/Android/Sdk/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++
--   C++ compiler id       : Clang
--   C++ compiler version  : 9.0
--   Using ccache if found : ON
--   Found ccache          : /usr/bin/ccache
--   CXX flags             : -g -DANDROID -fdata-sections -ffunction-sections -funwind-tables -fstack-protector-strong -no-canonical-prefixes -D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -frtti -fexceptions  -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DUSE_VULKAN_WRAPPER -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN -DUSE_VULKAN_API -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -fcolor-diagnostics -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -g0
--   Build type            : Release
--   Compile definitions   : 
--   CMAKE_PREFIX_PATH     : /usr/lib/python3.8/site-packages;/home/parano/Android/Sdk/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64
--   CMAKE_INSTALL_PREFIX  : /mnt/media/Linux/pytorch/pytorch/build_android_arm64-v8a/install
--   USE_GOLD_LINKER       : OFF
-- 
--   TORCH_VERSION         : 1.10.0
--   CAFFE2_VERSION        : 1.10.0
--   BUILD_CAFFE2          : ON
--   BUILD_CAFFE2_OPS      : OFF
--   BUILD_CAFFE2_MOBILE   : OFF
--   BUILD_STATIC_RUNTIME_BENCHMARK: OFF
--   BUILD_TENSOREXPR_BENCHMARK: OFF
--   BUILD_BINARY          : OFF
--   BUILD_CUSTOM_PROTOBUF : OFF
--     Protobuf compiler   : 
--     Protobuf includes   : 
--     Protobuf libraries  : 
--   BUILD_DOCS            : OFF
--   BUILD_PYTHON          : OFF
--   BUILD_SHARED_LIBS     : OFF
--   CAFFE2_USE_MSVC_STATIC_RUNTIME     : ON
--   BUILD_TEST            : OFF
--   BUILD_JNI             : OFF
--   BUILD_MOBILE_AUTOGRAD : OFF
--   BUILD_LITE_INTERPRETER: OFF
--   INTERN_BUILD_MOBILE   : ON
--   USE_BLAS              : 1
--     BLAS                : 
--   USE_LAPACK            : 0
--   USE_ASAN              : OFF
--   USE_CPP_CODE_COVERAGE : OFF
--   USE_CUDA              : OFF
--   USE_ROCM              : OFF
--   USE_EIGEN_FOR_BLAS    : ON
--   USE_FBGEMM            : OFF
--     USE_FAKELOWP          : OFF
--   USE_KINETO            : OFF
--   USE_FFMPEG            : OFF
--   USE_GFLAGS            : OFF
--   USE_GLOG              : OFF
--   USE_LEVELDB           : OFF
--   USE_LITE_PROTO        : OFF
--   USE_LMDB              : OFF
--   USE_METAL             : OFF
--   USE_PYTORCH_METAL     : OFF
--   USE_FFTW              : OFF
--   USE_MKL               : 
--   USE_MKLDNN            : OFF
--   USE_NCCL              : OFF
--   USE_NNPACK            : ON
--   USE_NUMPY             : ON
--   USE_OBSERVERS         : OFF
--   USE_OPENCL            : OFF
--   USE_OPENCV            : OFF
--   USE_OPENMP            : OFF
--   USE_TBB               : OFF
--   USE_VULKAN            : ON
--     USE_VULKAN_FP16_INFERENCE    : OFF
--     USE_VULKAN_RELAXED_PRECISION : OFF
--     USE_VULKAN_SHADERC_RUNTIME   : OFF
--   USE_PROF              : OFF
--   USE_QNNPACK           : OFF
--   USE_PYTORCH_QNNPACK   : ON
--   USE_REDIS             : OFF
--   USE_ROCKSDB           : OFF
--   USE_ZMQ               : OFF
--   USE_DISTRIBUTED       : OFF
--   USE_DEPLOY           : OFF
--   Public Dependencies  : Threads::Threads
--   Private Dependencies : eigen_blas;pthreadpool;cpuinfo;pytorch_qnnpack;nnpack;XNNPACK;VulkanWrapper;fp16;log;fmt::fmt-header-only;dl
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/media/Linux/pytorch/pytorch/build_android_arm64-v8a
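
A quick way to double check whether the flags from the build command actually reached CMake is to compare them with the summary. Below is a minimal sketch in plain Java; the class and method names are made up for illustration and are not part of PyTorch or its build scripts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative helper (hypothetical, not part of PyTorch): compares the flags
// requested on the command line with the values printed in the CMake summary.
final class CMakeSummaryCheck {

    // Parses lines such as "--   USE_LAPACK            : 0" into key/value pairs.
    static Map<String, String> parseSummary(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String line : lines) {
            String text = line.trim();
            if (text.startsWith("--")) {
                text = text.substring(2).trim();
            }
            int colon = text.indexOf(':');
            if (colon <= 0) {
                continue; // banner or empty line, nothing to record
            }
            values.put(text.substring(0, colon).trim(), text.substring(colon + 1).trim());
        }
        return values;
    }

    // Treats 1/ON/TRUE/YES (any case) as enabled, everything else as disabled.
    static boolean isEnabled(String value) {
        if (value == null) {
            return false;
        }
        String v = value.trim().toUpperCase(Locale.ROOT);
        return v.equals("1") || v.equals("ON") || v.equals("TRUE") || v.equals("YES");
    }

    // Returns the requested flags that the summary does not report as enabled.
    static List<String> notApplied(List<String> requested, Map<String, String> summary) {
        List<String> missing = new ArrayList<>();
        for (String flag : requested) {
            if (!isEnabled(summary.get(flag))) {
                missing.add(flag);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Lines copied from the summary above.
        List<String> summaryLines = Arrays.asList(
                "--   BUILD_JNI             : OFF",
                "--   BUILD_MOBILE_AUTOGRAD : OFF",
                "--   BUILD_LITE_INTERPRETER: OFF",
                "--   USE_LAPACK            : 0");
        // Flags that were set to 1/ON in the build command.
        List<String> requested = Arrays.asList("USE_LAPACK", "BUILD_MOBILE_AUTOGRAD", "BUILD_JNI");
        System.out.println("Requested but not enabled: "
                + notApplied(requested, parseSummary(summaryLines)));
    }
}
```

Run against the summary above, this reports USE_LAPACK, BUILD_MOBILE_AUTOGRAD and BUILD_JNI: the summary shows them as 0/OFF even though the build command requested them.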

And finally the successful build message:

Build multiple targets pytorch_jni_arm64-v8a fbjni_arm64-v8a
ninja: Entering directory `/mnt/media/Linux/pytorch/pytorch/android/pytorch_android/.cxx/cmake/release/arm64-v8a'
[1/20] Building CXX object CMakeFiles/VulkanWrapper.dir/home/parano/Android/Sdk/ndk/21.1.6352462/sources/third_party/vulkan/src/common/vulkan_wrapper.cpp.o
[2/20] Linking CXX static library libVulkanWrapper.a
[3/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/ReadableByteChannel.cpp.o
[4/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/ByteBuffer.cpp.o
[5/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/detail/References.cpp.o
[6/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/OnLoad.cpp.o
[7/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/lyra/cxa_throw.cpp.o
[8/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/detail/Environment.cpp.o
[9/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/fbjni.cpp.o
[10/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/detail/Meta.cpp.o
[11/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/detail/utf8.cpp.o
[12/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/detail/Hybrid.cpp.o
[13/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/lyra/lyra_breakpad.cpp.o
[14/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/lyra/lyra_exceptions.cpp.o
[15/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/lyra/lyra.cpp.o
[16/20] Building CXX object fbjni/arm64-v8a/CMakeFiles/fbjni.dir/cxx/fbjni/detail/Exceptions.cpp.o
[17/20] Linking CXX shared library ../../../../build/intermediates/cmake/release/obj/arm64-v8a/libfbjni.so
[18/20] Building CXX object CMakeFiles/pytorch_jni.dir/src/main/cpp/pytorch_jni_jit.cpp.o
[19/20] Building CXX object CMakeFiles/pytorch_jni.dir/src/main/cpp/pytorch_jni_common.cpp.o
[20/20] Linking CXX shared library ../../../../build/intermediates/cmake/release/obj/arm64-v8a/libpytorch_jni.so

> Task :pytorch_android_torchvision:externalNativeBuildRelease
Build pytorch_vision_jni_arm64-v8a
ninja: Entering directory `/mnt/media/Linux/pytorch/pytorch/android/pytorch_android_torchvision/.cxx/cmake/release/arm64-v8a'
ninja: no work to do.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 7.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See https://docs.gradle.org/6.8.3/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 1m 50s
147 actionable tasks: 34 executed, 113 up-to-date
+ xargs ls -lah
+ find /mnt/media/Linux/pytorch/pytorch/android -type f -name '*aar'
-rwxrwxrwx 1 root root  15M Aug  3 23:16 /mnt/media/Linux/pytorch/pytorch/android/pytorch_android/build/outputs/aar/pytorch_android-release.aar
-rwxrwxrwx 1 root root 9.1K Aug  3 21:15 /mnt/media/Linux/pytorch/pytorch/android/pytorch_android_torchvision/build/outputs/aar/pytorch_android_torchvision-release.aar
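
To double check what actually ended up in the generated .aar, it can help to list the native libraries it contains. An .aar is a zip archive, and native libraries are stored under jni/<abi>/. The following is a minimal sketch using only the Java standard library; the class name AarNativeLibs is hypothetical, and which .so files should be present is for you to verify.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Illustrative helper (hypothetical): lists the native libraries packaged
// inside an .aar for a given ABI.
final class AarNativeLibs {

    // Returns the file names of all .so entries under jni/<abi>/ in the archive.
    static List<String> list(Path aar, String abi) throws IOException {
        String prefix = "jni/" + abi + "/";
        List<String> libs = new ArrayList<>();
        try (ZipFile zip = new ZipFile(aar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(prefix) && name.endsWith(".so")) {
                    libs.add(name.substring(prefix.length()));
                }
            }
        }
        return libs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: AarNativeLibs <path-to-aar> [abi]");
            return;
        }
        // The ABI defaults to the one used in this build.
        String abi = args.length > 1 ? args[1] : "arm64-v8a";
        System.out.println(list(Paths.get(args[0]), abi));
    }
}
```

Pass the path of pytorch_android-release.aar as the first argument. An empty list for arm64-v8a would mean the archive carries no native code for the device.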

The problem now is that when I run the same model, with the only difference being that I use the generated .aar file instead of implementation 'org.pytorch:pytorch_android:1.9.0', I get the following error:

    Process: com.parano.parastab, PID: 24066
    **java.lang.RuntimeException: Unable to resume activity {com.parano.parastab/com.parano.parastab.Camera2CaptureActivity}: java.lang.NullPointerException: Attempt to invoke virtual method 'org.pytorch.IValue org.pytorch.Module.forward(org.pytorch.IValue[])' on a null object reference**
        at android.app.ActivityThread.performResumeActivity(ActivityThread.java:4789)
        at android.app.ActivityThread.handleResumeActivity(ActivityThread.java:4832)
        at android.app.servertransaction.ResumeActivityItem.execute(ResumeActivityItem.java:52)
        at android.app.servertransaction.TransactionExecutor.executeLifecycleState(TransactionExecutor.java:190)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:105)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2386)

Does anyone have any idea?

This seems to suggest you are calling forward on an object that is not initialized?

Similar to this one? "java.lang.NullPointerException: Attempt to invoke virtual method on a null object reference" on Stack Overflow.

I think the object is initialized because, as I mentioned in the first part, the forward method got as far as trying to apply an operation (inverse) that is not included in the canonical PyTorch builds. The code is exactly the same as in the first trial, except that I load the custom-built .aar files. What do you suggest I double check? If the object isn't null, is it justified to put this matter aside?

Update:
After investigating your suggestion further, I found that with the successfully built library the Module does not behave like its canonical counterpart (implementation 'org.pytorch:pytorch_android:1.9.0').

I put together some code to illustrate the issue.

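import android.content.Context;
import android.util.Log;
import android.widget.TextView;

import org.pytorch.Module;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CameraModel  {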
    private static final String TAG = "CameraModel" ;
    private static final String FORMAT_INFERENCE_TIME = "Inference Time:.0%dns" ;

    private Module mModule ;
    public final String AssetName = "parastab_model.pt" ;

    private final Context mContext;
    private final TextView mInferenceTimeTextView ;

    public CameraModel(Context context, TextView inferencetimetextview) {
        mContext = context ;
        mInferenceTimeTextView = inferencetimetextview ;

        try {
            String assetFile = assetFilePath(mContext, AssetName);
            mModule = Module.load(assetFile);
        } catch (Exception e) {
            // If loading fails, mModule stays null and a later forward() call throws a NullPointerException.
            Log.e(TAG, "Something went wrong loading module") ;
            e.printStackTrace();
        }
    }

    // The following code is mainly from PyTorch's Android demo app
    public static String assetFilePath(Context context, String assetName) {
        File file = new File(context.getFilesDir(), assetName);
        if (file.exists() && file.length() > 0) {
            return file.getAbsolutePath();
        }

        try (InputStream is = context.getAssets().open(assetName)) {
            try (OutputStream os = new FileOutputStream(file)) {
                byte[] buffer = new byte[4 * 1024];
                int read;
                while ((read = is.read(buffer)) != -1) {
                    os.write(buffer, 0, read);
                }
                os.flush();
            }
            return file.getAbsolutePath();
        } catch (IOException e) {
            Log.e(TAG, "Error process asset " + assetName + " to file path");
        }
        return null;
    }
}
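
Since the constructor swallows the load failure, the first visible symptom is a NullPointerException on forward rather than the real cause. One way to surface the cause is to keep the exception next to the loaded value and fail with it on first use. The following is a plain-Java sketch of that pattern; LoadResult is a hypothetical helper, not part of the PyTorch Android API, and the loader passed to it stands in for a call such as Module.load(assetFile).

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of PyTorch): holds either the loaded value or
// the exception that prevented loading, so the cause is not lost.
final class LoadResult<T> {
    private final T value;
    private final Exception error;

    private LoadResult(T value, Exception error) {
        this.value = value;
        this.error = error;
    }

    // Runs the loader once and records either its result or its failure.
    static <T> LoadResult<T> of(Callable<T> loader) {
        try {
            T loaded = loader.call();
            if (loaded == null) {
                return new LoadResult<>(null, new IllegalStateException("loader returned null"));
            }
            return new LoadResult<>(loaded, null);
        } catch (Exception e) {
            return new LoadResult<>(null, e);
        }
    }

    boolean isLoaded() {
        return error == null;
    }

    // Returns the value, or throws with the original failure attached as the cause.
    T getOrThrow() {
        if (error != null) {
            throw new IllegalStateException("Load failed: " + error, error);
        }
        return value;
    }

    public static void main(String[] args) {
        // Simulated failure standing in for a failing model load.
        LoadResult<String> result = LoadResult.of(() -> {
            throw new IOException("simulated load failure");
        });
        System.out.println("loaded: " + result.isLoaded());
        try {
            result.getOrThrow();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, the code that runs inference calls getOrThrow() and gets the original exception as the cause, which is more informative than a null field.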

When I use the custom-built library, which is loaded via the following code:

// Maybe something is wrong with how I loaded the AARs?
implementation fileTree(dir: '/path/to/pytorch_android/', include: ['*.aar', '*.jar'], exclude: [])

I get the following error:

2021-08-04 21:58:52.905 27364-27364/? E/CameraModel: Something went wrong loading module
2021-08-04 21:58:52.906 27364-27364/? W/System.err:     at com.parano.parastab.CameraModel.<init>(CameraModel.java:41)
2021-08-04 21:58:52.929 27364-27364/? E/AndroidRuntime: FATAL EXCEPTION: main
    Process: com.parano.parastab, PID: 27364

Is the Module not compiled correctly (despite the build being reported as successful)? Can you check the CMake summary and tell me if I missed any required flags at compile time?

Any update on this matter? Guys my team is in a bit of a hurry …