Unable to run the benchmark with NNAPI

Hi, I’m trying to run the benchmark with NNAPI but keep getting blocked. Any feedback or help would be much appreciated.

Following the instructions here: (Prototype) Convert MobileNetV2 to NNAPI — PyTorch Tutorials 1.9.0+cu102 documentation
I’ve prepared the model with the versions below and then set up my benchmark.

torch                        1.9.0.dev20210427+cpu
torchvision               0.10.0.dev20210427+cpu

When I run

./speed_benchmark_torch --pthreadpool_size=1 --model=mobilenetv2-quant_full-nnapi.pt --use_bundled_input=0 --warmup=5 --iter=200

I got an error message: PytorchStreamReader failed locating file bytecode.pkl: file not found ()
So, based on the advice from: (beta) Efficient mobile interpreter in Android and iOS — PyTorch Tutorials 1.9.0+cu102 documentation

I’ve updated my model preparation script:

    # Save both models: .pt as regular TorchScript, .ptl for the lite interpreter.
    model.save(output_dir_path / ("mobilenetv2-quant_{}-cpu.pt".format(quantize_mode)))
    model._save_for_lite_interpreter("mobilenetv2-quant-cpu.ptl")
    nnapi_model.save(output_dir_path / ("mobilenetv2-quant_{}-nnapi.pt".format(quantize_mode)))
    nnapi_model._save_for_lite_interpreter("mobilenetv2-quant-nnapi.ptl")
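
For anyone hitting the same bytecode.pkl error: a .pt or .ptl file is a zip archive, and the message means the loader found no bytecode.pkl entry in it. A minimal sketch for checking a saved file before pushing it to the device, assuming the entry sits under the archive's top-level folder (the helper name has_mobile_bytecode is made up for illustration):

```python
import sys
import zipfile


def has_mobile_bytecode(model_path):
    """Return True if the saved model archive contains a bytecode.pkl entry.

    Assumption: the lite interpreter looks for an entry named
    "<archive_name>/bytecode.pkl" (or a bare "bytecode.pkl").
    """
    with zipfile.ZipFile(model_path) as archive:
        return any(
            name == "bytecode.pkl" or name.endswith("/bytecode.pkl")
            for name in archive.namelist()
        )


if __name__ == "__main__":
    # Usage: python check_bytecode.py model1.ptl model2.pt ...
    for path in sys.argv[1:]:
        status = "has" if has_mobile_bytecode(path) else "is missing"
        print(f"{path}: {status} bytecode.pkl")
```

A file written with model.save() would be expected to report the entry as missing, while one written with _save_for_lite_interpreter() should have it.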

The cpu.ptl file works with the benchmark this time, while the nnapi.ptl file ends up with:

Starting benchmark.
Running warmup runs.
terminating with uncaught exception of type c10::Error: [enforce fail at nnapi_model_loader.cpp:171] result == ANEURALNETWORKS_NO_ERROR. NNAPI returned error: 4

Aborted
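
The 4 in that message is an NNAPI result code. As far as I recall from the Android NDK header NeuralNetworks.h, it maps to ANEURALNETWORKS_BAD_DATA, which would suggest the NNAPI runtime rejected something in the model definition rather than failing during execution. A small lookup sketch, assuming the values below still match the header (please verify against your NDK version):

```python
# Result codes as recalled from NeuralNetworks.h (Android NDK).
# Assumption: these values match the header shipped with your NDK.
NNAPI_RESULT_CODES = {
    0: "ANEURALNETWORKS_NO_ERROR",
    1: "ANEURALNETWORKS_OUT_OF_MEMORY",
    2: "ANEURALNETWORKS_INCOMPLETE",
    3: "ANEURALNETWORKS_UNEXPECTED_NULL",
    4: "ANEURALNETWORKS_BAD_DATA",
    5: "ANEURALNETWORKS_OP_FAILED",
    6: "ANEURALNETWORKS_BAD_STATE",
    7: "ANEURALNETWORKS_UNMAPPABLE",
    8: "ANEURALNETWORKS_OUTPUT_INSUFFICIENT_SIZE",
    9: "ANEURALNETWORKS_UNAVAILABLE_DEVICE",
}


def describe_nnapi_error(code):
    """Translate a numeric NNAPI result code into its symbolic name."""
    return NNAPI_RESULT_CODES.get(code, f"unknown NNAPI result code {code}")


if __name__ == "__main__":
    print(describe_nnapi_error(4))
```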

I then upgraded to the following versions and tried again:

torch                        1.9.0.
torchvision               0.10.0.

It failed with the same error:

Starting benchmark.
Running warmup runs.
terminating with uncaught exception of type c10::Error: [enforce fail at nnapi_model_loader.cpp:171] result == ANEURALNETWORKS_NO_ERROR. NNAPI returned error: 4

Is there anything I missed? Any suggestions would be appreciated.
Thanks.

cc: @David_Reiss, @axitkhurana

Sorry for the delay here, @JingL. I was able to run things with:

pip install torchvision==0.9.1

Update to nightly torch:

pip install --upgrade --pre --find-links https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html torch==1.10.0.dev20210715+cpu

# ./speed_benchmark_torch --pthreadpool_size=1 --model=mobilenetv2-none-nnapi.ptl --use_bundled_input=0 --warmup=5 --iter=200
Starting benchmark.
Running warmup runs.
Main runs.
Main run finished. Microseconds per iter: 137647. Iters per second: 7.26494
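
For comparing runs (for example the cpu.ptl and nnapi.ptl variants), the summary line can be parsed rather than read by eye. This is a rough sketch that assumes the output format is exactly the "Main run finished." line shown above; the function name parse_benchmark_summary is hypothetical:

```python
import re

# Matches the summary line printed by speed_benchmark_torch in the log above.
SUMMARY_RE = re.compile(
    r"Microseconds per iter: (?P<us>\d+(?:\.\d+)?)\. "
    r"Iters per second: (?P<ips>\d+(?:\.\d+)?)"
)


def parse_benchmark_summary(line):
    """Return (microseconds_per_iter, iters_per_second), or None if no match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return float(match.group("us")), float(match.group("ips"))


if __name__ == "__main__":
    sample = (
        "Main run finished. Microseconds per iter: 137647. "
        "Iters per second: 7.26494"
    )
    us_per_iter, iters_per_sec = parse_benchmark_summary(sample)
    # Convert to milliseconds for easier reading.
    print(f"{us_per_iter / 1000:.1f} ms per iteration, {iters_per_sec} iters/s")
```

With the numbers above this works out to roughly 137.6 ms per inference.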

Changes in converter script:

nnapi_model._save_for_lite_interpreter("mobilenetv2-{}-nnapi.ptl".format(quantize_mode))

Please let me know if this works for you.