Worse performance using PyTorch Mobile than on GPU/CPU

I have a custom model that is a variation on YOLOv3. To test the results, I have asserted that the inputTensor on the device is the same as the one I am loading on the computer. The output (which has detections and classifications) gives near-identical object and class confidences; however, the locations (x, y, w, h) are slightly off. Is this expected behaviour? Is there anything in particular I should investigate in my model or in the trace of my model?
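For context, the comparison is roughly along these lines; the file names are placeholders and the model is assumed to return a single output tensor for simplicity:

import torch

# Placeholder file names: tensors exported from the mobile app and re-loaded
# on the desktop.
device_input = torch.load("input_from_device.pt")
device_output = torch.load("output_from_device.pt")
desktop_input = torch.load("input_desktop.pt")

# The inputs match exactly.
assert torch.equal(device_input, desktop_input)

model = torch.jit.load("model.pt")
model.eval()

with torch.no_grad():
    desktop_output = model(desktop_input)

# Confidences match closely, but the (x, y, w, h) values drift slightly.
print((desktop_output - device_output).abs().max())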

Hello @IsaacBerman,

Do you use exactly the same model on mobile and desktop, without any quantization on mobile?

If you have already debugged it, do you have any idea which operator/layer is producing the different results?
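One way to narrow that down is to dump intermediate activations with forward hooks on the eager model and compare them with the same dump from the other platform. The sketch below is only illustrative; the model class, checkpoint, and file names are placeholders:

import torch

# Placeholders: the eager (non-traced) model class, its weights, and the
# input tensor dumped from the device.
from my_model import MyYoloVariant

model = MyYoloVariant()
model.load_state_dict(torch.load("weights.pt"))
model.eval()

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        # Detach and move to CPU so the stored tensors are easy to compare later.
        activations[name] = output.detach().cpu()
    return hook

# Hook every leaf submodule so each layer's output gets recorded.
for name, module in model.named_modules():
    if len(list(module.children())) == 0:
        module.register_forward_hook(save_activation(name))

with torch.no_grad():
    model(torch.load("input_from_device.pt"))

# Compare this dump against the one produced on the other platform to find
# the first layer whose output diverges.
torch.save(activations, "desktop_activations.pt")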

Could you please dump all operators that your model uses? Here are the steps for how to do it.

First, we need the dump_operator_names executable, which can be built with the command:

BUILD_BINARY=1 python setup.py build

After a successful build, you can find it at a path like ./build/lib.linux-x86_64-3.7/torch/bin/dump_operator_names.

It depends on libcaffe2_observers.so, libtorch.so, and libc10.so, so these libraries should either be located in the same folder or be installed on the system:


mkdir tmp
cp ./build/lib.linux-x86_64-3.7/torch/bin/dump_operator_names tmp/
cp ./build/lib.linux-x86_64-3.7/torch/lib/libcaffe2_observers.so tmp/
cp ./build/lib.linux-x86_64-3.7/torch/lib/libtorch.so tmp/
cp ./build/lib.linux-x86_64-3.7/torch/lib/libc10.so tmp/

After that, you can run it, specifying the model and the output file:

./dump_operator_names --model=model.pt --output=model_ops.yaml

Hi Ivan,

Thanks for your response. I ended up solving the issue: it turned out to be a problem with the tracing. See here: Torch.jit.trace() only works on example input?
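For anyone landing here with the same symptom: torch.jit.trace only records the operations that were executed for the given example input, so any data-dependent control flow (common in YOLO-style decoding) is frozen to whichever branch that example took, while torch.jit.script preserves it. A toy illustration of the difference, not the actual model:

import torch
import torch.nn as nn

class Toy(nn.Module):
    def forward(self, x):
        # Data-dependent branch: tracing hard-codes whichever path the
        # example input happens to take.
        if x.sum() > 0:
            return x + 1
        return x - 1

example = torch.ones(3)
traced = torch.jit.trace(Toy(), example)   # emits a TracerWarning; branch is frozen
scripted = torch.jit.script(Toy())         # control flow is preserved

neg = -torch.ones(3)
print(traced(neg))    # tensor([0., 0., 0.])  -- still the "x + 1" branch
print(scripted(neg))  # tensor([-2., -2., -2.])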