For now, PyTorch keeps dying at the same spot when compiling ONNX and onnx-tensorrt, both of which would obviously be required to deploy for production inferencing on the TX2. The latest errors are:
third_party/onnx-tensorrt/CMakeFiles/nvonnxparser.dir/build.make:134: recipe for target 'third_party/onnx-tensorrt/CMakeFiles/nvonnxparser.dir/onnx2trt_utils.cpp.o' failed
make[2]: *** [third_party/onnx-tensorrt/CMakeFiles/nvonnxparser.dir/onnx2trt_utils.cpp.o] Error 1
CMakeFiles/Makefile2:1488: recipe for target 'third_party/onnx-tensorrt/CMakeFiles/nvonnxparser.dir/all' failed
make[1]: *** [third_party/onnx-tensorrt/CMakeFiles/nvonnxparser.dir/all] Error 2
[ 27%] Built target python_copy_files
Makefile:160: recipe for target 'all' failed
make: *** [all] Error 2
Failed to run 'bash …/tools/build_pytorch_libs.sh --use-cuda --use-nnpack caffe2 libshm gloo c10d THD'
I’d be open to not compiling it at all and simply using an exported ONNX model, but I have no idea how that would fit into my Python production code. After all, the model is only part of the PyTorch code; I still use Torch for translating image formats, evaluations, etc.
Not sure where to go from here, but I’d really like to stick with this library and figure out inferencing for edge devices. Combined with fast.ai’s recent v1.0 release, there’s some pretty amazing work that can be done.
It looks like ONNX has some problems with protobuf.
Did you pull from master before trying to rebuild PyTorch?
If so, could you call git submodule update --init --recursive and try to build again?
I ran into similar issues when some submodules weren’t properly updated.
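The suggested recovery steps, spelled out as commands (the checkout path is assumed; the clean step is optional but avoids stale build artifacts):

```shell
cd pytorch                                  # your PyTorch source checkout (path assumed)
git pull                                    # update to latest master
git submodule update --init --recursive     # sync all submodules (onnx, onnx-tensorrt, ...)
python3 setup.py clean                      # clear stale build artifacts before rebuilding
```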
OK, so I completely wiped it and followed the standard install directions from PyTorch, versus the NVIDIA-recommended pytorch_jetson_install.sh from Dusty.
The only thing is that I added these changes as recommended by NVIDIA, since NCCL is only for desktop CUDA GPUs:
When I do that, don’t enable TensorRT support, and run the install with Python 3, it now compiles on the fresh install! I do believe I will eventually need TensorRT support on the TX2, so I’ll keep plugging away at that.
For now, however, when trying to import it into a Python shell, I get this:
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nvidia/pytorch/torch/__init__.py", line 84, in <module>
from torch._C import *
ImportError: No module named 'torch._C'
>>> exit()
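For what it’s worth, a common cause of "No module named 'torch._C'" is launching Python from inside the PyTorch source checkout: the un-built torch/ source directory there shadows the installed package. A quick check, assuming the build actually installed:

```shell
# Run the import from outside the source tree so the installed
# package, not the un-built torch/ source directory, is picked up
cd /tmp
python3 -c "import torch; print(torch.__version__)"
```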