Installing PyTorch with CUDA support on NVIDIA Jetson TK1

Based on what I’ve tried (discussed in this post and this post over on the NVIDIA DevTalk forums), I’m fairly sure the answer is no: I’ve successfully built PyTorch v0.3.1 without CUDA support, but I was wondering whether anyone here could offer insight into whether PyTorch with CUDA support can be built on the NVIDIA Jetson TK1.

Jetson TK1 environment:

  • Ubuntu 14.04
  • armv7l architecture (32-bit ARM)
  • CUDA 6.5 (this is the maximum CUDA version supported on the TK1)

Here’s what I’ve tried. First, I set the following environment variables:

  • PATH=/usr/local/cuda/bin:$PATH
  • NO_MKLDNN=1 (since MKL-DNN is incompatible with 32-bit)
  • USE_NCCL=0
  • NO_DISTRIBUTED=1
  • TORCH_CUDA_ARCH_LIST=3.5
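In script form, the setup looked roughly like this (the build command is included as a comment; paths are the default CUDA install locations on my board):

```shell
# Environment setup used before building PyTorch v0.3.1 from source on the TK1.
export PATH=/usr/local/cuda/bin:$PATH   # pick up nvcc from the CUDA 6.5 toolkit
export NO_MKLDNN=1                      # MKL-DNN is incompatible with 32-bit
export USE_NCCL=0                       # no NCCL for armv7l / CUDA 6.5
export NO_DISTRIBUTED=1                 # skip the distributed backend
export TORCH_CUDA_ARCH_LIST=3.5         # the arch list I used, as noted above

# Then, from the pytorch checkout:
#   python setup.py install
```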

I found that PyTorch > 0.3.1 requires libnvrtc, which is only shipped with CUDA >= 7.0, so I attempted to build PyTorch v0.3.1 instead, but encountered the following error:

/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/THC/THCBlas.cu(495):
error: identifier "cublasSgetrsBatched" is undefined

/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/THC/THCBlas.cu(512):
error: identifier "cublasDgetrsBatched" is undefined

It turns out that, according to the CUDA 7.0 release notes, the cuBLAS functions cublas<T>getrsBatched were introduced in CUDA 7.0. After some digging I found that the most recent version of PyTorch that didn’t reference these functions was v0.1.10. However, attempting to build PyTorch v0.1.10 resulted in the following CMake error stating that CUDA >= 7.0 is required:
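A quick way to confirm this locally is to grep the cuBLAS header that ships with the toolkit for the batched getrs entry points. A small sketch (check_getrs is just a throwaway helper name, and the header path below assumes the default CUDA install location):

```shell
# Throwaway helper: report whether the batched getrs entry points are
# declared in a given cuBLAS header.
check_getrs() {
    if grep -q 'getrsBatched' "$1" 2>/dev/null; then
        echo "getrsBatched declared (toolkit is CUDA >= 7.0)"
    else
        echo "getrsBatched missing (pre-7.0 toolkit, e.g. CUDA 6.5)"
    fi
}

# On the TK1 (default install path; reports "missing" with CUDA 6.5):
#   check_getrs /usr/local/cuda/include/cublas_api.h
```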

[100%] Built target THCUNN
Install the project...
-- Install configuration: "Release"
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/lib/libTHCUNN.so.1
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/lib/libTHCUNN.so
-- Set runtime path of "/tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/lib/libTHCUNN.so.1" to ""
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/include/THCUNN/THCUNN.h
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/include/THCUNN/generic/THCUNN.h
-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at /usr/local/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
  Could NOT find CUDA: Found unsuitable version "6.5", but required is at
  least "7.0" (found /usr/local/cuda)
Call Stack (most recent call first):
  /usr/local/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:386 (_FPHSA_FAILURE_MESSAGE)
  /tmp/tmp.CmqAOpt6SC/pytorch/cmake/FindCUDA/FindCUDA.cmake:1013 (find_package_handle_standard_args)
CMakeLists.txt:5 (FIND_PACKAGE)

Even the first tagged release of PyTorch (v0.1.1) seems to mention CUDA 7.0 in the CMake file pytorch/cmake/FindCUDA/FindCUDA.cmake, which leads me to think it’s impossible to build PyTorch with CUDA 6.5 support on the Jetson TK1.
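For anyone who wants to check this against their own checkout, the version floor is easy to spot with a grep over the bundled FindCUDA module. A sketch (scan_cuda_floor is just a throwaway helper name, and I'm simply greping for the string "7.0" rather than any exact CMake wording):

```shell
# Throwaway helper: list every mention of "7.0" in the bundled FindCUDA
# module of a pytorch checkout, or say so if there is none.
scan_cuda_floor() {
    grep -n '7\.0' "$1" 2>/dev/null || echo "no mention of 7.0 in $1"
}

# e.g., from a pytorch checkout after `git checkout v0.1.1`:
#   scan_cuda_floor cmake/FindCUDA/FindCUDA.cmake
```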

Am I correct or have I missed something?