Hi,
I have been having difficulties getting the basic CMake example working with PyTorch, as in https://pytorch.org/tutorials/advanced/cpp_export.html. I have spent about 5 hours adding different flags for CUDA/cuDNN (I am not using GPUs anyway, but these packages seem to be required, and I do have them installed) and messing around with the CMakeLists.txt file. I haven't been successful, so I am asking for help. I see the following log when I run a script make_cmake.sh (which runs cmake with flags) and then make:
-- The C compiler identification is GNU 8.2.0
-- The CXX compiler identification is GNU 8.2.0
-- Check for working C compiler: /cm/shared/sw/pkg/devel/gcc/8.2.0/bin/cc
-- Check for working C compiler: /cm/shared/sw/pkg/devel/gcc/8.2.0/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /cm/shared/sw/pkg/devel/gcc/8.2.0/bin/c++
-- Check for working CXX compiler: /cm/shared/sw/pkg/devel/gcc/8.2.0/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /cm/shared/sw/pkg/devel/cuda/9.0.176 (found suitable version "9.0", minimum required is "7.0")
-- Caffe2: CUDA detected: 9.0
-- Caffe2: CUDA nvcc is: /cm/shared/sw/pkg/devel/cuda/9.0.176/bin/nvcc
-- Caffe2: CUDA toolkit directory: /cm/shared/sw/pkg/devel/cuda/9.0.176
-- Caffe2: Header version is: 9.0
-- Found CUDNN: /cm/shared/sw/pkg/devel/cudnn/v7.0-cuda-9.0/include
-- Found cuDNN: v7.0.5 (include: /cm/shared/sw/pkg/devel/cudnn/v7.0-cuda-9.0/include, library: /cm/shared/sw/pkg/devel/cudnn/v7.0-cuda-9.0/lib)
-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 3.0;3.5;5.0;5.2;6.0;6.1;7.0;7.0+PTX
-- Added CUDA NVCC flags for: -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_70,code=compute_70
-- Found torch: /mnt/ceph/users/mcranmer/Downloads/libtorch/lib/libtorch.so
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/ceph/users/mcranmer/.../build
Running make then gives (note: this is the updated error):
Scanning dependencies of target run_pytorch
make[2]: *** No rule to make target `/cm/shared/sw/pkg/devel/cudnn/v7.0-cuda-9.0/lib', needed by `run_pytorch'. Stop.
make[1]: *** [CMakeFiles/run_pytorch.dir/all] Error 2
make: *** [all] Error 2
The make_cmake.sh file (in the build directory) is as follows (the … is a long path):
#!/bin/bash
rm CMakeCache.txt
module load cuda/9.0.176 cudnn/v7.0-cuda-9.0 gcc/8.2.0 lib/openblas/0.2.19-haswell slurm openmpi
FLAGS="-DCUDA_TOOLKIT_ROOT_DIR=/cm/shared/sw/pkg/devel/cuda/9.0.176 -DTORCH_LIBRARIES=/mnt/ceph/users/mcranmer/Downloads/libtorch -DCMAKE_INSTALL_PREFIX=/mnt/ceph/users/mcranmer/.../build -DCMAKE_PREFIX_PATH=/mnt/ceph/users/mcranmer/Downloads/libtorch -DCUDA_HOST_COMPILER=/usr/bin/gcc44 -DCUDNN_INCLUDE_DIR=/cm/shared/sw/pkg/devel/cudnn/v7.0-cuda-9.0/include -DCUDNN_LIBRARY=/cm/shared/sw/pkg/devel/cudnn/v7.0-cuda-9.0/lib"
CMAKE=/mnt/ceph/users/mcranmer/Downloads/cmake-3.13.0-rc2-Linux-x86_64/bin/cmake
$CMAKE $FLAGS ..
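For reference, the tutorial's minimal configure-and-build sequence is just the following (the libtorch path here is a placeholder for wherever it is unzipped); all the extra flags in my script above are attempts to get CUDA/cuDNN detection working on our cluster:

```shell
# Minimal build as in the tutorial; /path/to/libtorch is a placeholder.
mkdir -p build && cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
make
```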
My CMakeLists.txt file is the standard one from the tutorial:
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)
find_package(Torch REQUIRED)
add_executable(run_pytorch run_pytorch_1d.cpp)
target_link_libraries(run_pytorch "${TORCH_LIBRARIES}")
set_property(TARGET run_pytorch PROPERTY CXX_STANDARD 11)
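For debugging, the values CMake actually resolves can be dumped by adding message() lines after find_package (just a diagnostic sketch, not part of the tutorial's file):

```cmake
# Diagnostic only: print what find_package(Torch) and the cuDNN flags resolved to.
message(STATUS "TORCH_LIBRARIES = ${TORCH_LIBRARIES}")
message(STATUS "CUDNN_LIBRARY = ${CUDNN_LIBRARY}")
```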
The code I am attempting to compile (run_pytorch_1d.cpp) is below; it should just load a PyTorch model and not do anything with it:
#include <torch/script.h>  // One-stop header.

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <memory>

#include "run_pytorch_1d.h"

#define N_FEATURES 13

float run_pytorch_1d_cpp(float *x) {
    std::shared_ptr<torch::jit::script::Module> module =
        torch::jit::load("/mnt/ceph/users/mcranmer/.../model_to_load_from_cpp.pt");
    return x[0] * x[0];
}

int main(int argc, const char* argv[]) {
    float x[N_FEATURES] = {1};
    printf("%f\n", x[0]);
    return 0;
}
Any idea what’s going on? Earlier it was trying to link some CUDA library (libcu…a) from the wrong directory instead of using the ones in my installation; I guess the flags fixed that.