Installation error - PyTorch from source

I need to use PyTorch with Open MPI, so I have to install it from source. I'm using CentOS 8.

While running python setup.py install, I'm running into the following error.

CMake Warning at caffe2/CMakeLists.txt:755 (add_library):
  Cannot generate a safe runtime search path for target torch_cpu because
  files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libgomp.so.1] in /usr/lib/gcc/x86_64-redhat-linux/8 may be hidden by files in:
      /home/aditya/anaconda3/lib

  Some of these libraries may not be found correctly.


-- Generating done
-- Build files have been written to: /home/aditya/pytorch/build
cmake3 --build . --target install --config Release -- -j 1
[2/2038] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o
FAILED: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o 
/usr/bin/g++  -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -Iaten/src -I../aten/src -I. -I../ -isystem third_party/gloo -isystem ../cmake/../third_party/gloo -isystem ../cmake/../third_party/googletest/googlemock/include -isystem ../cmake/../third_party/googletest/googletest/include -isystem ../third_party/protobuf/src -isystem ../third_party/gemmlowp -isystem ../third_party/neon2sse -isystem ../third_party/XNNPACK/include -I../cmake/../third_party/benchmark/include -isystem ../third_party -isystem ../cmake/../third_party/eigen -isystem /home/aditya/anaconda3/include/python3.8 -isystem /home/aditya/anaconda3/lib/python3.8/site-packages/numpy/core/include -isystem ../cmake/../third_party/pybind11/include -isystem /home/aditya/anaconda3/include -Icaffe2/contrib/aten -I../third_party/onnx -Ithird_party/onnx -I../third_party/foxi -Ithird_party/foxi -isystem ../third_party/ideep/mkl-dnn/include -isystem ../third_party/ideep/include -I../torch/csrc/api -I../torch/csrc/api/include -I../caffe2/aten/src/TH -Icaffe2/aten/src/TH -Icaffe2/aten/src -Icaffe2/../aten/src -Icaffe2/../aten/src/ATen -I../torch/csrc -I../third_party/miniz-2.0.8 -I../third_party/kineto/libkineto/include -I../third_party/kineto/libkineto/src -I../aten/src/TH -I../aten/../third_party/catch/single_include -I../aten/src/ATen/.. -Icaffe2/aten/src/ATen -I../caffe2/core/nomnigraph/include -isystem include -I../third_party/FXdiv/include -I../c10/.. 
-Ithird_party/ideep/mkl-dnn/include -I../third_party/ideep/mkl-dnn/src/../include -I../third_party/pthreadpool/include -I../third_party/cpuinfo/include -I../third_party/QNNPACK/include -I../aten/src/ATen/native/quantized/cpu/qnnpack/include -I../aten/src/ATen/native/quantized/cpu/qnnpack/src -I../third_party/cpuinfo/deps/clog/include -I../third_party/NNPACK/include -I../third_party/fbgemm/include -I../third_party/fbgemm -I../third_party/fbgemm/third_party/asmjit/src -I../third_party/FP16/include -I../third_party/tensorpipe -Ithird_party/tensorpipe -I../third_party/tensorpipe/third_party/libnop/include -I../third_party/fmt/include -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -DNDEBUG -DNDEBUG -fPIC   -DCAFFE2_USE_GLOO -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-unknown-pragmas -Wno-missing-braces -Wno-maybe-uninitialized -fvisibility=hidden -O2 -fopenmp -DCAFFE2_BUILD_MAIN_LIB -pthread -DASMJIT_STATIC -std=gnu++14 -MD -MT caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o -MF caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o.d -o 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o -c ../aten/src/ATen/Version.cpp
../aten/src/ATen/Version.cpp: In function ‘std::__cxx11::string at::get_mkldnn_version()’:
../aten/src/ATen/Version.cpp:45:13: error: ‘mkldnn_version_t’ does not name a type; did you mean ‘dnnl_version_t’?
       const mkldnn_version_t* ver = mkldnn_version();
             ^~~~~~~~~~~~~~~~
             dnnl_version_t
../aten/src/ATen/Version.cpp:46:37: error: ‘ver’ was not declared in this scope
       ss << "Intel(R) MKL-DNN v" << ver->major << "." << ver->minor << "." << ver->patch
                                     ^~~
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "setup.py", line 864, in <module>
    build_deps()
  File "setup.py", line 354, in build_deps
    build_caffe2(version=version,
  File "/home/aditya/pytorch/tools/build_pytorch_libs.py", line 58, in build_caffe2
    cmake.build(my_env)
  File "/home/aditya/pytorch/tools/setup_helpers/cmake.py", line 345, in build
    self.run(build_args, my_env)
  File "/home/aditya/pytorch/tools/setup_helpers/cmake.py", line 140, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/home/aditya/anaconda3/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake3', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '1']' returned non-zero exit status 1.

This seems to be a general build issue. cc @seemethere

@ptrblck have you seen a similar error before? If not, I can create an issue on GitHub to report this failure.

No, I haven’t seen this MKL issue before.
Based on the code usage, it seems that either MKL is not available while AT_MKL_ENABLED() is set (lines of code), or there is some kind of mix between MKL and oneAPI on the system.
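
One rough way to look for such a mix is to check which of the include directories from the failing compile command contain MKL-DNN/oneDNN headers. The sketch below is only an illustration: the helper name find_headers and the list of header file names are assumptions, not anything the PyTorch build uses, and finding a header in the Anaconda include directory would only hint at a conflict, not prove it.

```python
import os

# Header file names commonly shipped by MKL, MKL-DNN, and oneDNN.
# This list is an assumption made for illustration; adjust it as needed.
HEADER_NAMES = ("mkl.h", "mkldnn.h", "mkldnn.hpp", "dnnl.h", "dnnl.hpp")


def find_headers(include_dirs, names=HEADER_NAMES):
    """Map each header name to the directories that contain it.

    Only the top level of each directory is checked; directories that
    do not exist are skipped silently.
    """
    found = {}
    for directory in include_dirs:
        if not os.path.isdir(directory):
            continue
        for name in names:
            if os.path.isfile(os.path.join(directory, name)):
                found.setdefault(name, []).append(directory)
    return found
```

For the log above, it could be called with the two candidate directories taken from the compile command, for example find_headers(["/home/aditya/anaconda3/include", "/home/aditya/pytorch/third_party/ideep/mkl-dnn/include"]). If the same header name shows up in both places, the compiler may be picking up a different version than the one the source tree expects.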

@bogoman are you seeing the same issue after a python setup.py clean and a new git submodule update --init --recursive?

Hey!
I uninstalled the conda mkl package and installed the libraries with pip instead. After that, the build worked just fine.
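
Since the goal was a build with Open MPI support, a quick check after installation can confirm what the resulting build actually contains. This is a minimal sketch, assuming the freshly built torch is the one on the Python path; report_build is a hypothetical helper name, and it returns None when torch cannot be imported.

```python
def report_build():
    """Return a dict of build features, or None if torch is not importable."""
    try:
        import torch
        import torch.distributed as dist
    except ImportError:
        return None
    return {
        "version": torch.__version__,
        # Whether this build was compiled with MKL / MKL-DNN support.
        "mkl": torch.backends.mkl.is_available(),
        "mkldnn": torch.backends.mkldnn.is_available(),
        # is_mpi_available() is only checked when distributed support exists.
        "mpi": dist.is_available() and dist.is_mpi_available(),
    }


print(report_build())
```

If "mpi" comes back False, the build most likely did not find the MPI installation at configure time, and the CMake output is the place to look.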