USE_NCCL is ON, but Private Dependencies does not include nccl

Traceback (most recent call last):
  File "test_dist.py", line 5, in <module>
    dist.init_process_group(backend="NCCL", init_method="file:///distributed_test", world_size=2, rank=0)
  File "/home/simon/anaconda3/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 288, in init_process_group
    raise RuntimeError("Distributed package doesn't have NCCL "
RuntimeError: Distributed package doesn't have NCCL built in
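
The RuntimeError above is raised when the installed torch.distributed was compiled without the NCCL backend. A minimal sketch of checking for this before calling init_process_group, assuming `torch.distributed.is_available()` and `torch.distributed.is_nccl_available()` exist in the installed build. The helper names `choose_backend` and `probe_nccl` are hypothetical, and falling back to Gloo is only an illustration, not a fix for the build itself:

```python
def choose_backend(nccl_ok, gloo_ok=True):
    """Return the backend name to request, preferring NCCL when it was compiled in."""
    if nccl_ok:
        return "nccl"
    if gloo_ok:
        return "gloo"
    raise RuntimeError("neither NCCL nor Gloo is available in this build")


def probe_nccl():
    """Ask the installed torch whether NCCL is built in; False if torch is missing."""
    try:
        import torch.distributed as dist
    except ImportError:
        return False
    # is_nccl_available() is assumed to exist; treat its absence as "no NCCL".
    check = getattr(dist, "is_nccl_available", lambda: False)
    return bool(dist.is_available() and check())


if __name__ == "__main__":
    print("backend to request:", choose_backend(probe_nccl()))
```

The returned string could then be passed as the `backend` argument of `dist.init_process_group`, which at least lets the script run on Gloo while the NCCL build problem is being sorted out.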

I installed PyTorch v1.0rc1 from source and got the config summary below.
USE_NCCL is ON, but Private Dependencies does not include nccl, so NCCL is not built in.

  • PyTorch Version: v1.0rc1
  • OS: Ubuntu 18.04.1
  • How you installed PyTorch: source
  • Build command you used: python setup.py install
  • Python version: 3.7
  • CUDA/cuDNN version: CUDA 10.0, cuDNN 7.3.1
  • GPU models and configuration: GTX 1080


-- ******** Summary ********
-- General:
-- CMake version : 3.12.2
-- CMake command : /home/simon/anaconda3/bin/cmake
-- System : Linux
-- C++ compiler : /usr/bin/c++
-- C++ compiler version : 7.3.0
-- BLAS : MKL
-- CXX flags : -msse3 -msse4.1 -msse4.2 --std=c++11 -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -Wno-stringop-overflow
-- Build type : Release
-- Compile definitions : ONNX_NAMESPACE=onnx_torch;MAGMA_V2;USE_C11_ATOMICS=1;TH_BLAS_MKL;HAVE_MMAP=1;_FILE_OFFSET_BITS=64;HAVE_SHM_OPEN=1;HAVE_SHM_UNLINK=1;HAVE_MALLOC_USABLE_SIZE=1
-- CMAKE_PREFIX_PATH : /home/simon/anaconda3/lib/python3.7/site-packages
-- CMAKE_INSTALL_PREFIX : /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install

-- TORCH_VERSION : 1.0.0
-- CAFFE2_VERSION : 1.0.0
-- BUILD_ATEN_MOBILE : OFF
-- BUILD_BINARY : OFF
-- BUILD_CUSTOM_PROTOBUF : ON
-- Link local protobuf : ON
-- BUILD_DOCS : OFF
-- BUILD_PYTHON : ON
-- Python version : 3.7
-- Python executable : /home/simon/anaconda3/bin/python
-- Pythonlibs version : 3.7.0
-- Python library : /home/simon/anaconda3/lib/python3.7
-- Python includes : /home/simon/anaconda3/include/python3.7m
-- Python site-packages: lib/python3.7/site-packages
-- BUILD_CAFFE2_OPS : ON
-- BUILD_SHARED_LIBS : ON
-- BUILD_TEST : ON
-- USE_ASAN : OFF
-- USE_CUDA : 1
-- CUDA static link : 0
-- USE_CUDNN : ON
-- CUDA version : 10.0
-- cuDNN version : 7.3.1
-- CUDA root directory : /usr/local/cuda
-- CUDA library : /usr/lib/x86_64-linux-gnu/libcuda.so
-- cudart library : /usr/local/cuda/lib64/libcudart_static.a;-pthread;dl;/usr/lib/x86_64-linux-gnu/librt.so
-- cublas library : /usr/local/cuda/lib64/libcublas.so
-- cufft library : /usr/local/cuda/lib64/libcufft.so
-- curand library : /usr/local/cuda/lib64/libcurand.so
-- cuDNN library : /usr/local/cuda/lib64/libcudnn.so.7
-- nvrtc : /usr/local/cuda/lib64/libnvrtc.so
-- CUDA include path : /usr/local/cuda/include
-- NVCC executable : /usr/local/cuda/bin/nvcc
-- CUDA host compiler : /usr/bin/cc
-- USE_TENSORRT : OFF
-- USE_ROCM : OFF
-- USE_EIGEN_FOR_BLAS :
-- USE_FFMPEG : OFF
-- USE_GFLAGS : OFF
-- USE_GLOG : OFF
-- USE_LEVELDB : OFF
-- USE_LITE_PROTO : OFF
-- USE_LMDB : OFF
-- USE_METAL : OFF
-- USE_MKL :
-- USE_MOBILE_OPENGL : OFF
-- USE_NCCL : ON
-- USE_SYSTEM_NCCL : ON
-- USE_NERVANA_GPU : OFF
-- USE_NNPACK : 1
-- USE_OBSERVERS : ON
-- USE_OPENCL : OFF
-- USE_OPENCV : OFF
-- USE_OPENMP : OFF
-- USE_PROF : OFF
-- USE_REDIS : OFF
-- USE_ROCKSDB : OFF
-- USE_ZMQ : OFF
-- USE_DISTRIBUTED : ON
-- USE_MPI : OFF
-- USE_GLOO : ON
-- USE_GLOO_IBVERBS : OFF
-- Public Dependencies : Threads::Threads;caffe2::mkl
-- Private Dependencies : nnpack;cpuinfo;gloo;gloo;aten_op_header_gen;onnxifi_loader;rt;gcc_s;gcc;dl
-- Configuring done
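
The mismatch reported in this summary (USE_NCCL is ON, yet nothing NCCL-related appears under Private Dependencies) can be checked mechanically. A small sketch that scans a saved CMake summary for both fields. The function names `parse_summary` and `nccl_status` are hypothetical, the `key : value` line format is assumed from the summary in this post, and the substring match on "nccl" is a guess at how the dependency would be named when it is present:

```python
def parse_summary(text):
    """Map each 'key : value' summary line to its value, ignoring leading dashes."""
    fields = {}
    for line in text.splitlines():
        # Strip the "-- " prefix (or the en dash some pages render it as).
        key, sep, value = line.lstrip("-\u2013 ").partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def nccl_status(text):
    """Return (USE_NCCL is on, some private dependency mentions nccl)."""
    fields = parse_summary(text)
    enabled = fields.get("USE_NCCL", "").upper() in {"ON", "1"}
    deps = fields.get("Private Dependencies", "").split(";")
    linked = any("nccl" in dep.lower() for dep in deps)
    return enabled, linked


# Abbreviated from the summary in this post.
sample = """\
-- USE_NCCL : ON
-- USE_SYSTEM_NCCL : ON
-- Private Dependencies : nnpack;cpuinfo;gloo;gloo;aten_op_header_gen;onnxifi_loader;rt;gcc_s;gcc;dl
"""
print(nccl_status(sample))  # (True, False)
```

A result of `(True, False)` corresponds to the situation described here: the flag is set, but no NCCL library ends up among the linked dependencies.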

Could you please share the full build output?

Hi Deepali Patel,
Sorry, the full output is too long to post here.
Could you leave an email address so that I can send you the full build output?
Thank you.

NCCL-related build output:
Building wheel torch-1.0.0a0
~/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl ~/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party ~/Desktop/pytorch-scripts-1.0rc1/pytorch/build
-- Set NVCC_GENCODE for building NCCL: -gencode=arch=compute_30,code=sm_30;-gencode=arch=compute_35,code=sm_35;-gencode=arch=compute_50,code=sm_50;-gencode=arch=compute_52,code=sm_52;-gencode=arch=compute_60,code=sm_60;-gencode=arch=compute_61,code=sm_61;-gencode=arch=compute_70,code=sm_70;-gencode=arch=compute_70,code=compute_70
-- Build files have been written to: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl
Scanning dependencies of target nccl
[100%] Generating lib/libnccl.so
Grabbing src/nccl.h > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/include/nccl.h
Compiling src/libwrap.cu > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/obj/libwrap.o
Compiling src/core.cu > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/obj/core.o
Compiling src/all_gather.cu > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/obj/all_gather.o
Compiling src/all_reduce.cu > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/obj/all_reduce.o
Compiling src/broadcast.cu > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/obj/broadcast.o
Compiling src/reduce.cu > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/obj/reduce.o
Compiling src/reduce_scatter.cu > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/obj/reduce_scatter.o
Linking libnccl.so.1.3.5 > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/lib/libnccl.so.1.3.5
Archiving libnccl_static.a > /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/build/nccl/lib/libnccl_static.a
[100%] Built target nccl
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/nccl.h
-- Found NCCL: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include
-- Determining NCCL version from the header file: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/nccl.h
-- Found NCCL (include: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include, library: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/libnccl.so)
-- Determining NCCL version from the header file: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/nccl.h
-- Found NCCL (include: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include, library: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/libnccl.so)
[ 10%] Building NVCC (Device) object third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir/nccl/gloo_cuda_generated_nccl.cu.o
[ 79%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/contrib/nccl/cuda_nccl_gpu.cc.o
[ 79%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/contrib/nccl/cuda_nccl_op_gpu.cc.o
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/caffe2/contrib/nccl
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/caffe2/contrib/nccl/cuda_nccl_gpu.h
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/python3.6/site-packages/caffe2/contrib/nccl
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/python3.6/site-packages/caffe2/contrib/nccl/__init__.py
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/python3.6/site-packages/caffe2/contrib/nccl/nccl_ops_test.py
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/python3.6/site-packages/caffe2/contrib/nccl/CMakeFiles
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/python3.6/site-packages/caffe2/CMakeFiles/caffe2_gpu.dir/contrib/nccl
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/torch/csrc/cuda/python_nccl.h
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/torch/csrc/cuda/nccl.h
-- Installing: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/gloo/cuda_collectives_nccl.h
-- Found NCCL: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include
-- Determining NCCL version from the header file: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/nccl.h
-- Found NCCL (include: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include, library: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/libnccl.so)
-- NCCL_LIBRARIES: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/libnccl.so
-- NCCL_INCLUDE_DIRS: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include
-- Found NCCL, but the NCCL version is either not 2+ or not determinable, will not compile with NCCL distributed backend
-- Found NCCL: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include
-- Determining NCCL version from the header file: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include/nccl.h
-- Found NCCL (include: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/include, library: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/libnccl.so)
-- NCCL_LIBRARIES: /home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/torch/lib/tmp_install/lib/libnccl.so
-- Found NCCL, but the NCCL version is either not 2+ or not determinable, will not compile with NCCL distributed backend
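
The log above links the bundled library as libnccl.so.1.3.5 and then reports that the version is "either not 2+ or not determinable". A rough sketch of that kind of header check, assuming the header exposes its version through `#define NCCL_MAJOR`, `NCCL_MINOR` and `NCCL_PATCH` macros, as NCCL 2.x headers do. This is not PyTorch's actual detection code, and `nccl_version` and `supports_distributed` are hypothetical names:

```python
import re


def nccl_version(header_text):
    """Return (major, minor, patch) from NCCL_* defines, or None if any is missing."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCH"):
        pattern = r"^\s*#\s*define\s+NCCL_%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None  # version is "not determinable"
        parts.append(int(match.group(1)))
    return tuple(parts)


def supports_distributed(header_text):
    """True only when a version could be read and its major number is 2 or higher."""
    version = nccl_version(header_text)
    return version is not None and version[0] >= 2


# Made-up header fragments for illustration only.
v2_header = "#define NCCL_MAJOR 2\n#define NCCL_MINOR 3\n#define NCCL_PATCH 7\n"
no_version_header = "typedef struct ncclComm* ncclComm_t;\n"

print(nccl_version(v2_header), supports_distributed(v2_header))  # (2, 3, 7) True
print(supports_distributed(no_version_header))  # False
```

Running `nccl_version` on the contents of the `nccl.h` installed under `tmp_install/include` would show which of the two cases from the log message applies to this build.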

When I git clone the nccl branch v2.3.7-1 instead, an error occurs when compiling:
Desktop/pytorch-scripts-1.0rc1/pytorch/cmake/public/utils.cmake -DNUM_JOBS=8
CMake Error: The source directory "/home/simon/Desktop/pytorch-scripts-1.0rc1/pytorch/third_party/nccl" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.

Did you find a solution for this?