Issues compiling PyTorch from source with the NCCL backend


I downloaded the latest PyTorch source and tried to compile it. Should I set
WITH_SYSTEM_NCCL=0 (along with WITH_NCCL=1, WITH_DISTRIBUTED=1, and WITH_CUDA=1) when I invoke “python setup.py build develop” in order to use the new NCCL APIs? The generated libnccl.so appears to be version 1.3.5 rather than 2+, so THD was compiled without NCCL support.

I also downloaded NCCL 2 from the NVIDIA website, tried WITH_SYSTEM_NCCL=1, and specified NCCL_INCLUDE_DIR, NCCL_LIB_DIR, and NCCL_ROOT_DIR, but THD was still compiled without NCCL support. Can anyone help with this? I have been stuck on this for a few days. Thanks a lot!
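
One check that may help narrow this down is confirming which NCCL version the header in NCCL_INCLUDE_DIR actually declares. The sketch below is only an illustration, not part of the PyTorch build: the helper name nccl_header_version is hypothetical, and it assumes nccl.h defines the macros NCCL_MAJOR, NCCL_MINOR, and NCCL_PATCH (as far as I know, NCCL 2 headers do). If the macros are missing it returns None, which may point to an older header being picked up.

```python
import re
from pathlib import Path


def nccl_header_version(include_dir):
    """Return (major, minor, patch) read from nccl.h, or None if not found.

    Assumption: the header declares its version through the macros
    NCCL_MAJOR, NCCL_MINOR and NCCL_PATCH.
    """
    text = Path(include_dir, "nccl.h").read_text()
    parts = []
    for name in ("NCCL_MAJOR", "NCCL_MINOR", "NCCL_PATCH"):
        # Match lines such as "#define NCCL_MAJOR 2"
        match = re.search(
            r"^\s*#\s*define\s+" + name + r"\s+(\d+)", text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)
```

Calling nccl_header_version with the same directory that was passed as NCCL_INCLUDE_DIR should show whether the build is being pointed at a 2.x header or at something older.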


Did you figure it out? I’m also interested.