Cannot find NCCL libnccl-net.so file

NCCL cannot find libnccl-net.so, and running nccl --version does not give any output. What could be the reason?

CUDA VERSION = 12.0
PYTHON VERSION = 3.8
TORCH VERSION = 2.0.1

You might want to install it in your setup then. More information about net plugins can be found here.

nccl is not a binary command, so I'm unsure what exactly you are trying to run. NCCL is a library used inside PyTorch, and you cannot execute nccl --version as a command in your terminal.

You can use something like:

import torch
torch.cuda.nccl.version() # for me gives → (2, 14, 3)

to get the nccl version being used by torch.
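
If you want to check whether libnccl-net.so exists anywhere on your library path, here is a minimal sketch. It assumes a Linux setup where the plugin would sit in a directory listed in LD_LIBRARY_PATH or in a few common system directories. The helper name find_shared_lib and the extra directories are my own choices for illustration, not something NCCL or PyTorch provides, and the scan does not reproduce the full dynamic-loader search.

```python
import os
from pathlib import Path


def find_shared_lib(name, extra_dirs=()):
    """Return every path where `name` exists in LD_LIBRARY_PATH or extra_dirs."""
    # Hypothetical helper: a plain directory scan only. The real dynamic loader
    # also consults ld.so.cache, rpath entries, etc., which are ignored here.
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    dirs.extend(str(d) for d in extra_dirs)

    hits = []
    for d in dirs:
        candidate = Path(d) / name
        if candidate.is_file() and str(candidate) not in hits:
            hits.append(str(candidate))
    return hits


if __name__ == "__main__":
    # The extra directories are assumptions about a typical Linux layout.
    matches = find_shared_lib("libnccl-net.so", extra_dirs=["/usr/lib", "/usr/local/lib"])
    print(matches or "libnccl-net.so not found in the scanned directories")
```

An empty result only means the file is not in the scanned directories. As far as I know, NCCL falls back to its built-in network transport when no net plugin is found, so the message is usually informational unless you actually need a plugin.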

@prt I don't run nccl in the terminal. It just says that a certain file is not used, and no matter how many times I install PyTorch, it can't find this file in the expected path. Can I fix the problem if I manually install PyTorch from GitHub?

Tried and tested. The problem is that some files are missing.