NCCL cannot find libnccl-net.so, and running nccl --version does not give any output. What could be the reason?
CUDA VERSION = 12.0
PYTHON VERSION = 3.8
TORCH VERSION = 2.0.1
You might want to install the net plugin in your setup then. More information about net plugins can be found here.
nccl is not a binary command, so I am unsure what exactly you are trying to run. NCCL is a library used inside PyTorch, and you cannot execute nccl --version as a command in your terminal.
You can use something like:
import torch
torch.cuda.nccl.version() # for me gives → (2, 14, 3)
to get the NCCL version being used by torch.
@prt I don't run nccl in the terminal. It just says that a certain file is not used, and no matter how many times I install PyTorch, it can't find this file in the expected path. Can I fix the problem if I manually install PyTorch from GitHub?
Tried and tested. The problem is that some files are missing.
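To check whether the file is really missing, a small script can look for libnccl-net.so in the directories listed in LD_LIBRARY_PATH. This is only a rough sketch: the helper names find_library and ld_library_dirs are made up for illustration, and the dynamic loader also searches system default locations that this script does not cover, so an empty result is a hint rather than proof. As far as I know, libnccl-net.so is an optional external network plugin, and NCCL normally falls back to its built-in transports when no plugin is found, so the message may be informational rather than an error.

```python
import os
from pathlib import Path


def find_library(name, search_dirs):
    """Return every regular file called `name` found directly in `search_dirs`."""
    hits = []
    for directory in search_dirs:
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        candidate = Path(directory) / name
        if candidate.is_file():
            hits.append(str(candidate))
    return hits


def ld_library_dirs(env=None):
    """Split LD_LIBRARY_PATH into a list of directories (empty list if unset)."""
    env = os.environ if env is None else env
    return [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]


if __name__ == "__main__":
    found = find_library("libnccl-net.so", ld_library_dirs())
    if found:
        print("Found plugin candidates:", found)
    else:
        print("libnccl-net.so not found on LD_LIBRARY_PATH")
```

If nothing is found and training still runs, the plugin is probably just not installed. Running the job with the environment variable NCCL_DEBUG=INFO set should show which network implementation NCCL ends up using.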