PyTorch, CUDA, and NCCL

I’d like to upgrade NCCL on my system to 2.10.3 because it supports bfloat16, which I’d like to use. I don’t know how PyTorch, CUDA, and NCCL depend on one another. Is PyTorch bound to particular versions of NCCL, as suggested by this issue? Can I use a newer version of NCCL without upgrading either PyTorch or CUDA?

The PyTorch binaries ship with a statically linked NCCL, built from the NCCL submodule in the PyTorch repository. The current CUDA 11.3 nightly binary already uses NCCL 2.10.3, so you could use that.
On the other hand, if you want a specific NCCL version that isn’t shipped in a binary release, you could build PyTorch from source and use your locally installed NCCL via:

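# USE_SYSTEM_NCCL=1 makes the build link against the local NCCL instead of the bundled submodule
NCCL_INCLUDE_DIR="/usr/include/" \
    NCCL_LIB_DIR="/usr/lib/" \
    USE_SYSTEM_NCCL=1 \
    python setup.py install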
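
Whichever route you take, it may help to confirm which NCCL version your PyTorch build actually reports. The sketch below is only an illustration: the helper names normalize_nccl_version and meets_target are made up for this example, and it assumes that torch.cuda.nccl.version() returns either a tuple such as (2, 10, 3) or, on older builds, NCCL's integer version code (2708 for 2.7.8, 21003 for 2.10.3).

```python
# Hypothetical helpers for illustration; not part of PyTorch.

TARGET = (2, 10, 3)  # the NCCL version asked about in this thread


def normalize_nccl_version(raw):
    """Return (major, minor, patch) from a tuple or an integer version code."""
    if isinstance(raw, tuple):
        # Some builds may append a suffix as a fourth element; keep the first three.
        return tuple(int(part) for part in raw[:3])
    raw = int(raw)
    # Assumption: NCCL encodes versions up to 2.8 as major*1000 + minor*100 + patch,
    # and later versions as major*10000 + minor*100 + patch.
    if raw >= 10000:
        major, rest = divmod(raw, 10000)
    else:
        major, rest = divmod(raw, 1000)
    minor, patch = divmod(rest, 100)
    return (major, minor, patch)


def meets_target(raw, target=TARGET):
    """True if the reported NCCL version is at least the target version."""
    return normalize_nccl_version(raw) >= target


if __name__ == "__main__":
    try:
        import torch.cuda.nccl as nccl
        raw = nccl.version()
    except (ImportError, AttributeError, RuntimeError) as exc:
        # Raised when PyTorch is missing or was built without NCCL support.
        print(f"Could not query NCCL from PyTorch: {exc}")
    else:
        version = ".".join(str(p) for p in normalize_nccl_version(raw))
        print(f"PyTorch reports NCCL {version}; meets {TARGET}: {meets_target(raw)}")
```

If the reported version is older than the one you installed locally, the build most likely still linked against the bundled NCCL rather than the system one.
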
Thanks. Do you know when the current CUDA 11.3 nightly will become official?

We are currently targeting PyTorch 1.10.0 as the stable release that will use the CUDA 11.3 runtime.

Thank you, ptrblock.