nccl_reduce_bucket_size is set to a fixed 256MB. Is there any reason it is not a larger number? It seems like increasing it would make reduction faster.