Is there an easy way to access the number of samples returned by a specific process's dataloader (which is distributed for multi-GPU training)?
I am training a model with torch.dist across multiple GPUs and need the number of examples per rank in the dataloader (distributed via torch.utils.data.distributed.DistributedSampler). I can't use len(dataloader), because that returns the number of batches per rank, and len(dataloader.dataset) returns the size of the whole dataset, i.e. the total number of examples across all ranks.
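For reference, here is a minimal sketch of the setup I mean. The dataset size, batch size, and world size are made up for illustration, and num_replicas/rank are passed to DistributedSampler explicitly so the snippet runs standalone without a torch.distributed launch:

import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Hypothetical dataset of 1000 examples (illustrative only)
dataset = TensorDataset(torch.randn(1000, 8))

# Normally num_replicas/rank come from the process group; hardcoded
# here (2 ranks, rank 0) just so the example is self-contained
sampler = DistributedSampler(dataset, num_replicas=2, rank=0)
dataloader = DataLoader(dataset, batch_size=32, sampler=sampler)

print(len(dataloader))          # 16 -> batches per rank, not samples
print(len(dataloader.dataset))  # 1000 -> whole dataset, all ranks combined

Neither of these printed values is the per-rank sample count I'm after (here that would be 500).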