In a multi-GPU DataLoader, if I set drop_last=False and the last batch cannot be evenly distributed across the GPUs, what will PyTorch do?
The split across GPUs is done by the DistributedSampler, not by the DataLoader's drop_last, which only controls whether the last (smaller) batch within each rank is kept. With drop_last=True the sampler will drop the tail of the samples so every rank gets the same count, and with drop_last=False (the default) it pads the index list by repeating samples from the start until it is evenly divisible, as seen here.
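A minimal sketch of both behaviours, assuming a toy setup of 10 samples split across 4 ranks (no process group is needed just to inspect the index split):

```python
import torch
from torch.utils.data import TensorDataset, DistributedSampler

# Hypothetical example: 10 samples, 4 "GPUs" (ranks).
dataset = TensorDataset(torch.arange(10))

for rank in range(4):
    # drop_last=False (default): indices are padded by repeating samples
    # from the start, so every rank gets ceil(10 / 4) = 3 indices.
    padded = DistributedSampler(dataset, num_replicas=4, rank=rank,
                                shuffle=False, drop_last=False)
    # drop_last=True: the tail is dropped, so every rank gets
    # floor(10 / 4) = 2 indices.
    dropped = DistributedSampler(dataset, num_replicas=4, rank=rank,
                                 shuffle=False, drop_last=True)
    print(rank, list(padded), list(dropped))
```

With drop_last=False, ranks 2 and 3 end up with repeated indices (0 and 1); with drop_last=True, indices 8 and 9 are dropped entirely.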
I understand, thank you!