What is the local batch size when using DistributedSampler?

When doing `data_loader = DataLoader(my_dataset, sampler=DistributedSampler(my_dataset), batch_size=N)` in a DDP distributed training script, how many records does each GPU/worker/process/script (unsure what the most accepted name is) receive at each iteration?

Does each DDP GPU receive N records, or N/gpu_count?

The batch size on each local worker would be `N_global_batch_size // N_workers` — the `batch_size` you pass to the DataLoader is per process, not global.

Thanks! And `N_global_batch_size` would be the value you set as `batch_size` in the DataLoader?

No. `N_global_batch_size` is the effective batch size across all processes. You need to manually set `batch_size=N_global_batch_size // N_workers` in each process's DataLoader.
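A minimal sketch of this, runnable in a single process by passing `num_replicas` and `rank` explicitly to `DistributedSampler` (in a real DDP script these are inferred from the process group). The names `N_global_batch_size` and `N_workers` and their values are illustrative:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Illustrative values: desired global batch of 64 split over 4 DDP processes
N_global_batch_size = 64
N_workers = 4

dataset = TensorDataset(torch.arange(1000).float())

# Each DDP process builds its own DataLoader. DistributedSampler partitions
# the dataset across processes; batch_size here is PER PROCESS.
sampler = DistributedSampler(dataset, num_replicas=N_workers, rank=0)
loader = DataLoader(
    dataset,
    sampler=sampler,
    batch_size=N_global_batch_size // N_workers,
)

batch = next(iter(loader))[0]
print(batch.shape[0])  # per-process batch size: 16
```

So each of the 4 processes sees batches of 16, and one optimizer step effectively uses 64 records in total.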
