I have a data loader defined as follows:
```python
train_sampler = torch.utils.data.distributed.DistributedSampler(dataset_train)
train_loader = torch.utils.data.DataLoader(
    dataset_train,
    batch_size=per_batch_size,
    shuffle=(train_sampler is None),
    num_workers=workers,
    pin_memory=True,
    sampler=train_sampler,
    drop_last=DROP_LAST,
)
```
During training I use 4 GPUs, and the training loop looks like this:
```python
train_sampler.set_epoch(epoch)
DISP_FREQ = 100  # display every 100 batches
batch = 0        # batch index
inputs_from_all_gpus = []
for inputs, labels in tqdm(iter(train_loader)):
    # process inputs and labels
    # ------------- combine inputs from each GPU into inputs_from_all_gpus -------------
```
My question is: how can I combine the inputs from each GPU into `inputs_from_all_gpus` once every distributed process has finished its step?
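For context, here is a minimal sketch of what I think the gathering step might look like, using `torch.distributed.all_gather` (I am not sure this is the right primitive, which is why I am asking; the helper name `gather_inputs` and the `local_rank` variable are just placeholders from my setup):

```python
import torch
import torch.distributed as dist

def gather_inputs(inputs):
    """Gather the per-GPU input batch from every process onto every rank.

    Assumes the default process group is already initialized and that
    `inputs` has the same shape on every rank (drop_last=True helps here).
    """
    world_size = dist.get_world_size()
    # Pre-allocate one receive buffer per rank.
    gathered = [torch.zeros_like(inputs) for _ in range(world_size)]
    dist.all_gather(gathered, inputs)
    # Concatenate along the batch dimension: per_batch_size * world_size samples.
    return torch.cat(gathered, dim=0)

# Hypothetical usage inside the training loop:
# inputs_from_all_gpus = gather_inputs(inputs.cuda(local_rank))
```

Is something like this the intended way to do it, or is there a better approach (e.g. gathering only on rank 0)?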