Torch distributed data-parallel vs Apex distributed data-parallel

Thanks for your reply. I have solved this problem. It was caused by running a partial dataloader only on local_rank=0 for a temporary evaluation. It seems that the dataloaders in all processes must stay in the same state.
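For anyone hitting the same issue, here is a minimal sketch (not the original code) of the pattern that causes the desync and one way to avoid it. The names `ddp_model`, `eval_loader`, and `local_rank` are placeholders I am assuming for illustration:

```python
# Sketch only: assumes a DDP-wrapped model and an extra eval dataloader.
import torch
import torch.distributed as dist

def evaluate_on_rank0(ddp_model, eval_loader, local_rank):
    # Problematic pattern: iterating a dataloader and running forward passes
    # only on local_rank == 0 while the other ranks continue training puts the
    # processes in different states, so the other ranks block at the next
    # collective operation.
    #
    # Safer pattern: either run the evaluation identically on every rank, or,
    # if only rank 0 should evaluate, bypass the DDP wrapper so no gradient
    # synchronization is triggered, then re-synchronize with a barrier.
    if local_rank == 0:
        model = ddp_model.module  # underlying module; no collective calls
        model.eval()
        with torch.no_grad():
            for batch in eval_loader:
                _ = model(batch.cuda(local_rank))
        model.train()
    dist.barrier()  # keep all processes in the same state afterwards
```

The `dist.barrier()` at the end is what keeps the ranks aligned: every process waits there until rank 0 finishes its evaluation, so training resumes with all dataloaders and models in the same state.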