I am running DistributedDataParallel on PyTorch 1.1, and with both the nccl and gloo backends my code freezes on the following line:
model = torch.nn.parallel.DistributedDataParallel(model)
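In case it helps, here is a minimal sketch of the kind of setup I am using (simplified; the master address, port, and model are placeholders, and the real values come from my launch script):

```python
import os
import torch
import torch.distributed as dist

def setup_ddp(rank: int, world_size: int) -> torch.nn.Module:
    # Placeholder rendezvous settings; in my real run these come from the launcher.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    # Every process must call this with a distinct rank and the same world_size.
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)

    torch.cuda.set_device(rank)  # one GPU per process
    model = torch.nn.Linear(10, 10).cuda(rank)  # stand-in for my actual model

    # This is the line that freezes: the constructor synchronizes all ranks and
    # broadcasts rank 0's parameters, so it blocks until every process reaches it.
    return torch.nn.parallel.DistributedDataParallel(model, device_ids=[rank])
```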
I am not using the built-in DataLoader, so the known issue with num_workers not being set to 0 should not apply here.
Is there anywhere else this problem could be coming from?
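If it would help narrow things down, I can add a barrier probe right before the freezing line to check whether the rendezvous itself completes (a sketch, assuming the process group is initialized as in the code above):

```python
# Probe placed immediately before the DDP constructor.
dist.barrier()  # if this also hangs, the rendezvous itself is broken
                # (mismatched world_size, a missing rank, a blocked port, ...)
print(f"rank {dist.get_rank()} passed the barrier", flush=True)

# If the barrier passes but this still freezes, the hang is in DDP's
# initial parameter broadcast rather than in process-group setup.
model = torch.nn.parallel.DistributedDataParallel(model)
```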