PyTorch Forums
ncclInvalidUsage of torch.nn.parallel.DistributedDataParallel
fangwei123456
(Fangwei123456)
November 1, 2021, 9:33am
I solved this problem by changing my code to:
net.to(f'cuda:{args.local_rank}')
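For anyone landing here later, a minimal sketch of how that line might fit into a DDP setup. The post does not say what the original call was, so this is an assumption about the context: one common cause of ncclInvalidUsage is several processes ending up on the same GPU, and placing each process on `cuda:{local_rank}` avoids that. The helper names `resolve_local_rank`, `device_for`, and `to_ddp` are hypothetical and not from the original post; `LOCAL_RANK` is the environment variable set by `torchrun`.

```python
import os


def resolve_local_rank(cli_value=None):
    """Use the --local_rank CLI value if given, else the LOCAL_RANK env var."""
    if cli_value is not None:
        return int(cli_value)
    return int(os.environ.get("LOCAL_RANK", 0))


def device_for(local_rank):
    """Device string for this process: one GPU per local rank."""
    return f"cuda:{local_rank}"


def to_ddp(net, local_rank):
    """Move net to its own GPU and wrap it in DDP.

    Assumes torch.distributed.init_process_group() has already been called.
    """
    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    # Make this GPU the default device for the current process.
    torch.cuda.set_device(local_rank)
    # Same call as in the post, with the device taken from the local rank.
    net.to(device_for(local_rank))
    return DDP(net, device_ids=[local_rank], output_device=local_rank)
```

In a training script this would be used roughly as `net = to_ddp(net, resolve_local_rank(args.local_rank))`, after the process group has been initialized.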