PyTorch Forums
In a multi-GPU DDP environment, if the loss on one rank is NaN while the others are normal, could this cause the all-reduce to hang?
distributed
ptrblck
November 12, 2025, 7:31pm
No, as answered in your cross post.