RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper__index_select) while using the nn.DataParallel class

Ok, yes, I thought so too, but it felt like some sort of bug in this class. I eventually solved it by abandoning nn.DataParallel entirely and using nn.parallel.DistributedDataParallel instead. It was a little painful and not as easy as DataParallel, but I'm happy I did it: nn.parallel.DistributedDataParallel seems to be the cleaner approach.
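For anyone making the same switch, here is a minimal sketch of the DDP setup that replaces the `nn.DataParallel` wrapper. The model, dimensions, and single-process `gloo` backend are placeholders so it runs without multiple GPUs; in a real multi-GPU run you would launch one process per GPU (e.g. with `torchrun`), use the `nccl` backend, and pass `device_ids=[rank]`:

```python
# Minimal sketch: nn.parallel.DistributedDataParallel instead of nn.DataParallel.
# Placeholder model/data; gloo backend and world_size=1 so it runs on CPU.
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main(rank: int = 0, world_size: int = 1) -> float:
    # Every process joins the same process group exactly once.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # Pin the model to this process's device. With CUDA this would be
    # torch.device(f"cuda:{rank}"), plus DDP(model, device_ids=[rank]).
    device = torch.device("cpu")
    model = nn.Linear(10, 2).to(device)
    ddp_model = DDP(model)

    # One toy training step; DDP all-reduces gradients across processes.
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
    inputs = torch.randn(4, 10, device=device)
    loss = ddp_model(inputs).sum()
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()
    return float(loss)


if __name__ == "__main__":
    # In a real run, torchrun (or mp.spawn) would call this once per GPU.
    main()
```

Because each process owns exactly one device, the cross-device indexing that trips up `nn.DataParallel` simply cannot happen here.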
