What’s the meaning of RuntimeError: Expected a 'N2at13CUDAGeneratorE' but found 'PN2at9GeneratorE'
Traceback (most recent call last):
  File "train_cosine_iterations_distributed.py", line 335, in <module>
    batch_iterator = iter(train_loader)
  File "/home/ycg/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "/home/ycg/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 584, in __init__
    self._put_indices()
  File "/home/ycg/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 646, in _put_indices
    indices = next(self.sample_iter, None)
  File "/home/ycg/anaconda3/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 160, in __iter__
    for idx in self.sampler:
  File "/home/ycg/anaconda3/lib/python3.6/site-packages/torch/utils/data/distributed.py", line 45, in __iter__
    indices = torch.randperm(len(self.dataset), generator=g).tolist()
RuntimeError: Expected a 'N2at13CUDAGeneratorE' but found 'PN2at9GeneratorE'
def __iter__(self):
    # deterministically shuffle based on epoch
    g = torch.Generator()
    g.manual_seed(self.epoch)
    indices = torch.randperm(len(self.dataset), generator=g).tolist()
    # add extra samples to make it evenly divisible
    indices += indices[:(self.total_size - len(indices))]
    assert len(indices) == self.total_size
    # subsample
    indices = indices[self.rank:self.total_size:self.num_replicas]
    assert len(indices) == self.num_samples
    return iter(indices)
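For anyone trying to follow what the __iter__ above is doing, here is a rough torch-free sketch of the same shuffle / pad / subsample steps. It is not the library code: random.Random(epoch) is swapped in for torch.Generator plus torch.randperm, the function name distributed_indices is made up for illustration, and the formula for num_samples (ceil of dataset length over number of replicas) is an assumption, since the quoted snippet does not show where num_samples and total_size come from.

```python
import math
import random


def distributed_indices(dataset_len, num_replicas, rank, epoch):
    """Return the indices one replica would see (illustrative sketch only)."""
    # Assumption: each replica gets ceil(dataset_len / num_replicas) samples.
    num_samples = int(math.ceil(dataset_len / num_replicas))
    total_size = num_samples * num_replicas

    # Deterministic shuffle keyed on the epoch (stand-in for torch.randperm
    # with a seeded generator; the actual permutation will differ).
    indices = list(range(dataset_len))
    random.Random(epoch).shuffle(indices)

    # Repeat the first few indices so the list divides evenly.
    indices += indices[:(total_size - len(indices))]
    assert len(indices) == total_size

    # Every num_replicas-th index, starting at this replica's rank.
    indices = indices[rank:total_size:num_replicas]
    assert len(indices) == num_samples
    return indices


if __name__ == "__main__":
    for rank in range(4):
        print(rank, distributed_indices(10, 4, rank, epoch=0))
```

Every replica seeds the shuffle with the same epoch, so all of them build the same permutation and then take disjoint strided slices of it. The failing line in the traceback is the shuffle step, not the padding or the subsampling.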
$ python bug.py
Traceback (most recent call last):
  File "bug.py", line 5, in <module>
    x.uniform_(-1.0, 1.0, generator=generator)
RuntimeError: Expected a 'N2at13CUDAGeneratorE' but found 'PN2at9GeneratorE'
Same thing happens if I use torch.Generator. By the way, is torch.Generator documented anywhere?
Bro, have you solved this problem? I get this error when I use torch.nn.parallel.DistributedDataParallel.
The problem comes from sampler = DistributedSampler(dataset) and
data.DataLoader(dataset,…, sampler=sampler)
Here is the log.
Traceback (most recent call last):
  File "train.py", line 152, in <module>
    sampler=sampler))
  File "/home/xdjf/.conda/envs/py36torch041/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 501, in __iter__
    return _DataLoaderIter(self)
  File "/home/xdjf/.conda/envs/py36torch041/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 297, in __init__
    self._put_indices()
  File "/home/xdjf/.conda/envs/py36torch041/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in _put_indices
    indices = next(self.sample_iter, None)
  File "/home/xdjf/.conda/envs/py36torch041/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 138, in __iter__
    for idx in self.sampler:
  File "/home/xdjf/.conda/envs/py36torch041/lib/python3.6/site-packages/torch/utils/data/distributed.py", line 42, in __iter__
    indices = list(torch.randperm(len(self.dataset), generator=g))
RuntimeError: Expected a 'N2at13CUDAGeneratorE' but found 'PN2at9GeneratorE'
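On what the message itself means: the two quoted names look like C++ type names in Itanium ABI mangled form, where N...E wraps a nested name, each identifier is prefixed by its length, and a leading P marks a pointer. Below is a minimal sketch of a decoder for just that subset. demangle_simple is a made-up helper, not a PyTorch or standard-library function, and it will reject anything more complicated than these two names.

```python
import re


def demangle_simple(name):
    """Decode P* + N<len><id>...E mangled names (tiny illustrative subset)."""
    # Count leading 'P' markers; each one means "pointer to".
    pointers = 0
    while name.startswith("P"):
        pointers += 1
        name = name[1:]

    if not (name.startswith("N") and name.endswith("E")):
        raise ValueError("unsupported mangled name: %r" % name)

    body = name[1:-1]
    parts = []
    pos = 0
    while pos < len(body):
        # Each component is a decimal length followed by that many characters.
        match = re.match(r"\d+", body[pos:])
        if match is None:
            raise ValueError("expected a length at offset %d" % pos)
        length = int(match.group())
        pos += len(match.group())
        part = body[pos:pos + length]
        if len(part) != length:
            raise ValueError("truncated identifier at offset %d" % pos)
        parts.append(part)
        pos += length

    return "::".join(parts) + "*" * pointers


if __name__ == "__main__":
    print(demangle_simple("N2at13CUDAGeneratorE"))
    print(demangle_simple("PN2at9GeneratorE"))
```

Read that way, the error says a call expected an at::CUDAGenerator but was handed a pointer to a plain at::Generator. That suggests, though the thread does not confirm it, that the random call went down a CUDA code path while the generator passed in was the CPU one created by torch.Generator().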