RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'

Hello,

I’m getting a “RuntimeError: Expected a ‘cuda’ device type for generator but found ‘cpu’” error when I try to iterate over my DataLoader, created as follows:

transform = transforms.Compose([
    transforms.Resize(image_size),
    transforms.CenterCrop(image_size),
    transforms.ToTensor(),
    transforms.Normalize(0.5, 0.5),
])
dataset = dset.MNIST(root=dataroot, train=True, download=True, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)

The problem goes away if I set shuffle to False, but I would like to keep it True. The solutions I’ve found involve changing some PyTorch source code, which I would like to avoid.

Thanks!


Hi,
Can you please see if this works:

torch.utils.data.DataLoader(
    ...,
    generator=torch.Generator(device='cuda'),
)
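
Applied to the snippet from your question, that would look like this (keeping your dataset and variable names):

dataloader = torch.utils.data.DataLoader(
    dataset,
    batch_size=batch_size,
    shuffle=True,
    num_workers=workers,
    generator=torch.Generator(device='cuda'),
)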

It works!

Thank you very much! 🙂

I don’t understand what the issue in the posted code snippet was, as it’s working for me locally and isn’t even using the GPU. Which part of the code raised the error?

@srishti-git1110, have you seen this error being raised before in a CPU-only DataLoader?

I’m sorry, I didn’t explain myself properly. The problem only arose when using the GPU as the device; everything worked fine on the CPU.

Hi @ptrblck ,
No, I’ve never run into such an error while using the CPU.
And yes, the posted code doesn’t seem to produce any error for me either.

I just assumed the OP was facing the error when using the GPU, hence posted that as the solution. Sorry for not being clear - I should’ve mentioned it there.

Not at all. My post wasn’t meant as criticism; you guessed it perfectly right, and @Jorge_Garcia clarified that the GPU was indeed used.

I was just concerned that this might be a known issue of CUDA errors being raised from a CPU-only DataLoader, but it turns out the posted code was missing some parts. 😉
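
For future readers, here is a minimal repro sketch of the likely missing part. This is an assumption about the setup, not the OP’s confirmed code: making CUDA the global default device causes the sampler’s torch.randperm(...) to allocate on the GPU while the DataLoader’s default generator stays on the CPU.

import torch

# Assumed missing piece: CUDA set as the global default device
torch.set_default_device('cuda')  # or, on older versions: torch.set_default_tensor_type(torch.cuda.FloatTensor)

g = torch.Generator()  # defaults to a CPU generator
# randperm now allocates on CUDA via the default device, while the
# generator lives on the CPU, reproducing the reported mismatch:
idx = torch.randperm(10, generator=g)
# RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'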


Hello. I’m facing the same issue, but it looks like the suggestion above will not work:

/local_disk0/.ephemeral_nfs/envs/pythonEnv-4aa41058-b8da-44ab-8b2d-453f51f9b1ec/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
    575             shared_rng = torch.Generator()
    576             shared_rng.manual_seed(self._shared_seed)
--> 577             self._dataset = torch.utils.data.graph_settings.apply_random_seed(self._dataset, shared_rng)
    578         self._dataset_kind = loader._dataset_kind
    579         self._IterableDataset_len_called = loader._IterableDataset_len_called

/local_disk0/.ephemeral_nfs/envs/pythonEnv-4aa41058-b8da-44ab-8b2d-453f51f9b1ec/lib/python3.8/site-packages/torch/utils/data/graph_settings.py in apply_random_seed(datapipe, rng)
    151 
    152     for pipe in random_datapipes:
--> 153         random_seed = int(torch.empty((), dtype=torch.int64).random_(generator=rng).item())
    154         pipe.set_seed(random_seed)
    155 

/local_disk0/.ephemeral_nfs/envs/pythonEnv-4aa41058-b8da-44ab-8b2d-453f51f9b1ec/lib/python3.8/site-packages/torch/utils/_device.py in __torch_function__(self, func, types, args, kwargs)
     60         if func in _device_constructors() and kwargs.get('device') is None:
     61             kwargs['device'] = self.device
---> 62         return func(*args, **kwargs)
     63 
     64 # NB: This is directly called from C++ in torch/csrc/Device.cpp

RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'

Basically, here you can see that the generator is initialized inside the function right before the seeds are generated:

453f51f9b1ec/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
    575             shared_rng = torch.Generator()
    576             shared_rng.manual_seed(self._shared_seed)
--> 577             self._dataset = torch.utils.data.graph_settings.apply_random_seed(self._dataset, shared_rng)

In my case it looks more like a bug in the backward compatibility between torchdata’s datapipes and the DataLoader.
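
The torch/utils/_device.py frame in the traceback suggests a global CUDA default device is active (e.g. via torch.set_default_device('cuda')). Assuming that is the cause, and on PyTorch >= 2.0, where torch.device also works as a context manager for the default device, a workaround sketch is to scope data loading back to the CPU (reusing the dataloader from the failing code):

import torch

# Inside this block the DataLoader's internal torch.empty(...) calls stay
# on the CPU, matching its internally created CPU torch.Generator().
with torch.device('cpu'):
    for batch in dataloader:
        ...  # move the batch to the GPU here if needed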

Same problem; it seems the generator will always be initialized on the CPU.

Alright.
If you define the loader like this:

dataloader = DataLoader(
    image_dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    pin_memory=True,
    shuffle=True,
    generator=torch.Generator(device='cuda:0'),
)

and try to use next(iter(dataloader)), you get this error:

/usr/local/lib/python3.10/dist-packages/torch/utils/data/sampler.py in __iter__(self)
    163         else:
    164             for _ in range(self.num_samples // n):
--> 165                 yield from map(int, torch.randperm(n, generator=generator).numpy())
    166             yield from map(int, torch.randperm(n, generator=generator)[:self.num_samples % n].numpy())
    167 

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

It seems like sampler.py should call .cpu() there before .numpy().
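
For instance, a custom sampler along those lines (a sketch with a made-up class name, not a patch to the stock sampler.py) could draw the permutation with a CUDA generator and copy it back to the host before yielding indices:

import torch
from torch.utils.data import Sampler

class CudaRandomSampler(Sampler[int]):
    # Hypothetical sampler: shuffles on the GPU, then moves the indices
    # to the CPU so they can be converted to Python ints.
    def __init__(self, data_source, generator=None):
        self.data_source = data_source
        self.generator = generator if generator is not None else torch.Generator(device='cuda')

    def __iter__(self):
        n = len(self.data_source)
        perm = torch.randperm(n, generator=self.generator, device='cuda')
        yield from map(int, perm.cpu().numpy())

    def __len__(self):
        return len(self.data_source)

# Usage: pass it via sampler= and leave shuffle unset (the two are mutually exclusive):
# dataloader = DataLoader(image_dataset, batch_size=batch_size, sampler=CudaRandomSampler(image_dataset))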

Otherwise, if we don’t specify a generator, or we use the ‘cpu’ device, it produces the error that is the topic of this discussion.

So, what should I do to solve the problem?


I’m having the same error. Did you manage to find a solution?
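
In case anyone is still stuck: a workaround sketch, assuming a standard map-style dataset and that the root cause is a globally set CUDA default device, is to avoid the global default device entirely, keep the DataLoader (and its generator) on the CPU, and move each batch to the GPU explicitly:

import torch
from torch.utils.data import DataLoader

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# No global default device and no CUDA generator: the sampler shuffles on
# the CPU (so its .numpy() call works) and only the batches go to the GPU.
dataloader = DataLoader(
    image_dataset,  # the dataset variable from the snippet above
    batch_size=batch_size,
    num_workers=num_workers,
    pin_memory=True,
    shuffle=True,
)

for images, labels in dataloader:
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    # ... training step ...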

Thank you very much!!!