How to solve the error: 'All input tensors must be on the same device. Received cuda:0 and cpu'

The code is:

# build a dataset of synthetic features and wrap it in a DataLoader
syn_dataset = train_dataset.create_syn_dataset()
final_dataset = ZSLDataset('awa1', n_train, n_test, train_agent,
        gzsl=gzsl, train=True, synthetic=True, syn_dataset=syn_dataset)
final_train_generator = DataLoader(final_dataset, **params)

# load a previously trained final classifier if available,
# otherwise train one on the synthetic dataset
model_name = "awa1_final_classifier"
success = train_agent.load_model(model=model_name)
if success:
    print("\nFinal classifier parameters loaded....")
else:
    print("\nTraining the final classifier on the synthetic dataset...")
    for ep in range(1, n_epochs + 1):
        syn_loss = 0
        for idx, (img, label_attr, label_idx) in enumerate(final_train_generator):
            l = train_agent.fit_final_classifier(img, label_attr, label_idx)
            syn_loss += l
        # print the accumulated loss on the synthetic dataset
        print("Loss for epoch: %3d - %.4f" % (ep, syn_loss))

error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-37-12dd6797e446> in <module>
     16     for ep in range(1, n_epochs + 1):
     17         syn_loss = 0
---> 18         for idx, (img, label_attr, label_idx) in enumerate(final_train_generator):
     19             l = train_agent.fit_final_classifier(img, label_attr, label_idx)
     20             syn_loss += l

/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    433         if self._sampler_iter is None:
    434             self._reset()
--> 435         data = self._next_data()
    436         self._num_yielded += 1
    437         if self._dataset_kind == _DatasetKind.Iterable and \

/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _next_data(self)
    473     def _next_data(self):
    474         index = self._next_index()  # may raise StopIteration
--> 475         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    476         if self._pin_memory:
    477             data = _utils.pin_memory.pin_memory(data)

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
     45         else:
     46             data = self.dataset[possibly_batched_index]
---> 47         return self.collate_fn(data)

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
     81             raise RuntimeError('each element in list of batch should be of equal size')
     82         transposed = zip(*batch)
---> 83         return [default_collate(samples) for samples in transposed]
     84 
     85     raise TypeError(default_collate_err_msg_format.format(elem_type))

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py in <listcomp>(.0)
     81             raise RuntimeError('each element in list of batch should be of equal size')
     82         transposed = zip(*batch)
---> 83         return [default_collate(samples) for samples in transposed]
     84 
     85     raise TypeError(default_collate_err_msg_format.format(elem_type))

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
     53             storage = elem.storage()._new_shared(numel)
     54             out = elem.new(storage)
---> 55         return torch.stack(batch, 0, out=out)
     56     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
     57             and elem_type.__name__ != 'string_':

RuntimeError: All input tensors must be on the same device. Received cuda:0 and cpu

I am unable to pinpoint the error. I tried running it on the CPU only and got this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-12dd6797e446> in <module>
     16     for ep in range(1, n_epochs + 1):
     17         syn_loss = 0
---> 18         for idx, (img, label_attr, label_idx) in enumerate(final_train_generator):
     19             l = train_agent.fit_final_classifier(img, label_attr, label_idx)
     20             syn_loss += l

/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    433         if self._sampler_iter is None:
    434             self._reset()
--> 435         data = self._next_data()
    436         self._num_yielded += 1
    437         if self._dataset_kind == _DatasetKind.Iterable and \

/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _next_data(self)
    473     def _next_data(self):
    474         index = self._next_index()  # may raise StopIteration
--> 475         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    476         if self._pin_memory:
    477             data = _utils.pin_memory.pin_memory(data)

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
     45         else:
     46             data = self.dataset[possibly_batched_index]
---> 47         return self.collate_fn(data)

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
     81             raise RuntimeError('each element in list of batch should be of equal size')
     82         transposed = zip(*batch)
---> 83         return [default_collate(samples) for samples in transposed]
     84 
     85     raise TypeError(default_collate_err_msg_format.format(elem_type))

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py in <listcomp>(.0)
     81             raise RuntimeError('each element in list of batch should be of equal size')
     82         transposed = zip(*batch)
---> 83         return [default_collate(samples) for samples in transposed]
     84 
     85     raise TypeError(default_collate_err_msg_format.format(elem_type))

/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
     53             storage = elem.storage()._new_shared(numel)
     54             out = elem.new(storage)
---> 55         return torch.stack(batch, 0, out=out)
     56     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
     57             and elem_type.__name__ != 'string_':

TypeError: expected Tensor as element 1 in argument 0, but got int

Both errors arise from the same line of code: 'for idx, (img, label_attr, label_idx) in enumerate(final_train_generator):'.
Thanks.

Make sure you send all of your input tensors (img, label_attr, label_idx) to the GPU. You can move each of them by calling .cuda() on it (e.g. for img, that would be img.cuda()).
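A minimal sketch of what that looks like, assuming all three batch elements are tensors:

img = img.cuda()                # move the image batch to the GPU
label_attr = label_attr.cuda()  # move the attribute labels to the GPU
label_idx = label_idx.cuda()    # move the label indices to the GPU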

Hi, thanks for the reply!
I tried that, but it says: SyntaxError: can't assign to function call.
Is there any other way?

Can you show me your updated code?

I only added it to the first term, img.to(device), and it gave an error. Code:

for idx, (img.to(device), label_attr, label_idx) in enumerate(final_train_generator):

Error:

  File "<ipython-input-9-fff104645c7f>", line 18
    for idx, (img.to(device), label_attr, label_idx) in enumerate(final_train_generator):
            ^
SyntaxError: can't assign to function call

There might be a syntax issue because you are applying it directly inside the for statement: the targets of a for loop have to be plain names, so a call like img.to(device) cannot appear there (hence the SyntaxError). Try moving the calls into the loop body instead and see if it works:

for idx, (img, label_attr, label_idx) in enumerate(final_train_generator):
    l = train_agent.fit_final_classifier(img.cuda(), label_attr.cuda(), label_idx.cuda())
    syn_loss += l
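If you also want the same loop to run on a CPU-only machine (as in your second attempt), a device-agnostic sketch of the same idea would be:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

for idx, (img, label_attr, label_idx) in enumerate(final_train_generator):
    # move every batch tensor to the same device before the forward pass
    img = img.to(device)
    label_attr = label_attr.to(device)
    label_idx = label_idx.to(device)
    l = train_agent.fit_final_classifier(img, label_attr, label_idx)
    syn_loss += l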

The correction you suggested is inside the for loop, but the error (RuntimeError: All input tensors must be on the same device. Received cuda:0 and cpu) is raised at the for loop statement itself.
Also, I just tried what you suggested, and it's throwing the same error.

This means that your model has not been sent to the GPU. Make sure your model is also on the GPU.
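A minimal sketch of what that could look like (the .model attribute on train_agent is only an assumption here; use whatever attribute your training agent actually exposes):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
train_agent.model = train_agent.model.to(device)  # hypothetical attribute name

Also, since both of your tracebacks end inside default_collate, i.e. while the DataLoader is stacking the samples returned by your dataset, it may be worth printing a single item of final_dataset to confirm that every element is a tensor and that they all live on the same device:

sample_img, sample_attr, sample_idx = final_dataset[0]
for name, item in [("img", sample_img), ("label_attr", sample_attr), ("label_idx", sample_idx)]:
    print(name, type(item), getattr(item, "device", None))  # device is None for non-tensor items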