Expected tensors to be on one device, but found two: cpu and cuda:0

Hey guys. I am running code from another repository and ran into the above error. I have looked around online, and none of the solutions I have found work for me. I have already verified that the model is on cuda:0, so it must be the data batches that are lingering on the CPU.

The dataloader is obtained from a dataset as follows:

eval_dataloader = self.get_eval_dataloader(eval_dataset)  # in a class inheriting from Seq2SeqTrainer

Later, the code is run as follows:

output = eval_loop(
        eval_dataloader,
        description="Evaluation",
        prediction_loss_only=True if compute_metrics is None else None,
        ignore_keys=ignore_keys,
)

eval_loop here is the evaluation_loop method from the transformers Trainer class. So the point is that, because of the way this is run, I don't think I can move the batches to the GPU just before evaluating on them; everything goes through the eval_loop method rather than through a for loop I can modify.
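
From what I can tell from the transformers source, the loop inside evaluation_loop looks roughly like this (a simplified paraphrase, not the exact code):

for step, inputs in enumerate(dataloader):
    # prediction_step calls self._prepare_inputs(inputs) internally, which is
    # where each batch is supposed to be moved to self.args.device
    loss, logits, labels = self.prediction_step(
        model, inputs, prediction_loss_only, ignore_keys=ignore_keys
    )

So the device transfer is supposed to happen inside the trainer itself, not in my code.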

From here, I have tried several ways of moving the batches from the dataloader to the GPU, including:

for batch in eval_dataloader:
    for k, v in batch.items():
        batch[k] = v.to(device)  # move each tensor in the batch onto the GPU

But, for whatever reason, I still get this error. Any suggestions as to how I can fix this? Thanks!

In this case I would recommend checking the Trainer class code to see where the data is supposed to be moved to the device, since the error is coming from the higher-level API you are using.
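
In recent versions of transformers, that happens in Trainer._prepare_inputs, which is called from prediction_step inside the evaluation loop and calls .to(self.args.device) on every tensor in the batch. A quick way to see what is actually going on is to override it in your Seq2SeqTrainer subclass just for debugging; here is a minimal sketch (the class name is hypothetical, use your existing subclass):

import torch
from transformers import Seq2SeqTrainer

class DebugTrainer(Seq2SeqTrainer):
    def _prepare_inputs(self, inputs):
        # let the parent class do its normal device placement first
        inputs = super()._prepare_inputs(inputs)
        # print where each tensor actually ends up; anything still on cpu
        # points to the field causing the mismatch
        for k, v in inputs.items():
            if isinstance(v, torch.Tensor):
                print(k, v.device)
        return inputs

If everything prints cuda:0 there, the stray CPU tensor is probably being created later (for example in compute_metrics or a custom prediction_step). If self.args.device itself is cpu, then the TrainingArguments are what need fixing rather than the dataloader.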