Some of weight/gradient/input tensors are located on different GPUs

I get the following error while using transformers:
Some of weight/gradient/input tensors are located on different GPUs. Please move them to a single one. at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:24
Any ideas on how to fix it?
Thanks,
Sankar

As the error explains, some tensors are not on the same device, so the operation cannot be executed.
Feel free to post some code so that we can help further. :wink:
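
In the meantime, here is a minimal sketch of the usual fix, assuming the mismatch is between the model output and the targets (the error comes from `ClassNLLCriterion.cu`, so the loss computation is a likely place). The `nn.Linear` model and the random tensors are hypothetical stand-ins, not your actual code; the idea is only to pick one device and move the model, inputs, and targets to it before the forward pass.

```python
import torch
import torch.nn as nn

# Pick a single device and use it for everything.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in model; replace with your own transformer model.
model = nn.Linear(4, 3).to(device)
criterion = nn.CrossEntropyLoss()

# Tensors are created on the CPU unless a device is given.
inputs = torch.randn(2, 4)
targets = torch.tensor([0, 2])

# Move inputs and targets to the same device as the model.
inputs = inputs.to(device)
targets = targets.to(device)

logits = model(inputs)
loss = criterion(logits, targets)

# Sanity check: all three should report the same device.
print(logits.device, targets.device, next(model.parameters()).device)
```

If the three devices printed at the end differ in your own script, the tensor that is off is the one to move with `.to(device)`.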