Runtime Error: Function MulBackward0 returned an invalid gradient at index 1 - expected type torch.FloatTensor but got torch.cuda.FloatTensor

Hi there,

In the training phase, when I run loss.backward(), it raises this error:

  File "/home/jingweipeng/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/jingweipeng/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Function MulBackward0 returned an invalid gradient at index 1 - expected type torch.FloatTensor but got torch.cuda.FloatTensor

I have checked device attribute of all parameters, using the code snippet like this.
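A minimal version of such a check, using a toy model as a stand-in for the real one:

```python
import torch
import torch.nn as nn

# Toy model standing in for the actual network
model = nn.Linear(4, 2)

# Print the device of every registered parameter;
# a CPU/GPU mismatch would show up in this listing
for name, p in model.named_parameters():
    print(name, p.device)
```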
The Loss Function is simple nn.CrossEntropyLoss().
How can I trace such an error using hooks? Could you give me some hints or advice? Thanks in advance!

I think there must be some tensors on CPU and some tensors on GPU, right?

How can I find such an invalid gradient? :thinking:
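One way to localize it with hooks is to register a gradient hook on intermediate tensors in forward and print each gradient's device as backward runs; the tensor whose gradient shows the wrong device points to the problem. A sketch with a toy model (names are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
x = torch.randn(3, 4)
out = model(x)

# The hook fires during backward and reports the gradient's device
def report(name):
    def hook(grad):
        print(name, "grad on", grad.device)
    return hook

out.register_hook(report("out"))
out.sum().backward()
```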

I have solved this error. Some variables initialized in forward were not placed on cuda.
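For anyone else: the typical pattern is a tensor created inside forward without an explicit device, which defaults to CPU even when the input is on GPU. A minimal sketch of the fix (the module and names here are illustrative, not the original code):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def forward(self, x):
        # Wrong: torch.ones(...) defaults to CPU even if x is on GPU
        # scale = torch.ones(x.shape[1])
        # Right: create the tensor on the same device as the input
        scale = torch.ones(x.shape[1], device=x.device)
        return x * scale

net = Net()
x = torch.randn(2, 3, requires_grad=True)
net(x).sum().backward()
```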

Hi,

I’m getting the same error. Can you share how you found out which variables were not placed on cuda? I’m using a large model, so it is difficult for me to print out the type of every submodule and check. Is there any other way to go about it?
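Not the original poster, but one way to audit a large model without printing everything manually is a small helper that collects the names of parameters and buffers on the wrong device (a sketch; note it won't catch tensors created inside forward, which need a hook-based check instead):

```python
import torch
import torch.nn as nn

def find_offending(model, expected="cuda"):
    """Return names of parameters/buffers whose device type differs from `expected`."""
    bad = []
    for name, t in list(model.named_parameters()) + list(model.named_buffers()):
        if t.device.type != expected:
            bad.append(name)
    return bad

# Example on a CPU model: nothing is "offending" relative to CPU,
# but everything would be flagged relative to CUDA
model = nn.Sequential(nn.Linear(2, 2), nn.BatchNorm1d(2))
print(find_offending(model, expected="cpu"))
```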
