Invalid gradient at index 0 - expected shape [] but got [1]

I’ve been stuck on this problem for the whole day.

torch.autograd.backward(loss_seq, grad_seq) raises an error.

Error log (PyTorch 0.4.1):

Traceback (most recent call last):
  File "train_vgg.py", line 272, in <module>
    torch.autograd.backward(loss_seq, grad_seq)
  File "/root/anaconda3/lib/python3.6/site- 
packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]

Input:

loss_seq:[tensor(7.3761, device='cuda:1', grad_fn=<ThAddBackward>), tensor(4.3005, device='cuda:1', grad_fn=<ThAddBackward>), tensor(4.2209, device='cuda:1', grad_fn=<ThAddBackward>)]
grad_seq:[tensor([1.], device='cuda:1'), tensor([1.], device='cuda:1'), tensor([1.], device='cuda:1')]
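
For reference, a quick check on CPU (with values standing in for the tensors above) shows that each loss is zero-dimensional while each entry of grad_seq has shape [1]:

import torch

loss = torch.tensor(7.3761)  # 0-dim, like each entry of loss_seq
grad = torch.ones(1)         # 1-dim with one element, like each entry of grad_seq

print(loss.shape)  # torch.Size([])
print(grad.shape)  # torch.Size([1])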

Relevant code:

images = Variable(images).cuda(gpu)

label_yaw = Variable(labels[:,0]).cuda(gpu)
label_pitch = Variable(labels[:,1]).cuda(gpu)
label_roll = Variable(labels[:,2]).cuda(gpu)

pre_yaw, pre_pitch, pre_roll = model(images)

# Cross entropy loss
loss_yaw = criterion(pre_yaw, label_yaw)
loss_pitch = criterion(pre_pitch, label_pitch)
loss_roll = criterion(pre_roll, label_roll)

loss_yaw += 0.005 * loss_reg_yaw
loss_pitch += 0.005 * loss_reg_pitch
loss_roll += 0.005 * loss_reg_roll

loss_seq = [loss_yaw, loss_pitch, loss_roll]  # each loss is a 0-dim scalar tensor
grad_seq = [torch.ones(1).cuda(gpu) for _ in range(len(loss_seq))]  # each grad has shape [1]

# crash here
torch.autograd.backward(loss_seq, grad_seq)

Can someone tell me how to fix it?

Try:

grad_seq = [torch.tensor(1).cuda(gpu) for _ in range(len(loss_seq))]

Thank you for your reply. I changed to your suggestion, but got another error:

Traceback (most recent call last):
  File "train_vgg.py", line 363, in <module>
    torch.autograd.backward(loss_seq, grad_seq)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: invalid gradient at index 0 - expected type torch.cuda.FloatTensor but got torch.cuda.LongTensor

So I changed the code to grad_seq = [torch.FloatTensor(1).cuda(gpu) for _ in range(len(loss_seq))], but I got the same error as before:

Traceback (most recent call last):
  File "train_vgg.py", line 272, in <module>
    torch.autograd.backward(loss_seq, grad_seq)
  File "/root/anaconda3/lib/python3.6/site- 
packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]

Use

grad_seq = [torch.tensor(1.0).cuda(gpu) for _ in range(len(loss_seq))]

The whole idea is to use torch.tensor to create a zero-dimensional tensor containing 1.0.
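
Here is a small sketch on CPU (with made-up scalar losses standing in for loss_yaw, loss_pitch and loss_roll) comparing the different attempts; only the zero-dimensional float tensor matches what backward expects for a scalar output:

import torch

print(torch.ones(1).shape)         # torch.Size([1]) -> shape mismatch
print(torch.tensor(1).dtype)       # torch.int64     -> dtype mismatch
print(torch.FloatTensor(1).shape)  # torch.Size([1]) -> shape mismatch again
print(torch.tensor(1.0).shape)     # torch.Size([])  -> matches a 0-dim scalar loss

# Made-up scalar losses standing in for loss_yaw / loss_pitch / loss_roll
w = torch.randn(3, requires_grad=True)
loss_seq = [w.sum(), (w * 2).sum(), (w ** 2).sum()]
grad_seq = [torch.tensor(1.0) for _ in loss_seq]

torch.autograd.backward(loss_seq, grad_seq)  # runs without the shape/dtype errors
print(w.grad)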


:grinning: Thanks. It works now.